Add oci-archive-uncompressed-fd:5 #1209
Comments
There is prior work in containers/podman#10075 that exposed a raw API for this, but it would be nice for c/i to handle things like converting to OCI format, manifest lists, etc.
Here's a completely different idea: Maybe tools that want to do this type of stuff could expose an oci distribution endpoint, then The control flow here is funky because it'd be: app ➡️ skopeo ➡️ app, but it seems manageable. Ideally we avoid the need for local TCP and can pass down a pipe or socketpair, so this would then look like
(Note that there already is a native OSTree transport. It might well not fit your needs as is — but using a native c/image transport should be considered as a design option, instead of using an intermediate format. OTOH that might longer-term mean maintaining it inside the c/image repo — right now transports are fully pluggable via At a first glance, it should be quite possible to implement the OCI archive creation as a stream producer without an intermediate on-disk stage; we do that for Moving to a streaming model would be somewhat disruptive to the
Just to make sure this is considered, note that a buffering step needs to happen at least once in the current model, because the copy pipeline (and therefore the data going over a pipe) always sends the layer blobs first, and the manifests and other metadata later, while most consumers need to consume data in the opposite order. For very limited cases it might be possible to kludge around this (e.g. heuristically decide whether an incoming blob is a layer or a config, and stream the layers directly to the destination in some representation that doesn’t require knowing the parent/child layer relationships), but making this work reliably and efficiently, e.g. for multi-arch images, might be rather tricky. Of course, one unavoidable buffering step is no reason not to try to avoid a second, probably entirely avoidable, buffering step. (It’s also not obvious to me that a special one-off “multiplex data over a pipe” transport is easier to implement/maintain/debug than running a temporary HTTP server on localhost speaking the docker/distribution protocol, but we can’t know for sure before we do that work…)
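To illustrate the ordering problem described above, here is a minimal Go sketch of a consumer that receives layers before the manifest and therefore has to hold them until the manifest says how to apply them. The `item` type, its `kind` tags, and `consume` are hypothetical names made up for this example; nothing here is c/image API, and a real consumer would spool to disk rather than to memory.

```go
package main

import (
	"errors"
	"fmt"
)

// item is a hypothetical unit arriving over the pipe, in copy-pipeline order.
type item struct {
	kind   string   // "layer" or "manifest" (tags invented for this sketch)
	digest string   // digest of a layer blob
	data   []byte   // layer payload
	layers []string // for a manifest: layer digests in application order
}

// consume buffers every layer until the manifest arrives, then returns the
// layer payloads in the order the manifest lists them.
func consume(stream []item) ([]string, error) {
	pending := map[string][]byte{}
	for _, it := range stream {
		switch it.kind {
		case "layer":
			// The application order is still unknown, so the blob must be held.
			pending[it.digest] = it.data
		case "manifest":
			var applied []string
			for _, d := range it.layers {
				data, ok := pending[d]
				if !ok {
					return nil, fmt.Errorf("manifest references missing layer %s", d)
				}
				applied = append(applied, string(data))
				delete(pending, d)
			}
			return applied, nil
		}
	}
	return nil, errors.New("stream ended before a manifest arrived")
}

func main() {
	// Layers arrive first and in arbitrary order; the manifest comes last.
	stream := []item{
		{kind: "layer", digest: "sha256:bbb", data: []byte("top")},
		{kind: "layer", digest: "sha256:aaa", data: []byte("base")},
		{kind: "manifest", layers: []string{"sha256:aaa", "sha256:bbb"}},
	}
	fmt.Println(consume(stream))
}
```

The `pending` map is the one unavoidable buffering step; the point of the discussion is to avoid adding a second one on the producer side.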
Right, it's almost the inverse of what I want though, that transport is trying to store a container image in the ostree data store, whereas I want to encapsulate an ostree commit into a container image, and have something else be able to unpack it. I think we should remove the ostree c/storage backend ultimately; last I heard that was blocked on some people using it still due to the deduplication, but I think it's not the right long term approach. More broadly too, having c/image write to c/storage puts the process combining those two (e.g. skopeo) in a position of total privilege over the host filesystem, whereas I want to move more towards privilege separation. Admittedly, ostree today exactly combines fetching with writing, but I'm trying to avoid replicating that mistake here. The other argument for avoiding c/storage here is that what we're storing is not a container, it's the host OS; and I don't want to scope in host OS management into c/storage (or c/image). The easiest way to understand this is to forget ostree exists for a second and imagine that we wanted to encapsulate a set of rpms/debs/etc in a container image to pass to
Ah, I hadn't realized that. Will take a look at that code. Hmm, well I'd wanted my code to just deal with OCI, but eh if I can just use
Why is that? Why not have a copy send the manifest first?
Major downsides of
In a general case, the copy may have to modify the layers (to reuse already-present differently-compressed variants, or to compress/decompress/recompress/encrypt), i.e. the manifest contents are unknown until the layers exist; and the manifest might need format conversion (which is determined by trial and error to see what the registry doesn’t reject, and that can again only happen if the layers already exist on the registry). Yes, there are cases where a byte-for-byte copy is exactly what is desired, but a generally usable transport must be ready to accept layers before metadata in order to support the more general copies. (If it matters a lot, with
OK cool! With the named pipe hack and And looking at this, I think we don't want So retitling this and here's the desired improvements:
Conceptually, the goal here isn't a "copy" - it's using container images as a wrapper. That said, it would be nice to support preserving the wrapper data eventually but it's definitely not critical path. But maybe we just want backends to expose a property
I think in the end this will be a skopeo API probably.
See ostreedev/ostree-rs-ext#15
Right now it looks like oci-archive:// creates a temporary oci directory, and then re-tars it up, which is really inefficient.

In order to use containers/image to write to something that's not containers/storage (in my case ostree, but there are people putting raw disk images in container images for host updates too, etc.) it'd be nice to have support for streaming writes to e.g. a pipe.
Probably while we're here we should add oci-archive-fd://5 to output to file descriptor 5 instead of going via oci-archive:///proc/self/fd/5.

Or maybe going farther, add oci-archive-streaming-fd://5 which explicitly requires the caller to verify any layer digests, etc.