-
Notifications
You must be signed in to change notification settings - Fork 379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Oci image deletion #2003
Oci image deletion #2003
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR.
At a first glance, this seems way too small to plausibly be a comprehensive implementation.
-
Most importantly, a single
oci:
directory can house multiple images. So just removingindex.json
is deleting much more than requested. -
Every image contains at least one blob; this is not deleting any blobs at all, AFAICS.
- Combined with the above, only blobs that are not used by other images can be deleted.
-
Typically the second
os.Remove
would just fail, because the directory is not empty. So AFAICS the proposedDeleteImage
implementation must always fail in practice.(Yes, it does not fail in the fake
refToTempOCI
repo used in tests, but that mostly exists to exercise the name lookup and metadata operations, so it is not representative.)
We prefer the tests to be working on every commit, so that bisection doesn’t have false positives. |
All right, I'm doing this out of habit (failing tests first then committing) but I'll squash all of that for next time. |
Thanks for the feedback, I'll get back to this and will ping you once I have something more substantial. |
That habit makes sense — we really only need it squashed (or just reordered?) immediately before merging. |
ce63a70
to
9d5e12f
Compare
Here's something hopefully better. Test setup is ugly but I'd like to get the logic validated before refactoring this into something cleaner. A couple of questions on this:
Thanks :-) |
9d5e12f
to
fef8879
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is a move in the right direction.
if it's the last image in the index, should it delete the index file altogether? And all sub-folders, which should all be empty at that point?
I think keeping the index and other structures around makes sense — if someone cares about the state of the directory to delete an individual image and not to delete everything, that user might plausibly want to delete all existing images and replace them with other images.
I don't think it's possible to check if there is a container running using that image, is it?
This code does not really have a concept of containers or any kind of lock / external reference counting; it’s a pure store, and it’s up to the caller not to delete things that will need to be used.
if the given image is empty
Assuming this refers to ref.image
, I would expect this to follow the existing getManifestDescriptor
behavior: If the users’ input if ambiguous, fail and don’t delete anything. If ref.image == ""
in a situation where that is documented to clearly identify one image, deleting that one image makes sense to me (though the situation is a bit suspect, why would anyone do that?).
399f4c3
to
75a2d40
Compare
Style/opinion question: index files are generated using string literals: https://github.com/containers/image/blob/main/oci/layout/oci_transport_test.go#L201 I started refactoring this part for my tests to use imgspecv1 structs that I'm filling then serializing instead; is that ok or do you rather keep literals? |
I think using raw blobs works better as an interoperability test; that way we know if the (de)serialization code changes in a way that breaks compatibility. (That said, using raw files in Refactors are fine and welcome, but please keep any larger ones as separate commits to be clear which parts of the PR are intended to do something new and which are intended not to be a change. |
Now I'm on the fence; it's true that literals have the advantage of indirectly testing the spec, which is always a plus. I'll commit what I have now so you can have a look then I'll make the changes, that's not a big deal. |
I'm fairly happy with what's in there now; test data generation is much clearer (although it uses the spec library, I'll change that to literals as soon as time permits) and allows more flexibility as demonstrated with the (failing fort now) test with a layer shared by two images. (Looks like I'll need to rebase on main too) |
1fb14c8
to
121e1f3
Compare
All right, all test are passing now :-) |
121e1f3
to
fe1650d
Compare
A couple of question on top of the review:
Thanks |
I’m afraid I didn’t get to review the full PR yet
Sure; that seems useful, and can’t hurt anything.
You know why you are doing this work and whether it is valuable for you. My intuition is that most archives are pretty small, and that the extra complexity and risk are not at all worth it, but that’s a guess with no data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Not a full review, just a very quick skim.)
oci/layout/oci_transport.go
Outdated
if err != nil { | ||
return err | ||
} | ||
for _, layer := range otherImageManifest.Layers { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The manifests themselves are also stored as blobs.
Config.Digest
also needs to be accounted for.- IIRC it’s possible for an entry in the top-level index to be an
imgspecv1.Index
again — at least this code does that to one level of nesting, the spec seems to suggest an arbitrary depth.
With OCI artifacts storing ~arbitrary data, it seems quite plausible for an entry in Layers
refer to the same blob as a config/manifest/index of another image/artifact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are absolutely right, there can be nested indexes: https://github.com/opencontainers/image-spec/blob/main/image-index.md?plain=1#L46 - that's gonna be interesting...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mtrmac but this also means that getManifestDescriptor
(https://github.com/containers/image/blob/main/oci/layout/oci_transport.go#L176) is broken in the case of nested indexes, it only looks in the index.json
(https://github.com/containers/image/blob/main/oci/layout/oci_transport.go#L240) unless I'm missing a piece of the puzzle (nested indexes are stored as blobs as far as I understand the convoluted spec)
I've started working on this by changing the tests setup (nothing can't be fixed with recursion), but supporting this means that I also need to start scanning indexes recursively. It will complexifies this PR a bit more - what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spec says implementation should support this: https://github.com/opencontainers/image-spec/blob/main/image-index.md?plain=1#L44
It also means that the WIP you mentioned earlier about returning an index when calling getManifestDescriptor
which would be impossible if the image is in a nested index, or that'll be the index in the index it was found, not necessarily index.json
My head starts to hurt :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(My personal opinion is that the nested indexes provide too much flexibility and the semantics are unclear for consumers, so I’m not that much of a fan. But the feature exists.)
There’s a bit of nuance in here: if the repo contains a nested index, it’s not ideal, but also not a disaster when this code does not support that in ImageSource
/ getManifestDescriptor
; the code would just “safely” fail and users would have to use some other tool.
OTOH if the deletion code does not handle nested indices, that could lead to data loss, so that is a more urgent concern.
So one approach would be to recursively enumerate the used blobs; another, I think quite reasonable, possibility would be to just fail when a nested index is encountered, and to tell the user to use another tool — that’s analogous to the ImageSource
failure mode, but still safe.
Note that the above applies to “deeply nested” OCI indexes. One level of nesting is supported, to represent a multi-arch image. See the outcome of
skopeo --insecure-policy copy --all docker://quay.io/libpod/alpine oci:$dest
So… I haven’t actually written the code, but it seems to me that a full recursion (or possibly a “queue of blobs+blob types to scan”) support might be the easiest way to implement all of this anyway, instead of somehow special-casing the “one level of nesting” case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right that handling this properly in the deletion has to be extra safe.
The good news is that I now have a way to cleanly create nested indexes for tests so this can be tested properly (which uses recursion so in for a penny, in for a pound).
Thanks for the skopeo example, I'll have a look at how this is handled, that will surely be instructive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been inspecting the result from the skopeo
command and it has been very educating :-)
One thing I noticed, reading getManifestDescriptor
, is that you can mess up your registry if you do something like this:
skopeo copy docker://quay.io/libpod/alpine:3.2 oci:/tmp/oci-registry/alpine
skopeo copy docker://quay.io/libpod/alpine:3.10.2 oci:/tmp/oci-registry/alpine
The index.json file will then contain two manifests (image manifests or sub indexes if you used the --all) which don't have any annotations (so no org.opencontainers.image.ref.name
).
At that point, that directory is toast, it seems:
skopeo copy oci:/tmp/oci-registry/alpine:3.2 oci:/tmp/oci-registry/alpine-copy
FATA[0000] initializing source oci:/tmp/oci-registry/alpine:3.2: no descriptor found for reference "3.2"
skopeo copy oci:/tmp/oci-registry/alpine oci:/tmp/oci-registry/alpine-copy
FATA[0000] initializing source oci:/tmp/oci-registry/alpine:: more than one image in oci, choose an image
Would this be worthy of being reported as a bug?
Sorry for the digression, I'm still in the learning phase :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At that point, that directory is toast, it seems:
It’s… a valid OCI structure. And the WIP oci:/tmp/oci-registry/alpine:@0
syntax will eventually allow c/image to consume it.
I agree it’s very awkward that c/image can produce a structure that it can’t read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you point me to that WIP with the index? It could be useful, for me, to have a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fe1650d
to
d3e5d57
Compare
dff7d6f
to
e5aed42
Compare
@mtrmac I made quite a few changes:
I started working on the nested index stuff in another branch, but it's not over yet (although the fixture is there already); there is also the extra complexity that if something changes in the nested index, its sha256 will change too, so all the "root" index will have to be changed too. |
4c59a25
to
0c688f4
Compare
0c688f4
to
19682ec
Compare
5a9d2f8
to
62296c6
Compare
@mtrmac I finally had time to take a second shot a this. Going back and forth with tests, I ended up with a different design. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m sorry about the late review.
This is great work, just two outstanding concerns:
- I’m quite worried about the code to finally delete the index entry getting out of sync
- The handling of shared blob directories can be simplified.
a9bf683
to
2807bce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
- The shared blobs dir conversation continues in Oci image deletion #2003 (comment) , I think I’m converging much closer to your point
- A bit more for the top-level index deletion, please
- AFAICS the updated test for
getManifestDescriptor
is incorrect in a rather misleading way.
3ae8eda
to
0e68a62
Compare
Again, sorry about that :-( |
No worries — I have written a much worse bug yesterday. I just wanted to highlight that as more important than some of the stylistic comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! LGTM.
Could you squash this to one or two commits, please?
Signed-off-by: Philippe Vlérick <[email protected]>
2c1a15c
to
d486044
Compare
Rebased in one commit (and on the latest main) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much!
Fix attempt for #1812
I made two commit; first a failing test then the feature to make it work, I'll happily fixup in one commit given how small this change is.