Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add copy option to strip sparse manifests #1707

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

bertbaron
Copy link

Not all registries accept manifests where some images are missing. Therefore an option is added to strip missing images from the manifest when only a subset of images is copied.

This is a first change toward adding a filter on architecture in skopey copy and sync. See containers/skopeo#1694.

I'd be happy to add documentation and tests when this change is acceptable.

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Sadly manifest.List is a public API and adding a method to it could break external callers which implement this interface, so we can’t just add a method.

This is why we have been moving towards making most interfaces private instead — compare image vs. internal/image. So adding this method would first require introducing something like internal/manifest.List, moving at least the code that refer to the internal interface to internal/manifest (and I’m not sure about the two actual implementations), while preserving the current external API in a way that minimizes duplication.

manifest/list.go Outdated
@@ -33,6 +33,10 @@ type List interface {
// must be specified.
UpdateInstances([]ListUpdate) error

// RemoveInstance removes the instance with the given digest from the list. If no such instance
// exists the list is left unchanged.
RemoveInstance(digest.Digest)
Copy link
Collaborator

@mtrmac mtrmac Oct 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably return an error on a missing digest (but because it will start as an internal-only interface, it’s not critical to get right).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid my knowledge of go and this project make it difficult for me to perform such a refactoring. So instead I moved the functionality to a private method using a type switch. That way it has no impact on the public interface and can it safely be refactored in a future PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably return an error on a missing digest (but because it will start as an internal-only interface, it’s not critical to get right).

I personally prefer the more idempotent type of behavior (why return an error if the end result is what you want it to be), but I'm fine of course to implement it differently. Since this is internal now indeed I left it as is.

copy/copy.go Outdated
@@ -467,6 +487,7 @@ func (c *copier) copyMultipleImages(ctx context.Context, policyContext *signatur
}
c.Printf("Copying %d of %d images in list\n", imagesToCopy, len(instanceDigests))
updates := make([]manifest.ListUpdate, len(instanceDigests))
var skipped []digest.Digest
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Identifying instances by digest is fine right now.

For the per-arch filtering anticipated in #1694, I wonder if that is sufficient; it could, in principle, be the case that two different instances (most likely the same basic platforms, but different variants) point at the same instance digest — and in that case we might end up wanting to remove one pointer to that digest but not the other one. In that case we might need to identify instances by the index, not by the digest (and remove them in the right order).

I suppose that’s not a concern for this PR, and a future PR that adds the per-arch filter could deal with that… but it might be simpler to change the design right now.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I had the same concern. But this could also affect List.Instance(digest.Digest). Its because of that method that I implemented it this way.
As an alternative I could remove it by passing a slice of integer indices or something like that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Images are removed by index now

copy/copy.go Outdated
@@ -169,6 +185,10 @@ type Options struct {
// Download layer contents with "nondistributable" media types ("foreign" layers) and translate the layer media type
// to not indicate "nondistributable".
DownloadForeignLayers bool

// When only a subset of images of a list is copied, this action indicates if the manifest should be kept or stript.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not often that someone sneaks a new word past me, but you did with "stript". It's fine, but I wonder if more people would understand "stripped" even though they're equivalent....

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. It wasn't intentional and I used 'stripped' elsewhere. But it's good to know that others may learn from my lack of knowledge of the English language.

@TomSweeneyRedHat
Copy link
Member

Needs a rebase @bertbaron

@bertbaron
Copy link
Author

Is there a reason why this PR still can not be merged?
I did want to add some tests, but since I couldn't find existing tests for the copy function that could serve as examples this would require more effort than anticipated and I didn't find the time yet.

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m sorry, I’m afraid I’m mostly just busy with other work.

@@ -390,6 +410,33 @@ func compareImageDestinationManifestEqual(ctx context.Context, options *Options,
return true, destManifest, destManifestType, destManifestDigest, nil
}

// removeInstanceFromList removes the given image from the list
// this should probably move to an internal API
func removeInstancesFromList(manifestList manifest.List, imageIndices map[int]bool) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No; let’s build the infrastructure the right way rather than piling on tech debt.

It will never be any easier to get that done, and each such “small temporary thing” makes it harder.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate your way of thinking, but its just that I'm a bit lost at how to do it. I can't move the public interface of course to not break it. I can extend it (embed it in an internal one). But I also can't move the implementation structs I guess because those also seem to be public. I can extend those with the internal interface, but then the methods would still be public on the struct and I'm not sure if that's ok. And is it possible in go to do something like return type overloading to get methods like Clone() right?
With some pointers I might be able to perform the refactoring but at this point I'm a bit stuck.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the hard part of all this work.

Unlike interfaces, adding new methods to structs is ABI-safe. So my first guess is that the manifest.List interface, and the methods that create that (ListFromBlob) should be ~duplicated in internal packages. The implementations would probably also be moved to internal-only, with public aliases (type OCI1Index = internal.OCI1Index), but I’m much less certain about this part.


Alternatively… I think we’ll need to make these refactors for other purposes (adding/removing zstd-compressed variants) fairly soon anyway (but I’m not exactly sure when). So it might be an option to wait with this PR until that infrastructure work is done by other people.

Cc: @vrothberg .

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's helpful, I'll give it a try.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried but didn't succeed. Apparently go requires return types of a method to be equal in order to implement an interface, a subtype is not supported (I think its the lack of covariance support as described here: golang/go#30602). So the Clone method for example has to return a public List and not an internal one. But it also doesn't allow a more specific ListUpdate to be returned for example.
One option I considered was to add methods like CloneInternal() to the private interface. But that seems to be impossible without introducing package cycles because the public interface and public creator method are in the same package. The only workaround/hack I can think of is to do some dynamic registration to redirect the public creator to the private creator.
On the other hand, while I learned more about go with this exercise there's still a lot more for me to learn so I might have overlooked some other options. So unless someone can give me a clear pointer to something I missed I think I'll give up for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature A request for, or a PR adding, new functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support stripping dropped instances from manifests when making sparse multi-platform copies
3 participants