add FindManifests #828
Note: this is currently in Comments from the original discussion: @imjasonh wrote:
@jonjohnsonjr wrote:
@deitch wrote (about
@jonjohnsonjr wrote:
|
What parts should be in here? @jonjohnsonjr wrote: I imagine we'll start with something like these
And end up with something like this, not sure where:
We may want to just skip to the generic bits and do some type-checking (like in remote.MultiWrite) to parse the descriptors appropriately. Maybe we should rename FindManifests to FilterManifests? |
So @jonjohnsonjr where does this leave us? What do we want in this PR? Let's ignore the "where" (which package) for now, until the rest of this is ready. Then we can move it all over in one commit change. The API is what we need to work out.
I think I view the recursive functionality as potentially another layer; we can and should do it, but it can be a separate PR (since it requires |
I think "where" and the API are somewhat linked. If you are okay with letting this languish for a bit while we experiment and discuss it to death, then I agree we should figure out the API first and then decide on where it should go, but if you're looking to get this merged soon-ish, it might make sense to just put it in
I have a really hard time judging an API before I actually try to use it. Assuming I could find the time, I'd want to attempt to refactor various parts of go-containerregistry to use the
I'll throw out some stuff that has jumped to my mind while chewing on this idea, though:
implementation details vs external consumers
One reason I keep coming back to
On the other hand, we're also exposing an API for consumers of this library (e.g. you). As I described in the README (and mutate README), I more or less expect consumers to stick to methods on the
bridging the descriptor graph
So the current API (returning
Let's consider the
As a motivating example, let's say I want to compare all the s390x images in this layout. How do we do this? Ignoring errors...
func findManifests(root v1.ImageIndex, p v1.Platform) []v1.Image {
matches := []v1.Image{}
m, _ := root.IndexManifest()
for _, desc := range m.Manifests {
idx, _ := root.ImageIndex(desc.Digest)
for _, desc := range partial.FindManifests(idx, match.Platform(p)) {
img, _ := idx.Image(desc.Digest)
matches = append(matches, img)
}
}
return matches
}
Since I want access to the image layers, config, etc., I need to convert the
func findImages(root v1.ImageIndex, p v1.Platform) []v1.Image {
matches := []v1.Image{}
m, _ := root.IndexManifest()
for _, desc := range m.Manifests {
idx, _ := root.ImageIndex(desc.Digest)
imgs := partial.FindImages(idx, match.Platform(p))
matches = append(matches, imgs...)
}
return matches
}
Alternatively, we could do something like
func findImages(root v1.ImageIndex, p v1.Platform) []v1.Image {
matches := []v1.Image{}
m, _ := root.IndexManifest()
for _, desc := range m.Manifests {
idx, _ := root.ImageIndex(desc.Digest)
for _, desc := range partial.FindManifests(idx, match.Platform(p)) {
matches = append(matches, desc.Image())
}
}
return matches
}
That improves on things a lot. If we do something like this, implementing a recursive version makes most of this code go away:
func walkImages(root v1.ImageIndex, p v1.Platform) []v1.Image {
matches := []v1.Image{}
for _, desc := range partial.WalkManifests(root, match.Platform(p)) {
matches = append(matches, desc.Image())
}
return matches
}
I think the crux of the issue is that our
strawman
A rough stab at something that would make sense as a result of FindFoo:
// TODO: Name it, embed into remote.Descriptor, use anywhere there's ambiguity.
type v1.Either interface {
Image() (v1.Image, error)
ImageIndex() (v1.ImageIndex, error)
}
// TODO: Name it something better.
type Needle struct {
// embed v1.Descriptor like remote.Descriptor does
v1.Descriptor
// Allow callers to re-interpret as an Image or ImageIndex.
v1.Either
// Optional section that we could add later.
// This seems like it would very often be very useful (root == nil).
Parent() *Needle
// Perhaps add ways to access anything that a descriptor can point to?
Layer() (v1.Layer, error)
Blob() (io.ReadCloser, error)
}
This isn't an "obviously correct" abstraction to me, so I really hesitate to commit to this, but it does satisfy most of the implementation considerations that we've discussed. Please be very critical here, as I'm sure it can be improved.
I think it is fine as is if we are willing to let it live on the Island of Misfit Toys.
Let's hold off until you or I actually need it.
Definitely, but we should attempt to figure out the interface for that in this PR if we want a consistent experience. I feel like this is two separate conversations kind of intertwined: there's "what is our ideal API for this kind of thing" vs "what is reasonable/useful to merge now". |
OK, that is a pretty good (practical) argument. I will move whatever we do here into |
Heh, so do I. It probably actually is 3 or more, but I can barely keep the two in my head. I know of left-brain and right-brain, but forward-brain, mid-brain, back-brain, up-brain, down-brain, strange-brain, charmed-brain, and Higgs-brain are too much for me.
Agreed totally. I actually saw
// FindManifests given a v1.ImageIndex, find the manifests that fit the matcher
func FindManifests(index v1.ImageIndex, matcher match.Matcher) ([]v1.Descriptor, error) {
// get the actual manifest list
indexManifest, err := index.IndexManifest()
if err != nil {
return nil, fmt.Errorf("unable to get raw index: %v", err)
}
manifests := []v1.Descriptor{}
// try to get the root of our image
for _, manifest := range indexManifest.Manifests {
if matcher(manifest) {
manifests = append(manifests, manifest)
}
}
return manifests, nil
}
// FindImages given a v1.ImageIndex, find the images that fit the matcher
func FindImages(index v1.ImageIndex, matcher match.Matcher) ([]v1.Image, error) {
matches := []v1.Image{}
manifests, err := FindManifests(index, matcher)
if err != nil {
return nil, err
}
for _, desc := range manifests {
img, _ := index.Image(desc.Digest)
matches = append(matches, img)
}
return matches, nil
}
Actually, I will add the above to this PR-in-progress, just to see how it looks/feels. This brings us to your "wrapper", which has the signature
I like the idea that there is a uniform interface between This does become the "second conversation" you raised above.
Not really. We cannot use just I think my point is that while a generic walker can make sense, a generic matcher across those doesn't appear to make sense, because the very reason you have multiple layers of indexes (really, indices) is because each one serves a different aggregation upwards (and therefore filtering downwards) purpose. Without the context at each layer, selecting becomes challenging. |
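To make the "different matcher at each layer" point concrete, here is a minimal sketch, not part of this PR. It assumes a pared-down `Descriptor` with an inline `Children` slice standing in for a nested index, and the names `FindPerLevel`, `ByName` and `ByArch` are hypothetical. The caller supplies one matcher per level instead of one generic matcher for the whole graph.

```go
package main

// Descriptor is a stand-in for v1.Descriptor. Children is an assumption made
// for this sketch: it is non-empty when the descriptor points to a nested index.
type Descriptor struct {
	Name     string // stands in for a name annotation
	Arch     string // stands in for Platform.Architecture
	Children []Descriptor
}

// Matcher has the same shape as the matcher used in this PR: a predicate
// over a single descriptor.
type Matcher func(Descriptor) bool

// ByName and ByArch are hypothetical matchers used only for illustration.
func ByName(name string) Matcher {
	return func(d Descriptor) bool { return d.Name == name }
}

func ByArch(arch string) Matcher {
	return func(d Descriptor) bool { return d.Arch == arch }
}

// FindPerLevel applies levels[0] to entries, levels[1] to the children of
// whatever matched, and so on. Only descriptors that pass the last level
// are returned.
func FindPerLevel(entries []Descriptor, levels ...Matcher) []Descriptor {
	if len(levels) == 0 {
		return entries
	}
	var out []Descriptor
	for _, d := range entries {
		if !levels[0](d) {
			continue
		}
		if len(levels) == 1 {
			out = append(out, d)
			continue
		}
		out = append(out, FindPerLevel(d.Children, levels[1:]...)...)
	}
	return out
}
```

Calling `FindPerLevel(root, ByName("app"), ByArch("s390x"))` selects by name at the top level and by architecture one level down, which is the context-per-layer behaviour a single recursive matcher would lose.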
Added |
Do we want to open a new PR in parallel to deal with that? One that has this interface |
Codecov Report
@@ Coverage Diff @@
## master #828 +/- ##
==========================================
- Coverage 74.83% 74.77% -0.07%
==========================================
Files 105 106 +1
Lines 4383 4420 +37
==========================================
+ Hits 3280 3305 +25
- Misses 620 626 +6
- Partials 483 489 +6
Continue to review full report at Codecov.
|
Just need to invent time travel so I can experiment with the consequences of merging this PR before I merge it.
Now we're cooking with fire.
Yeah this might be the practical thing to do, but let me think about how these things might compose or be interchangeable before we settle on a signature.
I'd actually be interested to see if we can express this requirement as a matcher (or something similar) in the type system. From a readability perspective, loops and stuff might be better, but I'm imagining some API where I can pass a single matcher and it's more efficient to let the implementation apply the matcher than to do multiple passes (this feels unlikely to me given how we've carefully made most of these packages do lazy access to everything, but it's something to consider).
This is a good point, and maybe where my
I don't know if this is necessarily true. You might have a "bag" of images that some downstream consumer client will know how to interpret (by doing platform resolution), but there isn't a strict structure. E.g. maybe some of your top-level
Yeah, I am really torn on this and have started writing "indexes" in this context for a couple reasons:
Also, enjoy/suffer from the inconsistency even within the spec: A younger version of me would be horrified by the concessions of pedantry I've made for pragmatism.
Yeah if you want to open a draft PR (assuming I haven't by the time you wake up) to just contain this long-winded conversation, we can then refocus this PR. |
pkg/v1/partial/index.go
return nil, err
}
for _, desc := range manifests {
img, _ := index.Image(desc.Digest)
We'll want to handle this error (I know this got copied from me 😄).
What should we do if a descriptor matches but points to a non-image artifact?
- Let this error out.
- Ignore non-Image media types.
- Return an error for non-Image media types.
The second option makes me want to implement composition so we can:
FindManifests(index, match.And(match.MediaTypes(types.Images), matcher))
But this might run afoul of the user intent -- they could just supply this matcher themselves! Should we make them? Or is that the whole point of this helper function?
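For reference, a rough sketch of what the composition idea could look like. It uses local stand-in types rather than the real `v1` and `match` packages, and `And` is hypothetical (it is only floated in this thread). The matcher shape, a predicate over a descriptor, follows how `matcher(manifest)` is called in this PR.

```go
package main

// Descriptor is a stand-in for v1.Descriptor with only the fields used here.
type Descriptor struct {
	MediaType string
	Arch      string
}

// Matcher is a predicate over a descriptor.
type Matcher func(Descriptor) bool

// And is a hypothetical combinator: the result matches only when every
// supplied matcher matches. With no matchers it matches everything.
func And(ms ...Matcher) Matcher {
	return func(d Descriptor) bool {
		for _, m := range ms {
			if !m(d) {
				return false
			}
		}
		return true
	}
}

// MediaTypes matches descriptors whose media type is one of types.
func MediaTypes(types ...string) Matcher {
	set := make(map[string]struct{}, len(types))
	for _, t := range types {
		set[t] = struct{}{}
	}
	return func(d Descriptor) bool {
		_, ok := set[d.MediaType]
		return ok
	}
}

// Arch is a hypothetical matcher on architecture.
func Arch(arch string) Matcher {
	return func(d Descriptor) bool { return d.Arch == arch }
}

// Filter returns the descriptors accepted by m, in their original order.
func Filter(ds []Descriptor, m Matcher) []Descriptor {
	var out []Descriptor
	for _, d := range ds {
		if m(d) {
			out = append(out, d)
		}
	}
	return out
}
```

With this shape, "images only" becomes one more matcher the caller (or the helper) can wrap around the user's matcher, which is exactly the intent question raised above: whether the helper should add it silently or leave it to the caller.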
I am working really hard to avoid composition here. I don't object to it later, but I am focusing on simple; easy for the user to understand use cases, not to mention easy for me to grasp as we work these through.
Let's think of it practically. I say, "give me all of the images from this index of platform X" or "give me all of the images from here of name Y". I don't really care about the fact that 7 of the 10 manifests do not point to images, I want those 3. The very calling of FindImages implies "it is images I want". I would ignore anything that errors because it is not an image.
As this is work-in-progress, I am going to add that handling in.
By the way, is there a types.Images? I didn't see that. Or were you suggesting it for the future.
Updated
I am working really hard to avoid composition here. I don't object to it later, but I am focusing on simple; easy for the user to understand use cases, not to mention easy for me to grasp as we work these through.
Yeah that's fine, I don't want to commit to an API either. We could just hardcode some logic in here to check for the right media type (you could use a matcher to simplify) instead of using composition.
I would ignore anything that errors because it is not an image.
I don't love this -- index.Image(hash) could fail for all kinds of reasons. Swallowing them feels dangerous.
By the way, is there a types.Images? I didn't see that. Or were you suggesting it for the future.
Not yet, just theoretical shorthand for a group of roughly equivalent media types.
I don't love this -- index.Image(hash) could fail for all kinds of reasons. Swallowing them feels dangerous.
Try this; I just updated to use a custom error and check it with errors.Is()
And also used that custom error in v1/remote and v1/layout, and documentation on v1.ImageIndex interface
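A minimal sketch of the custom-error approach described in this thread, using stand-in types. `ErrNotImage`, `Index` and its `Broken` field are hypothetical names for illustration, and the real error in the PR may be named and wrapped differently. The point is that only the "not an image" case is skipped, while any other failure from `Image` is still returned.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotImage is a hypothetical sentinel for "this descriptor does not
// point to an image".
var ErrNotImage = errors.New("descriptor does not point to an image")

// errIO simulates an unrelated failure such as a read error.
var errIO = errors.New("simulated i/o failure")

const imageType = "application/vnd.oci.image.manifest.v1+json"

// Descriptor, Image and Index are stand-ins for the v1 types.
type Descriptor struct {
	Digest    string
	MediaType string
}

type Image struct{ Digest string }

type Index struct {
	Manifests []Descriptor
	Broken    map[string]bool // digests whose lookup should fail with errIO
}

// Image resolves a digest to an Image, wrapping ErrNotImage when the
// descriptor has a non-image media type.
func (i Index) Image(digest string) (Image, error) {
	for _, d := range i.Manifests {
		if d.Digest != digest {
			continue
		}
		if i.Broken[digest] {
			return Image{}, fmt.Errorf("reading %s: %w", digest, errIO)
		}
		if d.MediaType != imageType {
			return Image{}, fmt.Errorf("%s: %w", digest, ErrNotImage)
		}
		return Image{Digest: digest}, nil
	}
	return Image{}, fmt.Errorf("digest %s not found", digest)
}

// FindImages skips matches that are not images but propagates every
// other error instead of swallowing it.
func FindImages(idx Index, match func(Descriptor) bool) ([]Image, error) {
	var imgs []Image
	for _, d := range idx.Manifests {
		if !match(d) {
			continue
		}
		img, err := idx.Image(d.Digest)
		if errors.Is(err, ErrNotImage) {
			continue
		}
		if err != nil {
			return nil, err
		}
		imgs = append(imgs, img)
	}
	return imgs, nil
}
```

Because the sentinel is wrapped with `%w`, `errors.Is` still recognises it after implementations add their own context to the message.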
If we could invent time travel, about the last thing I would use it for is figuring out PRs. I wouldn't quite, say, stop WWII, as tempting as that would be - the law of unintended consequences is enormous there - but I can see some smaller scale of good, as well as buying a lot of AAPL shares 20 years ago, when they were nearly gone. Do good and do well. On that note, there was a great original series Star Trek episode about exactly that, "The City on the Edge of Forever".
Sure. I don't think we need to answer that in this PR. It is one (or maybe two) past it.
I honestly think it is too complex, but I could be wrong. I am trying to build upwards, and I see that as beyond practical. But, again, happy to be proven wrong. I think that, too, is another layer or so past this. We still are on building blocks with this one.
It isn't spec-enforced true, but I think it is how it is used. I like the "footgun" analogy, though.
Heh, my daughter just walked in and asked if she used the term "panicking" correctly, or if her teacher is correct. Unfortunately, for her, the teacher was. In this case, I am more than willing to be pedantic.
I will try. One or the other of us will do it first and @ cc the other. |
1f41547 to 221d99b
So far, this PR does the following:
func FindManifests(index v1.ImageIndex, matcher match.Matcher) ([]v1.Descriptor, error)
func FindImages(index v1.ImageIndex, matcher match.Matcher) ([]v1.Image, error)
The open questions remain. What do we need to add to this PR to move it ahead?
E.g.
func FindImagesByPlatform(index ImageIndex, p Platform) ([]v1.Image, error) {
return FindImages(index, match.Platform(p))
}
func FindImagesByName(index ImageIndex, name string) ([]v1.Image, error) {
return FindImages(index, match.Name(name))
}
func FindIndexesByPlatform(index ImageIndex, p Platform) ([]v1.ImageIndex, error) {
return FindIndexes(index, match.Platform(p))
}
func FindIndexesByName(index ImageIndex, name string) ([]v1.ImageIndex, error) {
return FindIndexes(index, match.Name(name))
}
And do we need the single versions of those?
func FindManifests(index v1.ImageIndex, matcher match.Matcher) ([]v1.Descriptor, error)
func FindImages(index v1.ImageIndex, matcher match.Matcher) ([]v1.Image, error)
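// FindManifest returns the first matching descriptor, or nil if none match.
// v1.Descriptor is a struct, so a pointer is returned to allow the nil results.
func FindManifest(index v1.ImageIndex, matcher match.Matcher) (*v1.Descriptor, error) {
	manifests, err := FindManifests(index, matcher)
	if err != nil {
		return nil, err
	}
	if len(manifests) > 0 {
		return &manifests[0], nil
	}
	return nil, nil
}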
func FindImage(index v1.ImageIndex, matcher match.Matcher) (v1.Image, error) {
// etc.
}
func FindIndex(index v1.ImageIndex, matcher match.Matcher) (v1.ImageIndex, error) {
}
func FindImageByPlatform(index ImageIndex, p Platform) (v1.Image, error) {
return FindImage(index, match.Platform(p))
}
func FindImageByName(index ImageIndex, name string) (v1.Image, error) {
return FindImage(index, match.Name(name))
}
func FindIndexByPlatform(index ImageIndex, p Platform) (v1.ImageIndex, error) {
return FindIndex(index, match.Platform(p))
}
func FindIndexByName(index ImageIndex, name string) (v1.ImageIndex, error) {
return FindIndex(index, match.Name(name))
} |
This doesn't feel quite right to me.
I'd say
I don't think we'd want these given how well
I'd just leave it out for now. Given that we don't recurse and just iterate over one level, taking the 0th element of the result seems totally reasonable to me. If someone complains this is slow for their 10,000 entry index.json, we can come back to it. |
If we find someone using this for a 10,000 entry |
OK, I added in What is Unless you are thinking it is useful for when I have a |
I never argue with less work. With the present ones, we have the capabilities, so those would just be convenience. If we find ourselves or others regularly doing boilerplate, we can worry about adding them later. |
Oh, so you are saying, inside
So was I. I actually thought about having both of those functions But I will admit that just checking inside the funcs avoids the issue altogether. Updating soon. |
a453c0a to f06d938
OK, have another look.
This looks much cleaner, I must say |
Rebased on latest |
Yes exactly. It's not really that useful, but the type signatures line up and it follows the pattern. If I find myself really wanting
+1 👍 I'll probably refactor some stuff to use these in a few places 😄 |
LGTM other than a couple nits and tests 👍
I accepted all of your changes, but it then turned it into 4 new commits in addition to the base one. We probably should squash them down. And if I could figure out how to pull it back locally, I would. |
Hah, that's fine. They're trivial changes, so I'd just force push over them probably. |
OK got it. |
What else do you think we need here @jonjohnsonjr ? |
I think just some test coverage and it'll be ready for merge.
Now you're really pushing your luck! 😂 |
yeah, but it is such a pain, since you need to construct actual |
|
This line here:
manifest.Manifests = append(manifest.Manifests, v1.Descriptor{
Digest: digest,
Size: size,
MediaType: mediaType,
})
I need to change it for each manifest, so I have what to match against. It needs platform or annotations, etc. I cannot quite pass it in, since that would mess up the signature (and it needs to be different for each one). I think we need a way to modify them?
Not really, you just need any If you want to make sure it "really works", I'd do something super lazy like:
And confirm that |
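One possible shape for such a lazy test fixture, sketched with stand-in types rather than the real `v1` package. `fakeIndex`, `manifestLister` and `byArch` are hypothetical names. The idea is that the helper only needs something that returns an index manifest, so a test can hand it hand-written descriptors with whatever platform or annotations it wants to match against.

```go
package main

// Platform, Descriptor and IndexManifest are pared-down stand-ins for the
// v1 types, keeping only fields a matcher would inspect.
type Platform struct {
	OS           string
	Architecture string
}

type Descriptor struct {
	Digest      string
	Platform    *Platform
	Annotations map[string]string
}

type IndexManifest struct {
	Manifests []Descriptor
}

// fakeIndex is a hypothetical test double that serves a fixed manifest list.
type fakeIndex struct {
	manifest IndexManifest
}

func (f fakeIndex) IndexManifest() (*IndexManifest, error) {
	return &f.manifest, nil
}

// manifestLister is the only behaviour FindManifests needs from an index.
type manifestLister interface {
	IndexManifest() (*IndexManifest, error)
}

// FindManifests mirrors the helper in this PR against the stand-in types.
func FindManifests(idx manifestLister, matcher func(Descriptor) bool) ([]Descriptor, error) {
	m, err := idx.IndexManifest()
	if err != nil {
		return nil, err
	}
	var out []Descriptor
	for _, d := range m.Manifests {
		if matcher(d) {
			out = append(out, d)
		}
	}
	return out, nil
}

// byArch is a hypothetical matcher; descriptors without a platform never match.
func byArch(arch string) func(Descriptor) bool {
	return func(d Descriptor) bool {
		return d.Platform != nil && d.Platform.Architecture == arch
	}
}
```

A test can then build a `fakeIndex` with one descriptor per platform and assert on the digests that come back, without constructing real images.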
I had to do some strange construction to get We have |
Signed-off-by: Avi Deitcher <[email protected]> Co-authored-by: jonjohnsonjr <[email protected]> Signed-off-by: Avi Deitcher <[email protected]>
Thinking about #835 and this, @jonjohnsonjr . Once that is in, we could have the following:
func FindResolvable(index ImageIndex, matcher match.Matcher) ([]v1.ResolvableDescriptor, error)
And then you end up with the four:
The last one closes it out nicely, and is the equivalent of |
Oh good, CI is green. |
Not sure about the name but I agree with idea. |
Thanks!
Neither am I. Let's move this discussion over to #835 , which is where I took the name from. |
Add FindManifests utility, that allows one to pass a v1.ImageIndex and a match.Matcher, and get back the v1.Descriptor entries in the index that match.
This is a spin-out from #823 and will include the salient original comments below.
We expect to have several more such utilities that build on Matcher in this PR and/or others before it is done.