fix child manifest handling #199
This makes the Contains() method much more intelligent about which pieces of the GCR payload were matched (found) in a given promoter manifest.
Otherwise, the unit tests for Audit() will require actually fetching data from GCR.
Also make Auditor() into a method on the ServerContext to keep the args list minimal.
This way, we can examine the logged output during tests after the logging has finished.
This part of the code deals with handling child image manifests whose digests are NOT recorded in the promoter manifests [1]. Such use cases are perfectly valid because it may be the case that users only want to track the parent digest, not all of the component child digests as well. So in the event that a child image's digest is received by the auditor, it must only reject if the corresponding parent digest cannot be found in the promoter manifests. The fix involves using the improved Contains() method (which now has much higher granularity about how close of a match the GCR payload for the child image had with a promoter manifest) instead of using a very crude suffix matching algorithm. [1]: kubernetes-sigs#165
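To make the acceptance rule above concrete, here is a minimal Go sketch; it is not the PR's code. `verifyDigest`, the `promoted` set, and the `parentOf` map are hypothetical names. In particular, how the auditor learns a child's parent digest is assumed away as a prebuilt map.

```go
package main

import "fmt"

// verifyDigest reports whether a digest received by the auditor should be
// accepted. All names here are hypothetical:
//   - promoted is the set of digests recorded in the promoter manifests
//   - parentOf maps a child image digest to its parent digest
func verifyDigest(digest string, promoted map[string]bool, parentOf map[string]string) bool {
	// A digest that is tracked directly is always fine.
	if promoted[digest] {
		return true
	}
	// Otherwise accept only if it is a child whose parent is tracked.
	parent, isChild := parentOf[digest]
	return isChild && promoted[parent]
}

func main() {
	promoted := map[string]bool{"sha256:parent": true}
	parentOf := map[string]string{
		"sha256:child":  "sha256:parent",
		"sha256:orphan": "sha256:untracked",
	}
	for _, d := range []string{"sha256:parent", "sha256:child", "sha256:orphan", "sha256:unknown"} {
		fmt.Printf("%s accepted=%v\n", d, verifyDigest(d, promoted, parentOf))
	}
}
```

Only the orphan (parent not tracked) and the unknown digest are rejected in this sketch.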
/assign
deps = [
    "//lib/dockerregistry:go_default_library",
    "//lib/logclient:go_default_library",
    "//lib/remotemanifest:go_default_library",
    "//lib/report:go_default_library",
],
Hi @listx. This isn't really a review comment, but I've decided to try to understand Bazel better. Could you point me to documentation on what these do?
The deps is just a list of dependencies. The auditor.go file (which is part of package audit) now depends on these (new) local dependencies. BTW I didn't put this in manually; I just ran bazel run //:gazelle and it generated all of these Bazel imports for me automatically.
lib/audit/auditor.go
Outdated
  for _, manifest := range manifests {
-     if manifest.Contains(*gcrPayload) {
+     if manifest.Contains(*gcrPayload)&(reg.FlagDigestMatched|reg.FlagTagMatched) != 0 {
I really like the use of the bitwise operator, but I feel it can be problematic for some of the less experienced contributors. What do you think about adding some examples of how it should behave in the comment above? Let's treat this as a starting point for discussion, not a suggestion yet.
I can add more explanation where I declare FlagDigestMatched instead. I'd prefer adding comments in a central place (where the flags are declared) over where they are used.
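For readers less familiar with bit flags, a minimal sketch of how such a check behaves follows. The flag names come from this PR, but the `MatchFlags` type, the bit values, and the `anySet` helper are assumptions made for illustration only.

```go
package main

import "fmt"

// MatchFlags is a hypothetical bitset type; the PR's real type may differ.
type MatchFlags uint

const (
	FlagPathMatched   MatchFlags = 1 << iota // binary 001
	FlagDigestMatched                        // binary 010
	FlagTagMatched                           // binary 100
)

// anySet reports whether at least one bit of mask is set in m.
// It is the same test as `m&(A|B) != 0` in the diff above.
func anySet(m, mask MatchFlags) bool {
	return m&mask != 0
}

func main() {
	mask := FlagDigestMatched | FlagTagMatched

	fmt.Println(anySet(FlagPathMatched, mask))                   // false: only the path bit is set
	fmt.Println(anySet(FlagPathMatched|FlagDigestMatched, mask)) // true: the digest bit is set

	var m MatchFlags
	m |= FlagTagMatched // "|=" turns a bit on without touching the others
	fmt.Println(anySet(m, mask)) // true: the tag bit is set
}
```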
lib/dockerregistry/inventory.go
Outdated
  if gcrPayload.Digest == fqin {
-     return true
+     r |= FlagDigestMatched
Same comment about the bitwise operator as above. Let's discuss whether we could provide some information about how it behaves for less experienced contributors.
  }
  }
- return false
+ return r
  }
This method is VERY complex. Do you think we can simplify it somehow? Maybe splitting it into smaller functions would help?
I can split it up at least 1 more level. ACK
I think this PR is ready for merge. I'll work on adding more test cases in subsequent PRs. /cc @justinsb
Contains() is now replaced with Match(). The function is no longer a receiver on the Manifest, but rather the GCRPubSubPayload. Also, a new flag has been introduced: "FlagTagMismatch", to better capture the range of matching results.
We now additionally capture and check against the error and alert loggers, as well as the report buffer. Also, use "iota" because otherwise the consts for IndexLogError and IndexLogAlert are also set to 0.
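The iota point can be illustrated with a short sketch. In Go, a constant in a parenthesized block that omits its expression repeats the previous expression, so `= 0` followed by bare names yields all zeros. `IndexLogInfo` and the `bad*` names are hypothetical; only `IndexLogError` and `IndexLogAlert` are named in the commit message.

```go
package main

import "fmt"

// Without iota, the omitted expressions repeat "= 0", so every constant is 0.
const (
	badInfo = 0
	badError
	badAlert
)

// With iota, the repeated expression is "iota", which increments per line.
const (
	IndexLogInfo = iota // hypothetical name for the first index
	IndexLogError
	IndexLogAlert
)

func main() {
	fmt.Println(badInfo, badError, badAlert)                // 0 0 0
	fmt.Println(IndexLogInfo, IndexLogError, IndexLogAlert) // 0 1 2
}
```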
This information is not really useful because we already log the found (if matched) parent digest in the "TRANSACTION VERIFIED ..." message.
FYI gonna sync up with @justinsb on Monday to try to simplify this. Setting this to work-in-progress to reflect that. /wip
lib/audit/auditor.go
Outdated
  for _, manifest := range manifests {
-     if manifest.Contains(*gcrPayload) {
+     m := gcrPayload.Match(manifest)
+     if m&(reg.FlagDigestMatched|reg.FlagTagMatched) != 0 &&
Idea: I do wonder if the flags would be better expressed as a struct with bool members, particularly as some of them have true for match and some true for mismatch.
But ... what if match returns (bool, Info), where Info is the bitset or struct with more info, and the bool is true if it's a match.
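One way the (bool, Info) shape could look is sketched below. This is only an illustration of the suggestion, not code from the PR: the `Payload` and `MatchInfo` types, the nested-map manifest, and the decision to treat a tag mismatch as "no match" are all assumptions.

```go
package main

import "fmt"

// Payload is a hypothetical, already-parsed notification.
type Payload struct {
	Path, Digest, Tag string
}

// MatchInfo is a hypothetical struct carrying the details behind the verdict.
type MatchInfo struct {
	PathMatched   bool
	DigestMatched bool
	TagMatched    bool
	TagMismatch   bool
}

// match returns an overall verdict plus the details. The manifest is
// modeled (as an assumption) as path -> digest -> tags.
func match(p Payload, manifest map[string]map[string][]string) (bool, MatchInfo) {
	var info MatchInfo
	digests, ok := manifest[p.Path]
	if !ok {
		return false, info
	}
	info.PathMatched = true
	tags, ok := digests[p.Digest]
	if !ok {
		return false, info
	}
	info.DigestMatched = true
	if p.Tag == "" {
		return true, info // digest-only payload
	}
	for _, t := range tags {
		if t == p.Tag {
			info.TagMatched = true
			return true, info
		}
	}
	info.TagMismatch = true // assumption: a wrong tag is not a match
	return false, info
}

func main() {
	manifest := map[string]map[string][]string{
		"gcr.io/example/foo": {"sha256:abc": {"v1"}},
	}
	ok, info := match(Payload{"gcr.io/example/foo", "sha256:abc", "v2"}, manifest)
	fmt.Printf("ok=%v info=%+v\n", ok, info)
}
```

The caller then branches on the bool alone and consults the struct only for logging.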
}
}
// If we can't find the source registry for this image, then reject the
// Find the subproject repository responsible for the GCRPubSubPayload. This
It might be nice to split out PubSub (a transport) from the idea of a "Change" we are validating. You could also then split this function into two, one that parses pubsub and one that validates a change.
// repository and manifest list.
//
// nolint[lll]
type GcrReadingFacility struct {
Nit: I don't love the name, but I'm not sure I understand it well enough to propose a better one. GcrReader?
lib/dockerregistry/inventory.go
Outdated
// FlagDigestMatched is set if the digest in the payload matches a digest in
// the promoter manifest. This is ONLY matched if the path also matches.
FlagDigestMatched
// FlagTagMatched is ONLY matched if the digest also matches.
I do think these rules suggest it isn't a bitset after all, hence why I suggested just returning (bool, Info)
lib/dockerregistry/inventory.go
Outdated
return r
}
}
return r
Is this ever non-zero?
lib/dockerregistry/inventory.go
Outdated
}
// Speed up the search by skipping over registry names whose leading
// characters do not match.
if !strings.HasPrefix(payload.Digest, (string)(rc.Name)) {
We're comparing Digest against name - is that deliberate?
lib/dockerregistry/inventory.go
Outdated
return r
}
}
return r
Same question here - is this ever non-zero? It makes the code harder to reason about if there's the potential for results carrying over
lib/dockerregistry/inventory.go
Outdated
var r GcrPayloadMatch

if !strings.Contains(payload.Digest, (string)(image.ImageName)) {
Again - digest vs image name? Needs a comment if it's right :-)
lib/dockerregistry/inventory.go
Outdated
r |= FlagPathMatched
for digest, tags := range image.Dmap {
fqin := ToFQIN(rc.Name, image.ImageName, digest)
// The payload.Digest field actually holds the FQIN.
Ah I see - and I presume that's an artifact of the GCR pubsub format? If so, let's definitely convert that notification into something that is easier to work with... even if we just rename Digest to FQIN
lib/dockerregistry/inventory.go
Outdated
return r
}
r |= FlagPathMatched
for digest, tags := range image.Dmap {
Do we need to loop? It looks like we are only interested in one key value...
Two concrete suggestions that I think aren't too big changes and could make this much clearer:
|
I think just replacing the bitset with a struct would be enough, but I'll try out your suggestion. I imagine this will just be a struct with some
This... Yes, it would make things a lot cleaner. I'll prioritize this change first. |
This is just a variable name change.
This method takes the FQIN and PQIN fields in the payload and uses them to generate much more user-friendly fields (path, digest, tag) to use internally within our codebase.
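A rough sketch of this kind of field extraction is shown below. It assumes an FQIN has the form `path@digest` and a PQIN the form `path:tag`; the helper names `splitFQIN` and `splitPQIN` are hypothetical and are not the method added in this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFQIN splits an assumed "path@digest" string into its two parts.
func splitFQIN(fqin string) (path, digest string, err error) {
	i := strings.LastIndex(fqin, "@")
	if i <= 0 || i == len(fqin)-1 {
		return "", "", fmt.Errorf("malformed FQIN %q", fqin)
	}
	return fqin[:i], fqin[i+1:], nil
}

// splitPQIN splits an assumed "path:tag" string into its two parts.
func splitPQIN(pqin string) (path, tag string, err error) {
	colon := strings.LastIndex(pqin, ":")
	slash := strings.LastIndex(pqin, "/")
	// The tag separator must follow a non-empty final path component;
	// a colon before the last slash is a registry port, not a tag.
	if colon <= slash+1 || colon == len(pqin)-1 {
		return "", "", fmt.Errorf("malformed PQIN %q", pqin)
	}
	return pqin[:colon], pqin[colon+1:], nil
}

func main() {
	path, digest, err := splitFQIN("gcr.io/example/foo@sha256:abc")
	fmt.Println(path, digest, err)

	path, tag, err := splitPQIN("gcr.io/example/foo:v1")
	fmt.Println(path, tag, err)
}
```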
This increases code readability, because we match on smaller pieces (digest, tag) directly. It also makes the code faster because we no longer have to construct FQINs and PQINs. Also, we match more strictly for FlagPathMatched. Instead of just checking for a prefix, we match on the entire path for an exact match. This should make it more robust for cases where we have nested projects.
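The difference between prefix and exact path matching can be seen in a small sketch. The paths and the two helper functions are made up for illustration; they are not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixMatch is the looser check: any payload path that merely starts
// with the manifest path is treated as a match.
func prefixMatch(payloadPath, manifestPath string) bool {
	return strings.HasPrefix(payloadPath, manifestPath)
}

// exactMatch is the stricter check: the whole path must be identical.
func exactMatch(payloadPath, manifestPath string) bool {
	return payloadPath == manifestPath
}

func main() {
	manifestPath := "gcr.io/example/foo"
	payloads := []string{
		"gcr.io/example/foo",
		"gcr.io/example/foo/nested",
		"gcr.io/example/foobar",
	}
	for _, p := range payloads {
		fmt.Printf("%-28s prefix=%-5v exact=%v\n",
			p, prefixMatch(p, manifestPath), exactMatch(p, manifestPath))
	}
}
```

The nested and similarly named paths pass the prefix check but fail the exact one, which is the robustness gain described above.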
Since we are only interested in looking up a single digest (the one in the payload), we can remove the loop with a simple lookup.
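In sketch form, the change amounts to replacing a scan over the map with a direct key lookup. The `Digest` and `Tag` types and both function names here are hypothetical; the map stands in for `image.Dmap` from the diff.

```go
package main

import "fmt"

type Digest string
type Tag string

// findByLoop mirrors the old approach: visit every entry until the
// wanted digest turns up.
func findByLoop(dmap map[Digest][]Tag, want Digest) ([]Tag, bool) {
	for digest, tags := range dmap {
		if digest == want {
			return tags, true
		}
	}
	return nil, false
}

// findByLookup does the same work with a single map access.
func findByLookup(dmap map[Digest][]Tag, want Digest) ([]Tag, bool) {
	tags, ok := dmap[want]
	return tags, ok
}

func main() {
	dmap := map[Digest][]Tag{
		"sha256:abc": {"v1", "latest"},
		"sha256:def": {"v2"},
	}
	tags, ok := findByLookup(dmap, "sha256:abc")
	fmt.Println(tags, ok)
}
```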
Thinking about the The problem here is that there are 2 dimensions of matching:
It's not clear what |
@justinsb I've gone ahead with your second suggestion to rename the misnamed |
The fields have become a bit more verbose, but that's because of the extra fields we store. We can simplify how these messages are logged in the future.
This does a 1:1 replacement of bits with bools. No functional change.
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: justinsb, listx
This version includes a cip-auditor version with fixes to logging [1] and also child digest image detection [2]. [1]: kubernetes-sigs/promo-tools#206 [2]: kubernetes-sigs/promo-tools#199
This should fix #191
The problem was that the parsing around child manifests was not done correctly. I have added a unit test to cover this case.
Putting this up now for feedback. I still have to (1) add more unit tests for Audit() and (2) add an e2e test for the bug in #191. But I would like to do those in separate PRs as this PR is already pretty big.
/cc @justinsb @thockin @dims @bartsmykla