Fix c/storage destination with partial pulls #2288
Commits on Feb 13, 2024
dest: propagate layer index to PutBlobPartial
Signed-off-by: Giuseppe Scrivano <[email protected]>
Commit 11630be
storage, dest: clarify when TOCDigest is used
Separate the TOC digest from the uncompressed/compressed digest in blob handling. Follow-up for: containers#1080. Signed-off-by: Giuseppe Scrivano <[email protected]>
Commit 6da4441
Remove an unnecessary variable
Should not change behavior. Signed-off-by: Miloslav Trmač <[email protected]>
Commit ca9c3e5
Don't use the same DriverWithDifferOutput more than once
ApplyDiffFromStagingDirectory Rename()s diffOutput.Target to move it into the destination layer, so the diffOutput is not reusable. That might mean that we re-pull (partially?) the same partial layer again... but the same layer used more than once in an image should be rare, or at least very small. Signed-off-by: Miloslav Trmač <[email protected]>
Commit b6062e1
Introduce private.PutBlobPartialOptions
This will allow us to name the more obscure parameters, and to change their names/semantics without having to update the 4 trivial implementations. Should not change behavior. Signed-off-by: Miloslav Trmač <[email protected]>
Commit 5df5391
Make the index option in PutBlobPartialOptions mandatory
... because it is always available, and this allows us to remove a condition. Also rename it to LayerIndex, and make the Cache option first, for consistency with other private.*Options types. Should not change behavior. Signed-off-by: Miloslav Trmač <[email protected]>
Commit 981360c
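The options-struct pattern from the two commits above can be sketched as follows. Only the field names `Cache` and `LayerIndex` and their order come from the commit messages; the field types, the lowercase type name, and `putBlobPartial` are assumptions made for illustration:

```go
package main

import "fmt"

// blobInfoCache is a hypothetical placeholder for the real cache type.
type blobInfoCache interface{}

// putBlobPartialOptions is a sketch of an options struct: parameters are
// named at the call site, and fields can be added or renamed without
// changing the signature of every implementation.
type putBlobPartialOptions struct {
	Cache      blobInfoCache // listed first, as the commit message describes
	LayerIndex int           // mandatory: the caller always knows which layer this is
}

// putBlobPartial is a hypothetical trivial implementation that only reads
// the options it cares about.
func putBlobPartial(opts putBlobPartialOptions) string {
	return fmt.Sprintf("storing partial blob for layer %d", opts.LayerIndex)
}

func main() {
	fmt.Println(putBlobPartial(putBlobPartialOptions{Cache: nil, LayerIndex: 2}))
}
```

Because `LayerIndex` is a plain value rather than an optional pointer, implementations need no "is it set?" branch, which matches the condition the commit says it removes.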
It has no users at all; and transports should not be in the business of specifically managing that value; compression/decompression/ updates belong in transport-independent code. Should not change behavior. Signed-off-by: Miloslav Trmač <[email protected]>
Commit 7fdc3b5
Replace ReusedBlob.TOCDigest with MatchedByTOCDigest
The caller must already have provided options.TOCDigest, so we don't really need to return a value; the UI only needs a boolean. Also, document, again, that the non-TOC digest is a mandatory field. Should not change behavior. Signed-off-by: Miloslav Trmač <[email protected]>
Commit 1b0e6be
Rename indexToTocDigest to indexToTOCDigest
... to follow Go conventions a bit more closely. Also add a comment about the general trust design of layer storage/lookup/reuse mechanisms. Should not change behavior. Signed-off-by: Miloslav Trmač <[email protected]>
Commit d17c78f
Don't use diffOutputs in getLayerID
- It is unnecessary: We now only set diffOutputs in PutBlobPartial, and that also sets indexToTOCDigest
- It is incorrect: it is indexing by tocDigest, but it is (currently) set by compressedDigest
So, right now, it can only hurt things.
Signed-off-by: Miloslav Trmač <[email protected]>
Commit 83a035d
- Take TOC digests into account even if we are converting from OCI to another format
- Compute the image ID based on whether we _used_ the TOC, not whether it just exists.
- Also don't forget to include the config digest in the ID computation...
Signed-off-by: Miloslav Trmač <[email protected]>
Commit 649f3a5
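The distinction between a TOC digest that merely exists and one that was actually used can be sketched as below. The `layerInfo` type, the `imageID` function, and the hashing scheme are all assumptions made for illustration; the commit message does not describe the real ID format:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// layerInfo is a hypothetical per-layer record for this sketch.
type layerInfo struct {
	TOCDigest string // TOC digest named by the manifest, "" if absent
	UsedTOC   bool   // true only if the layer was actually consumed via its TOC
}

// imageID derives an ID from the config digest. When at least one layer was
// consumed via its TOC, the TOC digests that were used are mixed in together
// with the config digest. The hashing scheme is an assumption of this sketch.
func imageID(configDigest string, layers []layerInfo) string {
	var b strings.Builder
	usedAny := false
	fmt.Fprintf(&b, "config=%s\n", configDigest)
	for i, l := range layers {
		if l.UsedTOC {
			usedAny = true
			fmt.Fprintf(&b, "layer %d toc=%s\n", i, l.TOCDigest)
		}
	}
	if !usedAny {
		// No TOC was used: the ID depends on the config digest alone.
		return strings.TrimPrefix(configDigest, "sha256:")
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

func main() {
	cfg := "sha256:" + strings.Repeat("ab", 32)
	present := []layerInfo{{TOCDigest: "sha256:" + strings.Repeat("cd", 32), UsedTOC: false}}
	used := []layerInfo{{TOCDigest: "sha256:" + strings.Repeat("cd", 32), UsedTOC: true}}
	fmt.Println("TOC present but unused:", imageID(cfg, present))
	fmt.Println("TOC used:              ", imageID(cfg, used))
}
```

In this sketch, a TOC digest that is present but unused leaves the ID unchanged, while two images that differ only in their config still get different IDs on the TOC path.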
Remove lookups by TOC digest on the fallback code in commitImage
The fallback code should never be invoked at all, in principle; and it is only reachable by internal callers; we can fix them instead of having the fallback code. So, remove it, making the tocDigest field unnecessary. Signed-off-by: Miloslav Trmač <[email protected]>
Commit e8b6ba0
Remove a no-longer-used addedLayerInfo.tocDigest field
Now we have one less user of diffOutputs, making it easier to change it. Should not change behavior. Signed-off-by: Miloslav Trmač <[email protected]>
Commit 597dbc6
Make diffOutputs indexed by layer ID
Every DriverWithDifferOutput can only be used once; so we must not index it by digest. This also happens to fix indexing it by an untrusted compressed digest. Signed-off-by: Miloslav Trmač <[email protected]>
Commit fddf6f4
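Why a one-use value must not be keyed by digest can be seen with an image that lists the same layer digest twice. A minimal sketch, assuming a hypothetical `oneUseOutput` type and using the layer's position in the list as the per-layer key (the key type is an assumption of this sketch):

```go
package main

import "fmt"

// oneUseOutput is a hypothetical stand-in for a value that is consumed
// (and so invalidated) the first time it is applied.
type oneUseOutput struct {
	staging string
}

// indexByDigest keys outputs by digest: a repeated digest overwrites the
// earlier entry, so two layers end up sharing a single one-use value.
func indexByDigest(layerDigests []string) map[string]*oneUseOutput {
	m := map[string]*oneUseOutput{}
	for i, d := range layerDigests {
		m[d] = &oneUseOutput{staging: fmt.Sprintf("staging-%d", i)}
	}
	return m
}

// indexByLayer keys outputs per layer: every layer keeps its own value,
// even when digests repeat.
func indexByLayer(layerDigests []string) map[int]*oneUseOutput {
	m := map[int]*oneUseOutput{}
	for i := range layerDigests {
		m[i] = &oneUseOutput{staging: fmt.Sprintf("staging-%d", i)}
	}
	return m
}

func main() {
	layers := []string{"sha256:aaa", "sha256:bbb", "sha256:aaa"}
	fmt.Println("outputs when keyed by digest:", len(indexByDigest(layers)))
	fmt.Println("outputs when keyed per layer:", len(indexByLayer(layers)))
}
```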
- Fix incorrect uncompressedDigest parameter name; it is actually the manifest-originated probably-compressed digest
- Rename the function to not suggest it returns a storage.Layer.ID value
- Use the same parameter order as commitLayer
Signed-off-by: Miloslav Trmač <[email protected]>
Commit f55d2db
- If the first layer uses a TOC, don't use a non-hex string as the layer ID for c/storage
- Fix a panic on "".Hex() if we trigger the fallback "layer never seen" path
Signed-off-by: Miloslav Trmač <[email protected]>
Commit 41119d7
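Both problems in that commit message come down to deriving an ID from a digest string without validating it first. A defensive variant might look like the sketch below; `layerIDFromDigest` is a hypothetical helper written with the standard library only, not the function used in the PR:

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// layerIDFromDigest returns the hex part of an "algorithm:hex" digest for
// use as a layer ID. It is a hypothetical helper: empty or non-hex input
// yields an error instead of a panic or an invalid ID.
func layerIDFromDigest(d string) (string, error) {
	_, h, ok := strings.Cut(d, ":")
	if !ok || h == "" {
		return "", errors.New("digest is empty or has no ':' separator")
	}
	if _, err := hex.DecodeString(h); err != nil {
		return "", fmt.Errorf("digest %q has a non-hex value: %w", d, err)
	}
	return h, nil
}

func main() {
	for _, d := range []string{"sha256:0123abcd", "", "not-a-digest", "sha256:xyz"} {
		id, err := layerIDFromDigest(d)
		fmt.Printf("%q -> id=%q err=%v\n", d, id, err)
	}
}
```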
Split createNewLayer from commitLayer
... to separate the concerns a bit. Now we have the updates of indexToStorageID closer together. Signed-off-by: Miloslav Trmač <[email protected]>
Commit 992e548
Signed-off-by: Miloslav Trmač <[email protected]>
Commit 0097f83
Support pulling and pushing fully-consumed partial layers
... and identify them using UncompressedDigest, not TOCDigest. On pushes, also use the trusted UncompressedDigest if available instead of preferring the untrusted value when a TOC digest is present. Signed-off-by: Miloslav Trmač <[email protected]>
Commit a3222f6
Don't set filenames/fileSizes in PutBlobPartial
- Setting filenames to "" is clearly useless
- It is unnecessary: We set diffOutputs, so filenames are not consumed in commitLayer
- The data from diffOutputs can only be used once, so setting filenames to affect TryReusingBlob... doesn't help
Signed-off-by: Miloslav Trmač <[email protected]>
Commit 30646b5
Use a known uncompressed digest directly instead of reading it from a Layer
If we get the layer using LayersByUncompressedDigest, that value should always match. Using the value we have directly is trivially faster, and more importantly we don't have to worry at all about Layer.UncompressedDigest being unset in that location, making maintenance easier. Should not change behavior. Signed-off-by: Miloslav Trmač <[email protected]>
Commit 8a74ec9
Move layer queue data out of the per-layer fields
Should not change behavior. Signed-off-by: Miloslav Trmač <[email protected]>
Commit 1f3b20c
Fix reuse of existing layers twice in the same image
- When we extract a layer, allow reusing it only by the DiffID, not by the compressed digest; we don't have the compressed data, and reusing by compressed digest would result (via PutLayer LayerOptions.OriginalDigest) in a layer with a compressed CompressedDigest value, but an uncompressed CompressedSize value. Reuse by DiffID is quite a bit less likely to lead to a match in TryReusingBlob, probably causing us to find the reused layer and having to extract it again. We could improve on this by recording more data; for now, let's just assume that images which reuse the same compressed layer twice are pretty rare, and prefer simpler code.
- On the positive side, record the item in fileSizes, so that we actually do find the layer in TryReusing, and not happen to reuse the file purely by accident.
Signed-off-by: Miloslav Trmač <[email protected]>
Commit 5567453
Hold a Layer object when extracting its contents during commit
That will allow us to read more data out of it. Signed-off-by: Miloslav Trmač <[email protected]>
Commit fbd8474
Don't require a diffID if blobAdditionalLayer is set
We don't need it for anything, so shorten the scope. Should not change behavior, diffID is actually set on that path anyway. Signed-off-by: Miloslav Trmač <[email protected]>
Commit 2db6b8e
Document/consolidate layer identification
- Ensure layers have an ID on every path before commitLayer, and it is consistently set before making data available.
- Remove possibly misleading "for completeness" comments, ensuring identification is a basic responsibility.
Signed-off-by: Miloslav Trmač <[email protected]>
Commit 4b9bf31
Document origins of layer data
Should not change behavior. Signed-off-by: Miloslav Trmač <[email protected]>
Commit 5b14e3d
Fix a mismatch in CompressedDigest/CompressedSize on layer reuse
When reusing contents from another layer, don't set
- CompressedDigest = compressed digest (provided by us)
- CompressedSize = uncompressed size (computed by PutLayer)
because it is inconsistent. A c/storage API extension would be required to do that.
Signed-off-by: Miloslav Trmač <[email protected]>
Commit 846520d
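The consistency rule behind that fix can be sketched as follows: the compressed digest and compressed size are either both recorded or both left unset. The `layerRecord` type, its "unset" conventions (`""` and `-1`), and both functions are assumptions made for illustration, not c/storage types:

```go
package main

import "fmt"

// layerRecord is a hypothetical subset of the metadata stored per layer.
type layerRecord struct {
	UncompressedDigest string
	CompressedDigest   string // "" when unknown
	CompressedSize     int64  // -1 when unknown
}

// recordForReusedContents builds the metadata for a layer created from the
// extracted contents of another layer. No compressed stream is available on
// that path, so both compressed fields stay unset rather than pairing a
// compressed digest with a size measured on uncompressed data.
func recordForReusedContents(diffID string) layerRecord {
	return layerRecord{UncompressedDigest: diffID, CompressedDigest: "", CompressedSize: -1}
}

// consistent reports whether the compressed fields are both set or both unset.
func consistent(r layerRecord) bool {
	return (r.CompressedDigest == "") == (r.CompressedSize < 0)
}

func main() {
	reused := recordForReusedContents("sha256:1111")
	halfSet := layerRecord{UncompressedDigest: "sha256:1111", CompressedDigest: "sha256:2222", CompressedSize: -1}
	fmt.Println("reused record consistent:", consistent(reused))
	fmt.Println("half-set record consistent:", consistent(halfSet))
}
```

Note that `consistent` only catches a half-set pair; it cannot detect a size that was measured on the wrong data, which is why the sketch avoids setting either field on the reuse path.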
Actually make reusing layers found by TOC work
Look up the layer by TOC; and don't abort when diffID is not set. We could, instead, look up the layers only in tryReusingBlobAsPending, and record the layer metadata at that time. That would be simpler, but it would also widen a race with concurrent image pulls/deletions: the current code can find one layer (ChainID) to reuse, and when that layer is deleted, it can find some other layer (ChainID) to actually consume. The time between tryReusingBlobAsPending and createNewLayer can be fairly significant, so opening a ~deterministic race significantly more might lead to reproducible issues. Even if anyone encountering such issues has fundamental workflow problems that should be fixed, it is our tools that would look bad first. Signed-off-by: Miloslav Trmač <[email protected]>
Commit e7a8eba