Eliminate potential duplicate files (layer) in cache and possibly apply consistent layout #1982

Open
chanseokoh opened this issue Sep 12, 2019 · 0 comments


For base image layers (downloaded from registries), the layout is

<cache dir>/layers/<(1) SHA of compressed layer blob>/<(2) SHA of uncompressed layer blob>

That is, (1) is a directory and (2) is a file name. (1) is directly visible in the manifest JSON. Different registries may use different compression levels or methods (we actually considered using a different compression level for layers from docker save), so I think it is hypothetically possible that we cache the same (uncompressed) layer file more than once. For example,

<cache dir>/layers/<SHA from compression level 1>/<SHA of same contents>
<cache dir>/layers/<SHA from compression level 2>/<SHA of same contents>
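To illustrate the hypothetical duplication, here is a minimal Python sketch (not Jib code). It assumes gzip compression and SHA-256 digests; `base_layer_path` and the `/cache` directory are made-up names for illustration only. The same uncompressed bytes, compressed at two different levels, map to two different directories that hold a file with the same name.

```python
import gzip
import hashlib
from pathlib import PurePosixPath


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def base_layer_path(cache_dir: str, uncompressed: bytes, level: int) -> PurePosixPath:
    """Hypothetical helper: builds <cache dir>/layers/<compressed SHA>/<uncompressed SHA>."""
    # mtime=0 keeps the gzip header deterministic across runs.
    compressed = gzip.compress(uncompressed, compresslevel=level, mtime=0)
    return (
        PurePosixPath(cache_dir)
        / "layers"
        / sha256_hex(compressed)
        / sha256_hex(uncompressed)
    )


layer = b"same layer contents\n" * 1000
fast = base_layer_path("/cache", layer, level=1)
best = base_layer_path("/cache", layer, level=9)

# The file name (uncompressed SHA) matches, but the parent directory
# (compressed SHA) does not, so the cache would hold two copies.
print(fast.name == best.name)
print(fast.parent == best.parent)
```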

Now, #1957 implements caching of local layers, and its layout is reversed:

<cache dir>/local/<SHA of uncompressed layer>/<SHA of compressed layer blob>

The reason is that we need to be able to query the cache using the SHA of an uncompressed layer first.
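As a rough sketch of why the order matters (again in Python, not Jib's actual code; the function names and placeholder SHA strings are assumptions), compare looking up a layer by its uncompressed SHA under each layout. The local layout resolves it with a single directory lookup, while the base image layout has to scan every compressed-SHA directory.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_local_layer(cache_dir: Path, uncompressed_sha: str) -> Optional[Path]:
    """Local layout: <cache dir>/local/<uncompressed SHA>/<compressed SHA>."""
    layer_dir = cache_dir / "local" / uncompressed_sha
    if not layer_dir.is_dir():
        return None
    # One directory lookup is enough; take the first cached file, if any.
    return next(iter(sorted(layer_dir.iterdir())), None)


def find_base_layer(cache_dir: Path, uncompressed_sha: str) -> Optional[Path]:
    """Base layout: <cache dir>/layers/<compressed SHA>/<uncompressed SHA>."""
    layers_dir = cache_dir / "layers"
    if not layers_dir.is_dir():
        return None
    # The compressed SHA is unknown, so every directory must be checked.
    for compressed_dir in sorted(layers_dir.iterdir()):
        candidate = compressed_dir / uncompressed_sha
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    # Placeholder names stand in for real SHA-256 digests.
    (cache / "local" / "uncompressed-sha").mkdir(parents=True)
    (cache / "local" / "uncompressed-sha" / "compressed-sha").write_bytes(b"blob")
    (cache / "layers" / "compressed-sha").mkdir(parents=True)
    (cache / "layers" / "compressed-sha" / "uncompressed-sha").write_bytes(b"blob")

    local_hit = find_local_layer(cache, "uncompressed-sha")
    base_hit = find_base_layer(cache, "uncompressed-sha")
    miss = find_local_layer(cache, "unknown-sha")
```

Both lookups find the blob here, but only the local layout does so without iterating over unrelated directories.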

It would be nice to have a consistent layout across the board. This could potentially remove the almost-identical duplicated code in #1957 by having both cases follow a single execution path.
