synthetic runtime mounts are again being serialized into layers #5592

Closed
cgwalters opened this issue Jun 16, 2024 · 4 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@cgwalters

I was trying to write some guidance on reproducible/optimized container builds, and ran headfirst into an old problem: the issue where podman/buildah inject generated internal tmpfs mount content into the tar stream, but docker doesn't, is back (or maybe it was never really fixed; I didn't double check at the time):

This Containerfile is constructed so that it should produce a reproducible tar stream on every build (i.e. two runs of podman build --no-cache should result in the same diffid):

$ cat Containerfile
FROM busybox
RUN echo hello world > /test.txt && touch -r /usr /test.txt

(We could of course just run touch, but I like to use this to demonstrate how touch -r can canonicalize timestamps in a less trivial use case, such as after running curl or whatever)
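
To make the touch -r idea concrete outside of a container build, here is a minimal sketch. It assumes GNU coreutils (touch -d @EPOCH and stat -c %Y); the temporary directory, the stand-in usr directory, and the epoch value are all made up for illustration and are not part of the original report.

```shell
#!/bin/sh
# Sketch: copy a reference path's mtime onto a freshly written file.
# Assumes GNU coreutils (touch -d @EPOCH, stat -c %Y); all paths are hypothetical.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for a path from the base image with a stable timestamp (like /usr).
mkdir "$work/usr"
touch -d @1684449240 "$work/usr"

# Stand-in for content produced at build time (for example, a download).
echo "hello world" > "$work/test.txt"

# Canonicalize: give the new file the reference path's timestamps.
touch -r "$work/usr" "$work/test.txt"

ref_mtime=$(stat -c %Y "$work/usr")
new_mtime=$(stat -c %Y "$work/test.txt")
echo "reference=$ref_mtime file=$new_mtime"
```

After this runs, the new file carries the reference directory's modification time rather than the time it was written, which is the property the RUN line in the Containerfile relies on.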

$ rpm -q podman
podman-5.1.0-1.fc40.aarch64
$ podman build  -t localhost/test:v0 -f Containerfile --no-cache .
...
$ podman build  -t localhost/test:v1 -f Containerfile --no-cache .
...
$ podman image diff localhost/test:v0 localhost/test:v1
C /etc
C /run/systemd
C /run/systemd/resolve
C /run/systemd/resolve/stub-resolv.conf
$

(The /etc there is really /etc/hostname; not sure why the diff is apparently recursive in the /run case but not the /etc case)

It's not just the presence of this cruft that's problematic, it's that the build process serializes the current time into the tar stream for them, which defeats reproducible builds.
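
The effect can be reproduced without any container tooling. The sketch below builds three directory trees with identical content and compares tar digests; only the mtime of one directory differs between trees a and b. It assumes GNU tar (--sort=name, --format=ustar) and GNU coreutils; the tree layout and epoch values are hypothetical stand-ins, and this is not how podman or buildah actually produce layers.

```shell
#!/bin/sh
# Sketch: identical content, different directory mtime => different tar digest.
# Assumes GNU tar and GNU coreutils; paths and epochs are hypothetical.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

make_tree() {
    # $1 = tree directory, $2 = mtime (epoch seconds) for the "injected" directory
    mkdir -p "$1/run/systemd/resolve"
    echo "hello world" > "$1/test.txt"
    # Pin every path to one fixed time, then stamp the injected directory separately.
    find "$1" -exec touch -d @1000000000 {} +
    touch -d "@$2" "$1/run/systemd/resolve"
}

digest() {
    # Normalize ordering and ownership so that only content and mtimes can differ.
    tar --format=ustar --sort=name --owner=0 --group=0 --numeric-owner \
        -cf - -C "$1" . | sha256sum | cut -d' ' -f1
}

make_tree "$work/a" 1700000000
make_tree "$work/b" 1700000060   # same content, "built" one minute later
make_tree "$work/c" 1700000000   # same content, same timestamps as a

digest_a=$(digest "$work/a")
digest_b=$(digest "$work/b")
digest_c=$(digest "$work/c")
echo "a=$digest_a"
echo "b=$digest_b"
echo "c=$digest_c"
```

Trees a and c hash identically, while b differs purely because one directory carries a later timestamp, which is the same mechanism that changes the diffid between two otherwise identical builds.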

Now, running podman build --timestamp=<something> will paper over this, but that's a big/crude hammer, and while I've been recommending it in some places, I am pretty sure it can easily introduce the same issues with e.g. Python that we've seen in ostree (ref ostreedev/ostree#1469).


(Time passes)

Oh hey, I went to double check vs the latest docker (26.1.4), and it has a different variant of this bug where it apparently serializes just the top-level mount directories it injected at build time:

$ tar tvf test/blobs/sha256/1ed2a*
drwxr-xr-x 0/0               0 2024-06-16 07:13 etc/
drwxr-xr-x 0/0               0 2024-06-16 07:13 proc/
drwxr-xr-x 0/0               0 2024-06-16 07:13 sys/
-rw-r--r-- 0/0              12 2023-05-18 22:34 test.txt

I'm pretty sure this is a regression on their side, but not sure I care enough to dig up the version of docker I used in 2021 to double check.
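
One way to spot this kind of entry in a layer is to filter the tar listing by date. The following is a rough sketch that assumes GNU tar (--full-time, --sort) and a POSIX awk; it builds a stand-in archive shaped like the listing above rather than reading a real image blob, and the cutoff date is an arbitrary choice for illustration.

```shell
#!/bin/sh
# Sketch: flag tar entries whose mtime is on or after a cutoff date.
# Assumes GNU tar and a POSIX awk; the archive here is a hypothetical stand-in.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# One file with an old, canonicalized mtime, plus directories stamped
# with a later "build time".
mkdir -p "$work/layer/etc" "$work/layer/proc" "$work/layer/sys"
echo "hello world" > "$work/layer/test.txt"
TZ=UTC touch -d '2023-05-18 22:34:00' "$work/layer/test.txt"
TZ=UTC touch -d '2024-06-16 07:13:00' \
    "$work/layer/etc" "$work/layer/proc" "$work/layer/sys"
tar --sort=name --owner=0 --group=0 --numeric-owner \
    -cf "$work/layer.tar" -C "$work/layer" etc proc sys test.txt

# With --full-time the listing columns are: mode, owner/group, size, date, time, name.
# ISO dates compare correctly as strings, so a plain string comparison is enough.
# Names containing spaces would need more careful parsing than this.
cutoff='2024-01-01'
suspects=$(TZ=UTC tar --full-time -tvf "$work/layer.tar" |
    awk -v cutoff="$cutoff" '($4 "") >= cutoff { print $6 }')
printf '%s\n' "$suspects"
```

Against a real image, the same pipeline could be pointed at a layer blob from an exported image, with the cutoff set just before the build started.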

@cgwalters cgwalters added the kind/bug label Jun 16, 2024
@cgwalters
Author

(The /etc there is really /etc/hostname; not sure why the diff is apparently recursive in the /run case but not the /etc case)

Ahh, this logic ultimately comes from fbd1392a46558eb4adb368ba37fdce2b45013c1f which tried to paper over this underlying bug.

@mheon
Member

mheon commented Jun 18, 2024

@giuseppe @nalind PTAL

cgwalters added a commit to cgwalters/bootc that referenced this issue Jun 18, 2024
- Pass `--timestamp` to podman build to squash timestamps in
  order to gain reproducibility, working around containers/buildah#5592
- Fix the xattr reading code to correctly skip trailing nils

Signed-off-by: Colin Walters <[email protected]>

A friendly reminder that this issue had no activity for 30 days.

@cgwalters
Author

Ah sorry there are duplicate issues, closing in favor of #4242

@cgwalters cgwalters closed this as not planned Sep 12, 2024