ARG/ENV used in script does not invalidate build cache #1688

Closed
madawei2699 opened this issue Jul 8, 2021 · 4 comments · Fixed by #1693

Comments

@madawei2699
Contributor

Actual behavior
An ARG/ENV value that is used only inside a script does not invalidate the build cache. For example:

Dockerfile:

FROM ubuntu
ADD ./test.sh /
RUN cat /test.sh
ARG CONT_IMG_VER
RUN sh /test.sh

CMD ["echo", "test cache"]

test.sh

#!/bin/sh
echo 'start test'
echo ${CONT_IMG_VER} > /tmp/test.txt

When kaniko builds this image with different ARG values, it still uses the cached layer.

Expected behavior
When docker builds the same image with different ARG values, it does NOT use the cached layer. For example:

docker build -t hello_test . --build-arg CONT_IMG_VER=4444
docker build -t hello_test . --build-arg CONT_IMG_VER=333

Output

......
 => [1/4] FROM docker.io/library/ubuntu@sha256:aba80b77e27148d99c034a987e7da3a287ed455390352663418c0f2ed40417fe                                                                                                                          0.0s
 => [internal] load build context                                                                                                                                                                                                        0.0s
 => => transferring context: 28B                                                                                                                                                                                                         0.0s
 => CACHED [2/4] ADD ./test.sh /                                                                                                                                                                                                         0.0s
 => CACHED [3/4] RUN cat /test.sh                                                                                                                                                                                                        0.0s
 => [4/4] RUN sh /test.sh (!! this step is not cached, which is expected !!)                                                                                                                                                             0.7s
 => exporting to image                                                                                                                                                                                                                   0.1s
 => => exporting layers                                                                                                                                                                                                                  0.1s
 => => writing image sha256:701b17959ecb5594f6cc1cbc136d107dbe8892432c0361505abce6cf9df1f7d0                                                                                                                                             0.0s
 => => naming to docker.io/library/hello_test

To Reproduce
Steps to reproduce the behavior:

  1. Create Dockerfile with content
    Dockerfile:
FROM ubuntu
ADD ./test.sh /
RUN cat /test.sh
ARG CONT_IMG_VER
RUN sh /test.sh

CMD ["echo", "test cache"]

test.sh

#!/bin/sh
echo 'start test'
echo ${CONT_IMG_VER} > /tmp/test.txt
  2. Build the image a first time with arg/env var CONT_IMG_VER=1111 (cache flag enabled)
  3. Build the image a second time with arg/env var CONT_IMG_VER=2222 (cache flag enabled)
  4. Expected: RUN sh /test.sh does NOT use the cached layer (the build log should not show "Found cached layer, extracting to filesystem")
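
The two cached builds in the steps above could be driven by something like the following sketch. It only prints the kaniko executor command lines rather than running them. The destination registry is a hypothetical placeholder, and the /workspace paths assume the build context is mounted there inside the executor container.

```shell
#!/bin/sh
# Hypothetical destination; replace with a registry you can push to.
DESTINATION="${DESTINATION:-registry.example.com/hello_test:latest}"

# Print the kaniko executor invocation for one CONT_IMG_VER value.
# Assumes the Dockerfile and test.sh are available under /workspace.
kaniko_cmd() {
  printf '%s\n' "/kaniko/executor --context=dir:///workspace --dockerfile=/workspace/Dockerfile --destination=${DESTINATION} --cache=true --build-arg CONT_IMG_VER=$1"
}

kaniko_cmd 1111   # first build, populates the cache
kaniko_cmd 2222   # second build, should not reuse the RUN sh /test.sh layer
```

Running the printed commands inside the executor container and comparing the two build logs shows whether the second build reports a cached layer for RUN sh /test.sh.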

Additional Information

  • Dockerfile
FROM ubuntu
ADD ./test.sh /
RUN cat /test.sh
ARG CONT_IMG_VER
RUN sh /test.sh

CMD ["echo", "test cache"]

Triage Notes for the Maintainers

Description Yes/No
Please check if this a new feature you are proposing
Please check if the build works in docker but not in kaniko
Please check if this error is seen when you use --cache flag
Please check if your dockerfile is a multistage dockerfile
@madawei2699
Contributor Author

After some code investigation, I found that kaniko computes the cache layer's CompositeKey from the command (RUN/COPY), ARG, and ENV. This means it does not take into account a script that uses an ARG when that script is only called by the command. So the cache can be invalidated this way:

FROM ubuntu
ADD ./test.sh /
RUN cat /test.sh
ARG CONT_IMG_VER
RUN echo ${CONT_IMG_VER} && sh /test.sh

CMD ["echo", "test cache"]

The point is that echo ${CONT_IMG_VER} references the ARG directly in the command, which triggers cache invalidation.

I do not know whether there is a better way to solve this issue, because this workaround is hard to understand for someone without context. Please let me know if you have a better idea.
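
To illustrate why the workaround helps, here is a toy model (not kaniko's actual CompositeKey code) in which the key is a checksum of the command text after ${CONT_IMG_VER} has been substituted. The function name toy_key is made up for this sketch, and cksum stands in for a real hash.

```shell
#!/bin/sh
# Toy model only: key = checksum of the RUN command text after the ARG
# reference has been replaced by its value. Not kaniko's real algorithm.
toy_key() {
  cmd=$1
  ver=$2
  # Replace the literal text ${CONT_IMG_VER} with the supplied value.
  resolved=$(printf '%s' "$cmd" | sed "s/[\$]{CONT_IMG_VER}/${ver}/g")
  printf '%s' "$resolved" | cksum | cut -d' ' -f1
}

# The command text never mentions the ARG, so both keys are identical.
toy_key 'sh /test.sh' 1111
toy_key 'sh /test.sh' 2222

# The ARG appears in the command text, so the keys differ.
toy_key 'echo ${CONT_IMG_VER} && sh /test.sh' 1111
toy_key 'echo ${CONT_IMG_VER} && sh /test.sh' 2222
```

Under this model the first pair prints the same key twice (a cache hit on the second build), while the second pair prints two different keys (a cache miss).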

@PMExtra

PMExtra commented Aug 5, 2021

Same issue. I'm looking forward to this fix.

@zzh8829

zzh8829 commented Dec 17, 2021

This issue was introduced by #1008 #1085. I added my comments on the old PR but I'll reiterate here. The current behavior was introduced in that PR as a fix for people complaining about ARG causing excessive cache invalidation. But according to the Dockerfile standard:

https://docs.docker.com/engine/reference/builder/#impact-on-build-caching

In particular, all RUN instructions following an ARG instruction use the ARG variable implicitly (as an environment variable), thus can cause a cache miss. All predefined ARG variables are exempt from caching unless there is a matching ARG statement in the Dockerfile.

In the old fix, only explicit ARG usage is treated as a cache miss, whereas under the Docker standard every RUN statement should in fact trigger a cache miss, because any script in a RUN could use the ARG without an explicit reference.
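
As a rough sketch of the Docker-style rule quoted above (again a toy model, not Docker's or kaniko's real implementation), the key for a RUN step can fold in every ARG declared before it, whether or not the command text mentions it. The name docker_style_key is hypothetical.

```shell
#!/bin/sh
# Toy model only: key = checksum of the RUN command plus every
# NAME=value pair for ARGs declared earlier in the Dockerfile.
docker_style_key() {
  cmd=$1
  shift
  # Remaining arguments are the in-scope ARGs, one NAME=value per line.
  printf '%s\n' "$cmd" "$@" | cksum | cut -d' ' -f1
}

# Same command text, different ARG value: the keys differ, so the
# second build misses the cache even though the script hides the ARG.
docker_style_key 'sh /test.sh' 'CONT_IMG_VER=1111'
docker_style_key 'sh /test.sh' 'CONT_IMG_VER=2222'
```

The trade-off is the one mentioned in the comment: this rule invalidates more often, since any change to an in-scope ARG busts every later RUN layer.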

@PMExtra

PMExtra commented Dec 27, 2021

This issue was introduced by #1008 #1085. I added my comments on the old PR but I'll reiterate here. The current behavior was introduced in that PR as a fix for people complaining about ARG causing excessive cache invalidation. But according to the Dockerfile standard:

https://docs.docker.com/engine/reference/builder/#impact-on-build-caching

In particular, all RUN instructions following an ARG instruction use the ARG variable implicitly (as an environment variable), thus can cause a cache miss. All predefined ARG variables are exempt from caching unless there is a matching ARG statement in the Dockerfile.

In the old fix, only explicit ARG usage is treated as a cache miss, whereas under the Docker standard every RUN statement should in fact trigger a cache miss, because any script in a RUN could use the ARG without an explicit reference.

Maybe we can provide an option to indicate whether ARG is exempt from caching.
