Support for extra environment variables #1214

Open
sgaist opened this issue Nov 3, 2022 · 0 comments

sgaist commented Nov 3, 2022

Proposed change

This is in the context of the pip buildpack and private pypi repositories.

There are cases where people mirror packages to their own repository or use forges like GitLab to host packages. For various reasons (internal policy, security, etc.), these package repositories may only be accessible using a password or token. Combining --trusted-host and --extra-index-url, or their environment variable counterparts (PIP_TRUSTED_HOST and PIP_EXTRA_INDEX_URL), makes it possible to use them. However, there is currently no way to pass these down to the Dockerfile that is generated by repo2docker.

Based on the pack command, we could re-use its idea of having --env and --env-file arguments. The former specifies a variable together with its value; if the value is empty, it is taken from the caller's environment. The latter reads a file containing classic VAR=VALUE lines and, again, takes any missing value from the caller's environment.
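To make the proposed semantics concrete, here is a minimal sketch of how the two arguments could be resolved. The function names `resolve_env_args` and `read_env_file` are hypothetical and not part of repo2docker. Treating both `VAR` and `VAR=` as "take it from the caller", skipping variables that are not set in the caller's environment, and ignoring `#` comment lines in the file are all assumptions made for illustration.

```python
import os


def resolve_env_args(entries, environ=None):
    """Resolve ``VAR=VALUE`` / ``VAR`` entries into a dict (hypothetical helper).

    Entries without a value (``VAR`` or ``VAR=``) are looked up in the
    caller's environment. Assumption: variables that are not set there
    are skipped rather than treated as an error.
    """
    environ = os.environ if environ is None else environ
    resolved = {}
    for entry in entries:
        # Split on the first "=" only, so values may contain "=" themselves.
        name, _, value = entry.partition("=")
        name = name.strip()
        if not name:
            raise ValueError(f"invalid environment entry: {entry!r}")
        if value == "":
            if name not in environ:
                continue
            value = environ[name]
        resolved[name] = value
    return resolved


def read_env_file(path):
    """Return the entries of an env file, one per non-empty line.

    Assumption: blank lines and lines starting with "#" are ignored.
    """
    with open(path, encoding="utf-8") as handle:
        lines = [line.strip() for line in handle]
    return [line for line in lines if line and not line.startswith("#")]
```

With this shape, the --env-file case reduces to `resolve_env_args(read_env_file(path))`, so both arguments share a single code path.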

One main issue with all of these is that, in any case, the information passed to build the Docker image will end up in one of the image layers, which might not be what is wanted, for example when a private registry requires credentials. Using a multistage build could be a solution, but it will require refactoring the generated Dockerfile. Reproducibility could also be considered harder to achieve, as some of the elements needed for the build could be considered missing.

A possible implementation could be:

  1. Create a "core" image named "base_image"
  2. Starting from "base_image", install the user-defined content in a build stage, using credentials if needed
  3. Build the final image from "base_image" by copying in the packages installed in step 2, without any of the build-only variables
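The three steps above could be sketched as follows. This is only an illustration of the multistage idea, not the Dockerfile that repo2docker generates: the helper `render_multistage_dockerfile`, the stage name `builder`, the `/opt/packages` prefix, and the `requirements.txt` path are all hypothetical. The build-only variables are declared with `ARG` in the intermediate stage, so their values would be supplied with `docker build --build-arg` and are not declared in the final stage.

```python
def render_multistage_dockerfile(base_image, build_env_names,
                                 install_prefix="/opt/packages"):
    """Render a three-step multistage Dockerfile as text (hypothetical sketch)."""
    for name in build_env_names:
        if not name.isidentifier():
            raise ValueError(f"invalid variable name: {name!r}")
    lines = [
        '# Step 1: the "core" image',
        f"FROM {base_image} AS base_image",
        "",
        "# Step 2: install user-defined packages, credentials are available here",
        "FROM base_image AS builder",
    ]
    # Each ARG is visible to RUN instructions of this stage only.
    lines += [f"ARG {name}" for name in build_env_names]
    lines += [
        "COPY requirements.txt /tmp/requirements.txt",
        f"RUN pip install --prefix={install_prefix} -r /tmp/requirements.txt",
        "",
        "# Step 3: final image, without the build-only variables",
        "FROM base_image",
        f"COPY --from=builder {install_prefix} {install_prefix}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_multistage_dockerfile(
        "python:3.11-slim",  # placeholder base image for the example
        ["PIP_EXTRA_INDEX_URL", "PIP_TRUSTED_HOST"],
    ))
```

The sketch leaves out how the copied prefix is made importable in the final image (for instance through `PYTHONPATH`), which is part of the refactoring effort mentioned above. Note also that the values would still be visible in the history of the intermediate stage on the build machine.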

Alternative options

  1. Create a custom BuildPack that would parse relevant environment variables from the environment where repo2docker is called (for example all PIP_ variables for the PythonBuildPack)
  2. Related to the above, change the current behaviour of the corresponding BuildPack classes to do that
  3. Add a different parameter that triggers the parsing of the environment where the command is called from.
  4. Add an extra_build_env counterpart to extra_build_args. The get_build_env from the BuildPack class could then parse the content of that argument.
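As an illustration of the environment parsing that options 1 to 3 rely on, here is a minimal sketch that collects every variable with a given prefix from the calling environment. The helper name `collect_prefixed_env` is hypothetical, and returning a sorted list of (name, value) pairs is an assumption made to keep the result deterministic.

```python
import os


def collect_prefixed_env(prefix, environ=None):
    """Return sorted (name, value) pairs for variables starting with ``prefix``.

    Hypothetical helper: ``environ`` defaults to the environment of the
    process calling repo2docker.
    """
    environ = os.environ if environ is None else environ
    return sorted(
        (name, value)
        for name, value in environ.items()
        if name.startswith(prefix)
    )
```

For example, `collect_prefixed_env("PIP_")` would pick up PIP_TRUSTED_HOST and PIP_EXTRA_INDEX_URL if they are set, along with any other PIP_ variable the caller happens to have defined, which is one reason why an explicit argument may be easier to reason about.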

AFAIK, number 1 is currently not yet possible based on the #487 discussion (one relevant comment).

Number 2 would be too invasive with regard to backward compatibility. I would also venture that having explicit CLI arguments is better to ensure that people can easily understand what is going on.

Number 3 does not feel right.

Number 4 would essentially be a renamed and stripped-down version of the original proposal.

Who would use this feature?

People who need to access package repositories other than PyPI.

How much effort will adding it take?

Depending on the agreed upon solution:

  • Without taking the privacy of the environment variable content into consideration, the work would likely be entry level
  • Moving to a multistage build would require more skills and knowledge, which would put the task between intermediate and hard.

Who can do this work?

  • Python (environment parsing, file parsing)
  • Dockerfile structure
  • Templating with jinja2
  • pip