Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for GITHUB_PAT handling with private repo dependencies #40

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

yogat3ch
Copy link

Hi @VincentGuyader & @ColinFay,
This PR will eventually fix #18.
It currently only modifies dock_from_desc to add handling of the GITHUB_PAT as a build-arg to enable fetching of private Github repos during docker build.
A comment is included in the Dockerfile reminding the user to use the --build-arg GITHUB_PAT=[github PAT] flag when running docker build.
There is an info message also indicating this fact which also informs the user that using this method causes the GITHUB_PAT to be exposed in the image metadata and thus the image must be kept private if uploaded to Docker Hub.

I would appreciate feedback on this approach thus far before I implement a similar method for handling private repos on the dock_from_renv function.

Can y'all let me know if this is satisfactory?

@yogat3ch
Copy link
Author

yogat3ch commented Jul 19, 2022

Just added a couple more features:

  • Added support for specifying a sha256 hash of the rocker version to control the architecture of the image. Related to golem#885
  • Added support for excluding the dependencies specified in Suggests in docker_from_desc for smaller docker images.
  • Alphabetizes the dependencies so it's easier to locate dependencies in the Dockerfile

@michkam89
Copy link

Hello there, I'm just curious if this is going to be merged? I'm thinking about working on #43 and I could use this code :)

@yogat3ch
Copy link
Author

@michkam89 , tagging @VincentGuyader here to get some eyes on this

Copy link
Member

@statnmap statnmap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the proposition. It is true that locally we never faced GitHub restrictions on our tests.
However, I think you proposition could be simplified if you specify an global environment variable in your Docker build. {remotes} knows where to find it while it is specified.
I did not tested it directly, but a way to look at it is probably how to deal with ARG

@@ -36,7 +36,9 @@ base_pkg_ <- c(
#' @param FROM The FROM of the Dockerfile. Default is
#' FROM rocker/r-ver:`R.Version()$major`.`R.Version()$minor`.
#' @param AS The AS of the Dockerfile. Default it NULL.
#' @param sha256 character. The Digest SHA256 hash corresponding to the chip architecture of the deployment host machine if different than the machine on which the image will be built.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate on this parameter sha256 ?
How is this different from direct use of:

dock_from_desc(
  FROM = rocker/rver@sha256:xxxxxxxxx
)

If this is really needed, could you add information on how to create it and why in a "Details" section ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @statnmap, this is no different from the direct use as you've indicated. It's provided as an argument to prompt a user that might not know they need to specify the build architecture when the host machine is different from the build machine. I've added better documentation in a7c3658 and added the argument to docker_from_renv for parity across the two functions.

function(dock, ver, nm, i) {
fmt <- "Rscript -e 'remotes::install_github(\"%s\")'"
if (i)
fmt <- paste("GITHUB_PAT=$GITHUB_PAT", fmt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we set GITHUB_PATas a global environment variable instead when building the Docker using the ARG parameter ?
You could set a ARG GITHUB_PAT=default that would be modified with build-arg.
See for instance: https://stackoverflow.com/questions/39597925/how-do-i-set-environment-variables-during-the-build-in-docker

Indeed, this would not prevent us to configure {dockerfiler} for other sources than GitHub.
At some point, we will use {attachment} to deal with other repositories sources (ThinkR-open/attachment#56)

Copy link
Author

@yogat3ch yogat3ch Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @statnmap,

You could set a ARG GITHUB_PAT=default that would be modified with build-arg.
See for instance: https://stackoverflow.com/questions/39597925/how-do-i-set-environment-variables-during-the-build-in-docker

That actually doesn't work. The GITHUB_PAT Environment variable has to be set each time R is invoked, otherwise it's not available to the R session when installing from private repositories.
The drawback of this method (used in this PR) is it actually exposes the GITHUB_PAT in the build log. If folks are using docker.io, most images are defaulted to public visibility after upload, thus exposing the GITHUB_PAT after upload to docker.io.

I've been thinking through how best to ensure that the installation of remote private repos goes smoothly and the best way that I can think of seems to be to copy an .Renviron file with the GITHUB_PAT variable set in it to the image while the installation of packages takes place.

This could be accomplished by adding an argument that allows the user to copy the .Renviron to the image. If they opt to copy the .Renviron then the GITHUB_PAT variable is just appended. If they do not opt to copy the .Renviron file, then the .Renviron file with just the GITHUB_PAT can be removed via RUN rm .Renviron at the end of the package installation process in the Dockerfile.

What are your thoughts on this method of copying the .Renviron with the GITHUB_PAT to the image either permanently or temporarily based on user preference?

@yogat3ch
Copy link
Author

Hey @statnmap, thanks for the review here! I haven't had a chance to implement the changes yet but I'll have some time off over the holidays where I can hopefully get to it!

@yogat3ch
Copy link
Author

Hey @statnmap , I had an opportunity to document the sha256 parameter in the preceding commits. Is there anything else?

@yogat3ch
Copy link
Author

yogat3ch commented Mar 1, 2023

Hey @statnmap,
I think a safer way to do this is to copy an .Renviron file to the Docker image temporarily with the GITHUB_PAT set therein, and then delete it after renv::restore is run in the build process with RUN rm .Renviron cmd in the Dockerfile.
This avoids exposing the GITHUB_PAT` in the run log when the image is uploaded to Docker Hub.
Should I implement this instead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add_dockerfile() should use GITHUB_PAT when calling remotes::install_github
4 participants