Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for GITHUB_PAT handling with private repo dependencies #40

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
Open
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
# Generated by roxygen2: do not edit by hand

export("%|0|%")
export(Dockerfile)
export(dock_from_desc)
export(dock_from_renv)
Expand Down
77 changes: 67 additions & 10 deletions R/dock_from_desc.R
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,9 @@ base_pkg_ <- c(
#' @param FROM The FROM of the Dockerfile. Default is
#' FROM rocker/r-ver:`R.Version()$major`.`R.Version()$minor`.
#' @param AS The AS of the Dockerfile. Default it NULL.
#' @param sha256 character. The Digest SHA256 hash corresponding to the chip architecture of the deployment host machine if different than the machine on which the image will be built.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate on this parameter sha256 ?
How is this different from direct use of:

dock_from_desc(
  FROM = rocker/rver@sha256:xxxxxxxxx
)

If this is really needed, could you add information on how to create it and why in a "Details" section ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @statnmap, this is no different from the direct use as you've indicated. It's provided as an argument to prompt a user that might not know they need to specify the build architecture when the host machine is different from the build machine. I've added better documentation in a7c3658 and added the argument to docker_from_renv for parity across the two functions.

#' @param sysreqs boolean. If TRUE, the Dockerfile will contain sysreq installation.
#' @param use_suggests boolean. If TRUE (the default), include dependencies listed in Suggests field in DESCRIPTION.
#' @param repos character. The URL(s) of the repositories to use for `options("repos")`.
#' @param expand boolean. If `TRUE` each system requirement will have its own `RUN` line.
#' @param build_from_source boolean. If `TRUE` no tar.gz is created and
Expand Down Expand Up @@ -64,7 +66,9 @@ dock_from_desc <- function(
".",
R.Version()$minor
),
sha256 = NULL,
AS = NULL,
use_suggests = TRUE,
sysreqs = TRUE,
repos = c(CRAN = "https://cran.rstudio.com/"),
expand = FALSE,
Expand All @@ -74,7 +78,10 @@ dock_from_desc <- function(
) {
path <- fs::path_abs(path)

packages <- desc_get_deps(path)$package
packages <- desc_get_deps(path)
if (!use_suggests)
packages <- packages[packages$type != "Suggests",]
packages <- packages$package
packages <- packages[packages != "R"] # remove R
packages <- packages[!packages %in% base_pkg_] # remove base and recommended

Expand Down Expand Up @@ -127,10 +134,10 @@ dock_from_desc <- function(
packages
)

packages_not_on_cran <- setdiff(
packages_not_on_cran <- sort(setdiff(
packages,
packages_on_cran
)
))

packages_with_version <- data.frame(
package = remotes_deps$package,
Expand All @@ -146,6 +153,11 @@ dock_from_desc <- function(
packages_with_version$package
)

packages_on_cran <- packages_on_cran[order(names(packages_on_cran))]
# Add SHA for Architecture
if (!is.null(sha256))
FROM <- paste0(FROM, "@sha256:", sha256)

dock <- Dockerfile$new(
FROM = FROM,
AS = AS
Expand Down Expand Up @@ -206,7 +218,7 @@ dock_from_desc <- function(
}

if (length(packages_not_on_cran > 0)) {
nn <- as.data.frame(
nn_df <- as.data.frame(
do.call(
rbind,
lapply(
Expand All @@ -218,26 +230,46 @@ dock_from_desc <- function(
)
)

nn <- sprintf(
"%s/%s",
nn_df$username,
nn_df$repo
)

repo_status <- lapply(nn, repo_get)
ind_private <- sapply(repo_status, function(x) x$visibility == "private") %|0|% FALSE
if (any(ind_private)) {
dock$ARG("GITHUB_PAT")
dock$RUN("GITHUB_PAT=$GITHUB_PAT")
}


nn <- sprintf(
"%s/%s@%s",
nn$username,
nn$repo,
nn$sha
nn_df$username,
nn_df$repo,
nn_df$sha
)


pong <- mapply(
function(dock, ver, nm) {
function(dock, ver, nm, i) {
fmt <- "Rscript -e 'remotes::install_github(\"%s\")'"
if (i)
fmt <- paste("GITHUB_PAT=$GITHUB_PAT", fmt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we set GITHUB_PATas a global environment variable instead when building the Docker using the ARG parameter ?
You could set a ARG GITHUB_PAT=default that would be modified with build-arg.
See for instance: https://stackoverflow.com/questions/39597925/how-do-i-set-environment-variables-during-the-build-in-docker

Indeed, this would not prevent us to configure {dockerfiler} for other sources than GitHub.
At some point, we will use {attachment} to deal with other repositories sources (ThinkR-open/attachment#56)

Copy link
Author

@yogat3ch yogat3ch Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @statnmap,

You could set a ARG GITHUB_PAT=default that would be modified with build-arg.
See for instance: https://stackoverflow.com/questions/39597925/how-do-i-set-environment-variables-during-the-build-in-docker

That actually doesn't work. The GITHUB_PAT Environment variable has to be set each time R is invoked, otherwise it's not available to the R session when installing from private repositories.
The drawback of this method (used in this PR) is it actually exposes the GITHUB_PAT in the build log. If folks are using docker.io, most images are defaulted to public visibility after upload, thus exposing the GITHUB_PAT after upload to docker.io.

I've been thinking through how best to ensure that the installation of remote private repos goes smoothly and the best way that I can think of seems to be to copy an .Renviron file with the GITHUB_PAT variable set in it to the image while the installation of packages takes place.

This could be accomplished by adding an argument that allows the user to copy the .Renviron to the image. If they opt to copy the .Renviron then the GITHUB_PAT variable is just appended. If they do not opt to copy the .Renviron file, then the .Renviron file with just the GITHUB_PAT can be removed via RUN rm .Renviron at the end of the package installation process in the Dockerfile.

What are your thoughts on this method of copying the .Renviron with the GITHUB_PAT to the image either permanently or temporarily based on user preference?

res <- dock$RUN(
sprintf(
"Rscript -e 'remotes::install_github(\"%s\")'",
fmt,
ver
)
)
},
ver = nn,
i = ind_private,
MoreArgs = list(dock = dock)
)
cat_red_bullet(glue::glue("Must add `--build-arg GITHUB_PAT={remotes:::github_pat()}` to `docker build` call. Note that the GITHUB_PAT will be visible in this image metadata. If uploaded to Docker Hub, the visibility must be set to private to avoid exposing the GITHUB_PAT."))
dock$custom("#", "Must add `--build-arg GITHUB_PAT=[YOUR GITHUB PAT]` to `docker build` call")
}

if (!build_from_source) {
Expand Down Expand Up @@ -303,7 +335,10 @@ dock_from_desc <- function(
dock$RUN("mkdir /build_zone")
dock$ADD(from = ".", to = "/build_zone")
dock$WORKDIR("/build_zone")
dock$RUN("R -e 'remotes::install_local(upgrade=\"never\")'")
run <- "R -e 'remotes::install_local(upgrade=\"never\")'"
if (any(get0("ind_private", inherits = FALSE)))
run <- paste("GITHUB_PAT=$GITHUB_PAT", run)
dock$RUN(run)
dock$RUN("rm -rf /build_zone")
}
# Add a dockerignore
Expand Down Expand Up @@ -331,3 +366,25 @@ repos_as_character <- function(repos) {

repos_as_character
}

#' @noRd
repo_get <- function(repo) {
jsonlite::fromJSON(suppressMessages(system(glue::glue("curl -H \"Accept: application/vnd.github+json\" -H \"Authorization: token {remotes:::github_pat()}\" https://api.github.com/repos/{repo}"), intern = TRUE)))
}

#' Replace zero-length values in a vector
#' @rdname op-zero-default
#' @param x \code{vctr}
#' @param y \code{object} to replace zero-length values. Must be of same class as x
#'
#' @return \code{vctr}
#' @export
#'
#' @examples
#' c(TRUE, FALSE, logical(0)) %|0|% FALSE
`%|0|%` <- Vectorize(function(x, y) {
if (rlang::is_empty(x))
y
else
x
})
6 changes: 6 additions & 0 deletions man/dockerfiles.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

22 changes: 22 additions & 0 deletions man/op-zero-default.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.