Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flawed logic in add_dockerfile() #17

Open
ColinFay opened this issue Sep 24, 2021 · 1 comment
Open

Flawed logic in add_dockerfile() #17

ColinFay opened this issue Sep 24, 2021 · 1 comment

Comments

@ColinFay
Copy link
Member

I am working on a PR touching the add_dockerfile_XXX functions. They all depend on the internal dock_from_desc function which needed a good clean up. There is a big inconsistency between CRAN and non-CRAN packages.

We have some "direct dependencies", i.e. the packages listed in DESCRIPTION. Some may be from CRAN and some from other services. We also have "extended dependencies" which are the direct dependencies of the direct dependencies.

Currently, add_dockerfile will install all direct dependencies and no indirect dependencies.

Unless... we happen to have a non-CRAN direct dependency, in which case all non-CRAN indirect dependencies will be added to the Dockerfile.

To reproduce the bug:

  • create a new golem project
  • remove all dependencies in the description
  • pick a package you've installed from CRAN and add it in Imports -- I used ggplot2
  • install a dependency of ggplot2 from github -- for example remotes::install_github("r-lib/prettyunits")
  • call golem::add_dockerfile

The dockerfile only contains an install of ggplot2, with no mention of prettyunits or any other dependencies.

RUN R -e 'install.packages("remotes")'
RUN Rscript -e 'remotes::install_version("ggplot2",upgrade="never", version = "3.3.2")'

Now, add another package installed from github as an import. Let's say we install golem with remotes::install_github("ThinkR-open/golem")

Your imports look like:

Imports: 
  ggplot2,
  golem

Generate a new dockerfile with golem::add_dockerfile() and this times it contains:

RUN R -e 'install.packages("remotes")'
RUN Rscript -e 'remotes::install_version("ggplot2",upgrade="never", version = "3.3.2")'
RUN Rscript -e 'remotes::install_github("ThinkR-open/golem@aaae5c8788802a7b4aef4df23691902a286dd964")'
RUN Rscript -e 'remotes::install_github("r-lib/rlang@4e2bbe18695d9fd17951e9d7ae92535eb095ff6d")'
RUN Rscript -e 'remotes::install_github("r-lib/prettyunits@b1cdad88b900cabb1622a1dfd2c7e6361c2a66f2")'
RUN Rscript -e 'remotes::install_github("r-lib/lifecycle@355dcba8530bcce57c15847165ca568f9f81b43e")'
RUN Rscript -e 'remotes::install_github("antoine-sachet/usethis@02e1e76519ec0415dfa8b1ac34ca881104be8c22")'
# The number of lines you get will depend on how many packages you have installed from github

That's obviously a bug, but it got me thinking: is it a good idea to set in stone the indirect dependencies?

Instead of fixing the bug and removing all indirect dependencies, maybe we should actually add all indrect dependencies, CRAN or not?

Pros:

  • More robust. If an indirect dependency is updated between 2 deployments and something breaks, the app won't be impacted as it will use a known-good version.

Cons:

  • I don't actually want to use all those dependencies from github, I'd rather use the CRAN version when I deploy!
  • the dockerfile may grow out of control, or at least be much bigger than it used to be
  • it is more intuitive for the end-user to have a one-to-one mapping between DESCRIPTION and Dockerfile

Note both options are a potentially breaking change.

@ColinFay @VincentGuyader Any thoughts? Apologies for the long message!

Migrated from ThinkR-open/golem#533

@ColinFay
Copy link
Member Author

Original issue from @antoine-sachet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant