Installing binaries when collaborating across multiple operating systems #1052

Open
noamross opened this issue Jul 26, 2022 · 18 comments
Labels
feature a feature request or enhancement repositories 📚 update ⬆️

@noamross

When working on an renv-enabled project whose collaborators or machines may run Linux or macOS, one can generally get binaries to install on Linux or on macOS, but not both. This is because CRAN hosts macOS binaries and RSPM hosts Linux binaries, but in renv.lock each package is assigned a single repository, even if the package source hash would be identical across repositories. I assume similar issues arise with Windows. I would like a configuration in which the binary version of a package, hosted by either repository, is preferred for installation on any OS.

In my setup, collaborators may be on Linux or macOS machines, and we use/commit project-level .Rprofile files as well as renv.lock files to keep our environments in sync. The renv-relevant part of the .Rprofile is this:

options(
  repos = c(RSPM = "https://packagemanager.rstudio.com/all/latest",
            CRAN = "https://cran.rstudio.com/"),
  renv.config.auto.snapshot = TRUE,
  renv.config.rspm.enabled = TRUE,
  renv.config.cache.enabled = TRUE
)

source("renv/activate.R")

And so we have renv.lock files that look like this, but the repository source of a package (CRAN/RSPM) generally depends on whether the user who committed the change is on Linux or macOS:

{
  "R": {
    "Version": "4.2.1",
    "Repositories": [
      {
        "Name": "RSPM",
        "URL": "https://packagemanager.rstudio.com/all/latest"
      },
      {
        "Name": "CRAN",
        "URL": "https://cran.rstudio.com"
      }
    ]
  },
  "Packages": {
    "BH": {
      "Package": "BH",
      "Version": "1.78.0-0",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "4e348572ffcaa2fb1e610e7a941f6f3a",
      "Requirements": []
    },
...
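
To see how mixed a lockfile has become, one could tally the Repository field across its packages. The following is only a minimal sketch and not part of renv: the helper name count_repositories is made up for illustration, and it assumes the package entries have the structure shown above.

```r
# Hypothetical helper (not an renv function): count how many packages a
# lockfile attributes to each repository name.
count_repositories <- function(packages) {
  repos <- vapply(packages, function(pkg) {
    # Packages from other sources (e.g. GitHub) may have no Repository field
    if (is.null(pkg$Repository)) NA_character_ else pkg$Repository
  }, character(1))
  table(repos, useNA = "ifany")
}

# Example with a hand-written stand-in for the "Packages" section of a lockfile
packages <- list(
  BH    = list(Package = "BH", Version = "1.78.0-0", Repository = "CRAN"),
  dplyr = list(Package = "dplyr", Repository = "RSPM"),
  readr = list(Package = "readr", Repository = "RSPM")
)
print(count_repositories(packages))
```

In a real project the list could presumably come from renv::lockfile_read("renv.lock")$Packages.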

I note that I would expect renv.lock to override repository preferences in .Rprofile when running renv::restore(), and the repositories set in .Rprofile to determine the source of new packages installed into the library. Perhaps packages in renv.lock could have multiple Repository values, and there could be an renv.prefer.binary.repository = TRUE option.

(Migrating from this post on the RStudio forum: https://community.rstudio.com/t/renv-and-rspm-for-binary-installs-on-shared-linux-mac-project/143054/2 by request of @kevinushey)

@kevinushey
Collaborator

Thanks for the bug report!

In the interim, a workaround would be something like:

if (Sys.info()[["sysname"]] == "Darwin")
  options(renv.config.repos.override = <...>)

In other words, you could use the repos override option to change the repositories used during restore / install by renv, in a way that won't be considered on a future snapshot.

@mungojam

mungojam commented Dec 7, 2022

The hash is the real blocker here, I believe. Pipfile.lock and the Terraform lock file both allow multiple hashes for a single package version, so you can have all the OS hashes listed there and available to be restored.

The renv equivalent hash property doesn't appear to be an array and is just a single hash value. Or does it support being an array under the covers?

@kevinushey
Collaborator

renv computes the hash based on (a mildly transformed version of) the package's DESCRIPTION file, and so the hash should be the same across platforms.
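
As a rough illustration of why a DESCRIPTION-based hash can be platform-independent, here is a sketch in base R. It is not renv's actual hashing code: the function name hash_description, the chosen fields, and the whitespace normalisation are all assumptions made for this example. The point is only that hashing a fixed subset of fields ignores things like the Built field, which differs between platforms.

```r
# Illustrative only: NOT renv's algorithm. Hash a fixed subset of
# DESCRIPTION fields so platform-specific fields (e.g. Built) are ignored.
hash_description <- function(path,
                             fields = c("Package", "Version", "Depends",
                                        "Imports", "LinkingTo")) {
  dcf <- read.dcf(path, all = TRUE)
  keep <- intersect(fields, names(dcf))
  # Collapse runs of whitespace so line wrapping does not change the hash
  values <- vapply(keep, function(f) {
    gsub("\\s+", " ", trimws(as.character(dcf[[f]][1])))
  }, character(1))
  tmp <- tempfile()
  on.exit(unlink(tmp))
  # Write raw bytes with "\n" separators to avoid OS-specific line endings
  writeBin(charToRaw(paste(paste0(keep, ": ", values), collapse = "\n")), tmp)
  unname(tools::md5sum(tmp))
}

# Usage: hash the DESCRIPTION of an installed base package
hash_description(system.file("DESCRIPTION", package = "stats"))
```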

@mungojam

mungojam commented Dec 8, 2022

> renv computes the hash based on (a mildly transformed version of) the package's DESCRIPTION file, and so the hash should be the same across platforms.

Ok that's good news for us now. That doesn't offer any security against an altered package unless the attacker has also altered the DESCRIPTION file, right? Sorry, off topic.

Back on topic, we're currently struggling to get the workaround to work and are still receiving source packages on Ubuntu. I'll get my colleague to share details later.

@noamross
Author

noamross commented Dec 8, 2022

Putting this in the shared project .Rprofile works for us:

options(
  repos = c(RSPM = "https://packagemanager.rstudio.com/all/latest",
            CRAN = "https://cran.rstudio.com/"),
  renv.config.auto.snapshot = TRUE
)

# Since RSPM does not provide Mac binaries, always install packages from CRAN
# on mac or windows, even if renv.lock specifies they came from RSPM
if (Sys.info()[["sysname"]] %in% c("Darwin", "Windows")) {
  options(renv.config.repos.override = c(
    CRAN = "https://cran.rstudio.com/",
    INLA = "https://inla.r-inla-download.org/R/testing"))
} else if (Sys.info()[["sysname"]] == "Linux") {
  options(renv.config.repos.override = c(
    RSPM = "https://packagemanager.rstudio.com/all/latest",
    INLA = "https://inla.r-inla-download.org/R/testing"))
}

@rahultoora

Thanks for sharing, @noamross. However, implementing the above still results in receiving packages from source on Linux. Were you able to get installs from binaries on Linux?

@kevinushey
Collaborator

> Ok that's good news for us now. That doesn't offer any security against an altered package unless the attacker has also altered the DESCRIPTION file, right? Sorry, off topic.

That's right. In general, renv assumes that the package repositories set in the R session are trusted, and so the packages retrieved from those repositories are also similarly trusted.

> However implementing the above still results in receiving packages from source for Linux.

What version of renv are you using? Normally (if RSPM integration is enabled) renv should automatically translate source RSPM URLs into binary RSPM URLs for you. See https://rstudio.github.io/renv/reference/config.html#renv-config-rspm-enabled for more details. That would happen e.g. here:

renv/R/retrieve.R

Lines 17 to 21 in 66f339c

# transform repository URLs for RSPM
if (renv_rspm_enabled()) {
  repos <- getOption("repos")
  renv_scope_options(repos = renv_rspm_transform(repos))
}

If you're still having trouble, then a similar approach should suffice: check the operating system and set the binary repository URL explicitly, based on the contents of the /etc/os-release file (if it exists).
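
One way this suggestion might look in an .Rprofile is sketched below. This is an assumption-laden example rather than anything renv provides: read_os_codename is a hypothetical helper, it relies on the VERSION_CODENAME field (present on Ubuntu and Debian, but not on every distribution), and it assumes the codename matches the distribution name the package manager uses in its binary URLs, as it does for "bionic" elsewhere in this thread.

```r
# Hypothetical helper: read VERSION_CODENAME from an os-release style file.
# Returns NULL when the file or the field is missing.
read_os_codename <- function(path = "/etc/os-release") {
  if (!file.exists(path)) return(NULL)
  lines <- readLines(path, warn = FALSE)
  hit <- grep("^VERSION_CODENAME=", lines, value = TRUE)
  if (length(hit) == 0L) return(NULL)
  # Drop the key and any surrounding quotes
  gsub("^VERSION_CODENAME=|[\"']", "", hit[[1]])
}

codename <- if (Sys.info()[["sysname"]] == "Linux") read_os_codename() else NULL

if (!is.null(codename)) {
  # URL pattern copied from the Bionic example in this thread
  options(renv.config.repos.override = c(
    RSPM = sprintf(
      "https://packagemanager.rstudio.com/cran/__linux__/%s/latest",
      codename
    )
  ))
}
```

On machines without a usable codename the override is simply left unset, so the repositories recorded in the lockfile apply as usual.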

@rahultoora

rahultoora commented Dec 12, 2022

@kevinushey just FYI @mungojam and I are colleagues.

To answer your questions:

  • renv version = 0.16.0
  • RSPM integration is enabled by default, so I have not explicitly configured it
  • OSes I am working with: Windows and Linux (Ubuntu 18.04 (Bionic))

This is my .Rprofile; I'm wondering if I'm missing something:

source("renv/activate.R")

options(
  repos = c(RSPM = "https://packagemanager.rstudio.com/cran/__linux__/bionic/latest",
            CRAN = "https://cran.rstudio.com/"),
  renv.config.auto.snapshot = TRUE
)


if (Sys.info()[["sysname"]] == "Windows") {
  options(renv.config.repos.override = c(
    CRAN = "https://cran.rstudio.com/"))
} else if (Sys.info()[["sysname"]] == "Linux") {
  options(renv.config.repos.override = c(
    RSPM = "https://packagemanager.rstudio.com/cran/__linux__/bionic/latest"))
}

Version of R = 4.1.2 (cannot use a more recent version due to project requirements)

Important to note I am only installing standard packages e.g: dplyr, readr, data.table, dependencies for R notebooks etc.

@kevinushey
Collaborator

That looks correct to me, assuming all Linux machines are running Ubuntu Bionic. Are you still seeing issues even with this set? If so, can you clarify what you're seeing when you run renv::restore()?

@rahultoora

rahultoora commented Dec 12, 2022

Hi @kevinushey

For reference, here is a simple repo I have created for this example, utilising only the dplyr and readr packages and the associated NB packages.

When pulling the repo, opening the project on a fresh Linux (Ubuntu 18.04 (Bionic)) instance, and running renv::restore(), I see the following:

[Screenshots of the renv::restore() output: renv_restore1, image]

It is downloading the source files, no?

Is it building the binaries and caching them?

[Screenshot]

Note: I have created the RSPM repo URL from: Posit Package Manager

@kevinushey
Collaborator

kevinushey commented Dec 12, 2022

On Linux, RSPM serves binary packages "disguised" as source packages (when using a binary repository URL, e.g. one that includes __linux__), so even though the package URLs look like they're coming from a source repository, they are really binaries (and the output from renv confirms that).
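
To make the difference between a source URL and a binary URL concrete, the sketch below rewrites one into the other by inserting the __linux__/<distribution> path segment. It is not renv's implementation of the automatic transformation mentioned earlier; to_linux_binary_url is a hypothetical name, and the assumption that the segment goes just before the final path component is taken from the two URL shapes that appear in this thread.

```r
# Hypothetical helper: turn ".../cran/latest" into
# ".../cran/__linux__/<codename>/latest". URLs that already contain
# "/__linux__/" are returned unchanged.
to_linux_binary_url <- function(url, codename) {
  if (grepl("/__linux__/", url, fixed = TRUE)) return(url)
  # Insert the binary segment before the last path component,
  # dropping a trailing slash if there is one
  sub("/([^/]+)/?$", paste0("/__linux__/", codename, "/\\1"), url)
}

to_linux_binary_url("https://packagemanager.rstudio.com/cran/latest", "bionic")
# → "https://packagemanager.rstudio.com/cran/__linux__/bionic/latest"
```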

@rahultoora
Copy link

Okay thanks for clearing that up!

@black-snow

Could anyone post an example for what to put into .Rprofile (or some other place) to get binary-first installs on all platforms? Installation takes ages for me and burns my CPU, and options(pkgType = "both") seems to err out on Linux.

@lorenzwalthert

lorenzwalthert commented Aug 1, 2023

I think since Posit Package Manager now serves binaries for all three major platforms, the problem is almost irrelevant for most people. At least for my use case (package {precommit}), binary installs work on all platforms out of the box when I specify PPM as my first CRAN repo.

@kevinushey
Collaborator

> Could anyone post an example for what to put into .Rprofile (or some other place) to get binary-first installs on all platforms? Installation takes ages for me and burns my CPU, and options(pkgType = "both") seems to err out on Linux.

If you're using renv, it should suffice to just do something like:

options(repos = c(PPM = "https://packagemanager.posit.co/cran/latest"))

renv will automatically transform that to the appropriate binary repository URL when one is available for the current platform. If that's not working for some reason, then you can manually select a binary repository URL from https://packagemanager.posit.co/client/#/repos/2/overview.

> I think since Posit package manager now serves binaries for all three major platforms, the problem is almost irrelevant for most people.

The big missing piece is macOS binaries on arm64; once those are ready we can confidently use PPM by default in all cases.

@harrismcgehee

harrismcgehee commented May 7, 2024

We are adding Posit Package Manager to our stack.

What is your recommendation for when the renv.lock repos are "out-of-date" and don't match the enterprise configuration?

renv::restore(repos = getOption("repos")) isn't working for us because the old repos are activated by the .Rprofile.

Thanks.

@kevinushey
Collaborator

If you need to only update the repositories used in the lockfile, you can use:

renv::lockfile_modify(repos = <...>)

(or, alternatively, edit the lockfile by hand to update the repository URLs)

Note also that renv supports automatic transformation of PPM source repository URLs into binary repository URLs. If you plan to only ever use binary repository URLs for a specific platform, you might want to disable that bit of renv integration. Please see https://rstudio.github.io/renv/reference/config.html#renv-config-ppm-enabled for more details.
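
For a project that pins a platform-specific binary URL, the combination described above might look like the fragment below. This is a sketch under assumptions: the option name renv.config.ppm.enabled is inferred from the linked documentation anchor and from the renv.config.rspm.enabled option used earlier in this thread, and the Bionic URL is reused from that earlier example rather than being a recommendation.

```r
# In the project .Rprofile: turn off renv's automatic rewriting of
# PPM source URLs into binary URLs (option name inferred, see above)
options(renv.config.ppm.enabled = FALSE)

# ...and point directly at one platform-specific binary repository,
# here the Ubuntu Bionic URL that appears earlier in this thread
options(repos = c(
  RSPM = "https://packagemanager.rstudio.com/cran/__linux__/bionic/latest"
))
```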

@harrismcgehee

Fabulous. I hadn't seen that function released.

I think this is what I'll have users run.


# Read the repository configuration file
repos_conf <- strsplit(readLines("/etc/rstudio/repos.conf"), "=")

# Create named repo vector for CRAN and Internal repo
repos_vec <- setNames(c(repos_conf[[1]][2],
                        repos_conf[[2]][2]),
                      c(repos_conf[[1]][1],
                        repos_conf[[2]][1]))

# Avoiding pipes because we still support R 4.0
renv::lockfile_write(
    renv::lockfile_modify(
        lockfile = renv::lockfile_read("renv.lock"),
        repos = repos_vec),
    "renv.lock")
renv::restore()
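
The snippet above assumes the file holds exactly two name=URL lines. If that might not hold, a slightly more defensive parser could look like the sketch below. parse_repos_conf is a hypothetical helper, and the handling of blank lines, lines starting with "#", and whitespace around "=" is an assumption about the file rather than documented behaviour. It splits on the first "=" only, so URLs containing "=" survive intact.

```r
# Hypothetical helper: turn "NAME=URL" lines into a named character vector.
parse_repos_conf <- function(lines) {
  lines <- trimws(lines)
  # Assumed conventions: ignore blank lines and "#" comment lines
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  eq <- regexpr("=", lines, fixed = TRUE)
  keep <- eq > 0
  lines <- lines[keep]
  eq <- eq[keep]
  # Everything before the first "=" is the name, everything after is the URL
  setNames(
    trimws(substring(lines, eq + 1L)),
    trimws(substring(lines, 1L, eq - 1L))
  )
}

# Example input with placeholder URLs
parse_repos_conf(c(
  "CRAN=https://example.org/cran/latest?x=1",
  "",
  "Internal = https://example.org/internal/latest"
))
```

The result could then be passed as repos_vec in the snippet above, e.g. parse_repos_conf(readLines("/etc/rstudio/repos.conf")).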
