Recommended set-up for r-spatial packages on Linux #35

Open
Robinlovelace opened this issue Mar 1, 2020 · 15 comments

@Robinlovelace
Contributor

Robinlovelace commented Mar 1, 2020

Some people have asked me which repositories, and even which Linux distributions, are recommended as a computing foundation for packages such as sf. I am a long-time Ubuntu user and have generally found that the ubuntugis-unstable repository provides a reasonable balance between being up-to-date and not getting too close to the bleeding edge (currently GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1). Based on this positive experience, and the fact that QGIS seems to work relatively well with ubuntugis versions, I generally advise that set-up, e.g. as described in this lucid tutorial: https://rtask.thinkr.fr/installation-of-r-3-5-on-ubuntu-18-04-lts-and-tips-for-spatial-packages/

However, I'm wondering if that advice should be re-considered in the light of various changes:

  • to the GDAL and PROJ libraries on which sf depends, e.g. as discussed here (completed updates for GDAL 2.5 and PROJ 6 are mentioned here)
  • the coming release of R 4.0.0, which will make that tutorial out-of-date
  • the coming release of Ubuntu 20.04 LTS, which will also make the tutorial out-of-date and potentially mean that ubuntugis-unstable is no longer needed, as per the sf README
  • the fact that Ubuntu is not the only popular Linux distro

In anticipation of more people asking and based on the assumption that other people are wondering, I thought it worth asking: what would r-spatial developers and users recommend?

Specifically:

  1. Which distributions are easiest for Linux beginners to set up and maintain?
  2. What are the +s and -s of using the default versions of libraries vs repositories such as UbuntuGIS?
  3. Likewise, what are the +s and -s of building dependencies such as GDAL from source?
  4. What about docker images, e.g. the set-up in the rocker/geospatial? (That builds on the rocker/versioned Dockerfile which is based on Debian Buster and uses the default versions of upstream libraries --- currently GEOS 3.7.1, GDAL 2.4.0, PROJ 5.2.0.)
  5. Any other considerations when choosing/recommending Linux set-ups for r-spatial?
@cboettig

cboettig commented Mar 2, 2020

Thanks @Robinlovelace! Just a heads-up that on the rocker-versioned stack we will likely be moving to Ubuntu-based images starting with R 4.0.0 (and thus probably starting with Ubuntu 20.04 LTS, possibly with opt-in support for building on 18.04 too, since it will be around for a while!). Our default will be the upstream Ubuntu LTS libraries for GEOS, GDAL, PROJ, etc., but we are also testing an approach that should make it easier to generate build matrices with different versions of most key components on demand.

More on this to come, but knowing what is useful from the geospatial community would be great.

We're also hoping this modular approach will make it easier to re-stack the images (say, rocker/geospatial sans rstudio etc.). In addition, the geospatial install script should be reasonably portable within the confines of the same Ubuntu LTS version: given that most CI platforms frequently offer Ubuntu base images but not Debian ones, we're hoping this will create more shared infrastructure.

As you know, one of our goals is to provide images that can be rebuilt consistently (bugs and all) with fixed versions of packages & libraries, which is why the rocker-versioned stack tends to avoid PPAs (particularly those pinned to rolling states; some PPAs are more static).

Anyway, apologies these aren't direct answers to your questions, but thanks for flagging this thread and we'll keep an eye on it!

(cc @noamross)

@Robinlovelace
Contributor Author

Robinlovelace commented Mar 2, 2020

Many thanks for the heads-up and detailed response @cboettig. Reading between the lines I see answers to at least 2 of the questions in there:

2. A disadvantage of repositories is that they are not stable over time, damaging reproducibility.

4. rocker/geospatial is actively maintained and should be relatively future-proof and adaptable

@gisma
Member

gisma commented Mar 3, 2020

Thanks @Robinlovelace for raising this topic, which has been driving me crazy for years as I try to find meaningful workarounds within the university ecosystem for running courses that support education with open source software. I have tried to-go R-GIS software bundles, customized Linux distributions, Linux ISO downloads and "usability" packages like link2GI; however, all variants were extremely labor-intensive and still not satisfying.

By chance we met last week with Hanna Meyer's (@HannaMeyer) working group in Münster, where one of the meeting topics focused on your question. In short, @DaChro (from Münster Geoinformatics) and I want to compile and use docker images for future courses.

For me, the question is what is the most efficient approach?
Does it make sense in this discussion to first define requirements or alternatively to learn by doing and simply develop the first containers?

Regarding this issue I am more than willing to contribute my experience and wishes. As far as docker images are concerned, however, I'm currently at a beginner's level... I'm happy about any support with this and I'm very willing to get involved as well. Regarding the pros and cons, I think it makes sense to first define which "geospatial demands" we are focusing on. At the end of the week I will have more time to provide a list of our typical needs in Marburg; maybe this could be a starting point.

@Robinlovelace
Contributor Author

Robinlovelace commented Mar 14, 2020

Just as a follow-up: I have tested installing R from scratch on the latest non-LTS version of Ubuntu (19.10) and am very happy to say that it took only ~1 minute to set up and install, including the upstream dependencies, on my new developer laptop from Linux hardware company Entroware, which has a reasonable spec (10th gen Intel CPU) 🎉

Having just tested it, it seems to work fine with the following upstream libraries in Ubuntu's default repos (I'm not sure if these will need to be updated):

library(sf)
Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0

Here are the bash commands I entered to get this set-up working, in case it's of use/interest to others (adapted from Ubuntu install page on CRAN):

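# Add the CRAN repository for Ubuntu 19.10 (eoan); sudo tee avoids switching to a root shell
echo 'deb https://cloud.r-project.org/bin/linux/ubuntu eoan-cran35/' | sudo tee -a /etc/apt/sources.list
# Import the CRAN signing key
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
sudo apt update
sudo apt install r-base-dev
# Installs sf together with its system dependencies (GDAL, GEOS, PROJ)
sudo apt install r-cran-sf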

# The following NEW packages will be installed
#   fonts-font-awesome fonts-glyphicons-halflings fonts-mathjax gdal-data javascript-common libaec0
#   libarmadillo9 libarpack2 libc-ares2 libcharls2 libdap25 libdapclient6v5 libepsilon1 libfreexl1
#   libfyba0 libgdal20 libgeos-3.7.2 libgeos-c1v5 libgeotiff2 libhdf4-0-alt libhdf5-103
# ...
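
The repository line above hard-codes the `eoan` codename, so it only fits Ubuntu 19.10. A minimal sketch of deriving the line from the running system follows. The `cran_repo_line` helper is hypothetical, and the `cran35` default suffix is an assumption that only matches the R series used in this comment; check the Ubuntu page on CRAN for the suffix that fits your R version.

```shell
#!/usr/bin/env bash
# cran_repo_line CODENAME [SUFFIX]: print the apt source line for CRAN's Ubuntu builds.
# SUFFIX defaults to cran35 (assumption: the R series used in this thread).
cran_repo_line() {
  local codename="$1"
  local suffix="${2:-cran35}"
  printf 'deb https://cloud.r-project.org/bin/linux/ubuntu %s-%s/\n' "$codename" "$suffix"
}

# Use the codename of the running system when lsb_release is available,
# otherwise fall back to the release discussed above.
if command -v lsb_release >/dev/null 2>&1; then
  codename="$(lsb_release -cs)"
else
  codename="eoan"
fi

# Print the line; review it before appending it to /etc/apt/sources.list.
cran_repo_line "$codename"
```

Printing the line instead of writing it directly keeps the sketch harmless; once it looks right it can be piped to `sudo tee -a /etc/apt/sources.list` as in the commands above.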

@Robinlovelace
Contributor Author

One question for the community: are these versions of the upstream dependencies future-proof?

Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0

@Robinlovelace
Contributor Author

Robinlovelace commented Mar 14, 2020

@gisma in answer to your question

> For me, the question is what is the most efficient approach?

I think installing R on a recent version of Ubuntu as outlined above will be the most time-efficient solution for installing R-spatial packages for many people. With Docker, rocker/geospatial should do the trick, although I suspect that installing and setting up Docker could slow that solution down if you're new to it...
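
For anyone comparing the two routes, here is a minimal sketch of checking which GEOS, GDAL and PROJ versions the rocker/geospatial image links sf against. The `:latest` tag, the `RUN_DOCKER` switch and the `versions_cmd` helper are assumptions made for illustration; `sf::sf_extSoftVersion()` is the sf function that reports the linked library versions.

```shell
#!/usr/bin/env bash
# Image name from this thread; the tag is an assumption (pin one for reproducibility).
image="rocker/geospatial:latest"

# versions_cmd IMAGE: print the docker command that reports sf's linked libraries.
versions_cmd() {
  printf "docker run --rm %s Rscript -e 'sf::sf_extSoftVersion()'\n" "$1"
}

# Show the command first so it can be reviewed or copied.
versions_cmd "$image"

# Run it only when explicitly requested and when docker is installed.
if [ "${RUN_DOCKER:-0}" = "1" ] && command -v docker >/dev/null 2>&1; then
  docker run --rm "$image" Rscript -e 'sf::sf_extSoftVersion()'
fi
```

Run with `RUN_DOCKER=1` to pull the image and print the versions; without it the script only prints the command.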

@edzer
Member

edzer commented Mar 14, 2020

Hi Robin. I don't understand what exactly you mean by "future proof". For instance, sf requires GDAL >= 2.0.1. That is the lowest version I managed to compile against (to be precise: I couldn't get 2.0.0 installed because of a bug that was fixed in 2.0.1). For any x.y.z version you should always choose the highest z, because these are bug-fix releases. The released PROJ 6.3.0 contained errors that affected R packages (sf included), and these were fixed in 6.3.1. rgdal goes back further in time because it was developed much longer ago. GDAL 2.x was never meant to be used with PROJ 6.x, but this was discovered after the fact (I believe) and caused considerable chaos on the CRAN test machines a few months ago, when debian-testing distributed that combination. I don't know what is, or will be, future proof, but we try.
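
The constraints in this comment (GDAL >= 2.0.1, and no GDAL 2.x paired with PROJ 6.x) can be checked mechanically. The sketch below parses the "Linking to" line that sf prints on load. The helper names `version_ge` and `lib_version` are hypothetical, and the comparison relies on GNU `sort -V`, which is an assumption about the system's coreutils.

```shell
#!/usr/bin/env bash
# version_ge A B: succeed when version A >= version B (relies on GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# lib_version NAME LINE: extract the version number that follows NAME in LINE.
lib_version() {
  printf '%s\n' "$2" | sed -n "s/.*$1 \([0-9][0-9.]*\).*/\1/p"
}

# The startup message reported earlier in this thread.
line='Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0'
gdal="$(lib_version GDAL "$line")"
proj="$(lib_version PROJ "$line")"

if version_ge "$gdal" 2.0.1; then
  echo "GDAL $gdal meets the sf minimum (2.0.1)"
else
  echo "GDAL $gdal is older than the sf minimum (2.0.1)"
fi

# Flag the combination described above as problematic: GDAL 2.x with PROJ 6.x.
case "$gdal/$proj" in
  2.*/6.*) echo "warning: GDAL 2.x combined with PROJ 6.x" ;;
  *) echo "GDAL $gdal and PROJ $proj are not the known-bad 2.x/6.x pairing" ;;
esac
```

This only encodes the two rules stated in the comment; it says nothing about other version combinations.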

In the future, we (hopefully) will always recommend (and help) users to work with recent stable releases. We should work on making testing easier so we find problems earlier. You could for instance run all your book code in a Docker image with the most recent released versions of GDAL, GEOS, PROJ and the CRAN packages, checking all output diffs. Roger does a similar thing with ASDAR and R package CRAN updates.
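
A minimal sketch of the "checking all output diffs" step: compare output captured under one set of library versions with output captured under another. The file names and the `outputs_match` helper are hypothetical, and the two sample lines stand in for real captured R output (they reuse version strings quoted earlier in this thread).

```shell
#!/usr/bin/env bash
# outputs_match REFERENCE CANDIDATE: succeed when the two text files are identical;
# otherwise print a unified diff and fail.
outputs_match() {
  diff -u "$1" "$2"
}

# Hypothetical sample data standing in for captured R output.
workdir="$(mktemp -d)"
printf 'Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0\n' > "$workdir/reference.txt"
printf 'Linking to GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1\n' > "$workdir/candidate.txt"

if outputs_match "$workdir/reference.txt" "$workdir/candidate.txt"; then
  echo "outputs identical"
else
  echo "outputs differ: review the diff above"
fi

# Clean up the temporary files.
rm -rf "$workdir"
```

In a real set-up the two files would come from running the same scripts in two images; startup lines that are expected to differ, like the one here, would probably be filtered out before comparing.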

@Robinlovelace
Contributor Author

Robinlovelace commented Mar 14, 2020

By future proof I meant that they are likely to work with future releases of r-spatial packages for the next few years. Based on that definition and your response, it seems to me that the latest versions available from Ubuntu's default repositories, GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0 are future-proof 🚀 👍 🎸

> In the future, we (hopefully) will always recommend (and help) users to work with recent stable releases. We should work on making testing easier so we find problems earlier. You could for instance run all your book code in a Docker image with the most recent released versions of GDAL, GEOS, PROJ and the CRAN packages, checking all output diffs. Roger does a similar thing with ASDAR and R package CRAN updates.

This is massively appreciated and I fully agree about tests and test environments. I have created a new issue in the geocompr repo to track efforts towards tests on upstream OSGeo dependencies there: geocompx/geocompr#476. (For a while we joked that Geocompr was an unofficial test suite for tmap, and it would be great if we can continue to support the development of other R packages, even if it is in the rather unglamorous role of diagnosing bugs / asking questions ;) )

I've had a brief look at the multiple Dockerfile tags and test matrices used by the likes of QGIS (which is failing many tests at the time of writing) and PostGIS (with multiple tests using multiple upstream versions in Travis, seriously impressive, shown here) and definitely think this kind of thing is worth doing. Those other projects provide inspiration and some example code that may be useful!

@Robinlovelace
Contributor Author

Hi all, I've drafted a blog post on this: https://github.com/geocompr/geocompr.github.io/blob/installing-on-linux/content/post/2020/installing-r-spatial-packages-linux.Rmd

It started out being about installing R-spatial on Linux, but I found that too broad, so I have narrowed it down to Ubuntu, which I understand better and am more confident to comment on. Comments/suggestions very welcome.

@Robinlovelace
Contributor Author

For anyone watching this thread, here is the final version. Any suggested changes welcome before I put this out there: https://geocompr.github.io/post/2020/installing-r-spatial-ubuntu/

@florisvdh
Member

Hi @Robinlovelace, nice to see this tutorial! I don't know whether you want to involve QGIS and GRASS, but here's a problem that existed when using ppa:ubuntugis/ubuntugis-unstable about a month ago, although I expect this is now solved. Maybe you find it too specific, but anyway here it is.

The problem was encountered on Focal (my 'flavour' is Linux Mint 😸 ). The issue arose when trying to install qgis-plugin-grass with QGIS being installed from the qgis repo for Ubuntu (https://qgis.org/ubuntugis) + GRASS from ppa:ubuntugis/ubuntugis-unstable. The thing is, QGIS Focal packages were still missing at that time from the QGIS ubuntugis repo (https://qgis.org/ubuntugis), while the plugin in the regular QGIS ubuntu repo (https://qgis.org/ubuntu), which did have Focal packages, depended on an older GRASS version (7.8.2) than the one on the PPA. This gave a dependency version conflict.

So, unless one is prepared to build from source, I think what can be concluded is that the use of the PPA may be a limitation in installing other geospatial tools, apparently for some months after a new LTS release. At the time I remedied this by disabling the unstable PPA and using GRASS etc. from Ubuntu itself (7.8.2).

Now it seems that Focal packages have come up in the qgis ubuntugis repo 👍 , so no more problem with the PPA I expect.

@Robinlovelace
Contributor Author

Hi @florisvdh, thanks for raising this. I have noticed that there have only been 3 packages on the ubuntugis-unstable PPA for some time and that still seems to be the case: https://launchpad.net/~ubuntugis/+archive/ubuntu/ubuntugis-unstable?field.series_filter=focal

I have also noticed issues with QGIS and focal. Reading your message and the info at https://www.qgis.org/en/site/forusers/alldownloads.html#repositories it seems like https://qgis.org/ubuntu for the latest release and https://qgis.org/debian-ltr for the LTR are the way to go for stability on Focal Fossa. I guess the advice to use ppa:ubuntugis/ubuntugis-unstable is no longer sensible for Focal Fossa, would you agree?

Note there is also a long thread discussing this in the context of the rocker/geospatial Docker image: rocker-org/geospatial#31. I would like to use this as the basis of the geocompr/geocompr image described here: https://github.com/geocompr/docker so I am very interested to hear of other experiences. Overall, I think the Focal Fossa default repos are sufficiently up-to-date and stable, but I would suggest keeping an eye on this PPA also (still QGIS/GDAL it seems): https://launchpad.net/~osgeolive/+archive/ubuntu/nightly?field.series_filter=focal

@florisvdh
Member

florisvdh commented Sep 21, 2020

Thanks for these pointers @Robinlovelace. Indeed, I had only looked at the qgis-grass dependency problem. GRASS for Focal is maintained on the unstable PPA, but overall Focal packages are almost absent there (I didn't realize that; thanks for emphasizing!). So it's a pity that it's currently easier to get PROJ 7 on Bionic than on Focal.

> I guess the advice to use ppa:ubuntugis/ubuntugis-unstable is no longer sensible for Focal Fossa, would you agree?

I agree that currently there's not much gain in using it except for GRASS (7.8.2 -> 7.8.3), with QGIS (from their own ubuntugis repo) supporting the unstable PPA. IMO it will not do any harm to add it either: if one day more recent Focal packages do get added to the unstable PPA, they will be preferred by the update manager; if not, it will not impact the system and the Ubuntu repo will be used instead.

Just successfully updated QGIS & GRASS to work with unstable PPA:

[image]

$ export LANGUAGE="en_US:en"
$ apt policy grass
grass:
  Installed: 7.8.3-1~focal1
  Candidate: 7.8.3-1~focal1
  Version table:
 *** 7.8.3-1~focal1 500
        500 http://ppa.launchpad.net/ubuntugis/ubuntugis-unstable/ubuntu focal/main amd64 Packages
        500 http://ppa.launchpad.net/ubuntugis/ubuntugis-unstable/ubuntu focal/main i386 Packages
        100 /var/lib/dpkg/status
     7.8.2-1build3 500
        500 http://ftp.belnet.be/ubuntu focal/universe amd64 Packages
        500 http://ftp.belnet.be/ubuntu focal/universe i386 Packages
$ apt policy qgis
qgis:
  Installed: 1:3.14.16+32focal-ubuntugis
  Candidate: 1:3.14.16+32focal-ubuntugis
  Version table:
 *** 1:3.14.16+32focal-ubuntugis 500
        500 https://qgis.org/ubuntugis focal/main amd64 Packages
        100 /var/lib/dpkg/status
     3.10.9+dfsg-1~focal1 500
        500 http://ppa.launchpad.net/ubuntugis/ubuntugis-unstable/ubuntu focal/main amd64 Packages
     3.10.4+dfsg-1ubuntu2 500
        500 http://ftp.belnet.be/ubuntu focal/universe amd64 Packages
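
Output like the above can also be read from a script, for example to confirm that the installed version is the candidate apt would choose. This is a minimal sketch: the `policy_field` helper is hypothetical, and the sample text is copied from the `grass` output above so that it runs without apt. The field labels are only matched when the output is in English, which is presumably why `LANGUAGE` is exported in the transcript.

```shell
#!/usr/bin/env bash
# policy_field FIELD: read `apt policy` output on stdin and print the value of
# FIELD (for example Installed or Candidate).
policy_field() {
  awk -v key="$1:" '$1 == key { print $2; exit }'
}

# Sample copied from the grass output above, so the sketch runs without apt.
sample='grass:
  Installed: 7.8.3-1~focal1
  Candidate: 7.8.3-1~focal1'

installed="$(printf '%s\n' "$sample" | policy_field Installed)"
candidate="$(printf '%s\n' "$sample" | policy_field Candidate)"

if [ "$installed" = "$candidate" ]; then
  echo "grass is up to date at $installed"
else
  echo "grass can be upgraded from $installed to $candidate"
fi
```

On a live system the sample would be replaced by `apt-cache policy grass`, which prints the same fields and is intended for use in scripts.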

@florisvdh
Member

Splendid news regarding the ubuntugis-unstable PPA on Focal! Many new packages were added by Angelos Tzotsos. This means an upgrade of at least the software below.

[image]

For several libraries these versions currently exceed those for Bionic.

@florisvdh
Member

Please take note (message from the Ubuntu OSGeo mailing list):

Hi,

The transition is now over, please test the new packages in experimental
so we can move them to unstable soon.
https://launchpad.net/~ubuntugis/+archive/ubuntu/ubuntugis-experimental/+packages?field.name_filter=&field.status_filter=published&field.series_filter=

Best,
Angelos

On 12/22/21 1:24 PM, Angelos Tzotsos wrote:

Hi all,

I have started porting the latest GEOS, PROJ and GDAL to experimental
ppa for Focal.
Once this is completed, I will drop a note here before pushing to
Unstable.

Best,
Angelos
