Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

st_simplify produces error when specifying tolerance values per simple feature on S2 code path #2442

Closed
goergen95 opened this issue Sep 18, 2024 · 2 comments

Comments

@goergen95
Copy link
Contributor

goergen95 commented Sep 18, 2024

Describe the bug
I am trying to simplify polygons with a tolerance value based on the number of vertices per simple feature.
On the $S^2$ code path, this results in an error because s2::s2_simplify() seems to expect a single value for the tolerance parameter. However, the docs for sf::st_simplify() state:

dTolerance numeric; tolerance parameter, specified for all or for each feature geometry. [..]

To Reproduce

library(sf)
#> Linking to GEOS 3.12.2, GDAL 3.9.1, PROJ 9.4.1; sf_use_s2() is TRUE

sf_use_s2(TRUE)  

nc <- read_sf(system.file("shape/nc.shp", package="sf"))
nc <- st_transform(nc, st_crs("EPSG:4326"))
vertices <- as.numeric(table(st_coordinates(nc)[ ,5]))
nc2 <- st_simplify(nc, dTolerance = round(vertices/10))
#> Error in if (snap_radius > 3) {: the condition has length > 1

Created on 2024-09-18 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14)
#>  os       Ubuntu 22.04.4 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Etc/UTC
#>  date     2024-09-18
#>  pandoc   3.2 @ /usr/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  class         7.3-22  2023-05-03 [2] CRAN (R 4.4.1)
#>  classInt      0.4-10  2023-09-05 [1] CRAN (R 4.4.1)
#>  cli           3.6.3   2024-06-21 [1] RSPM (R 4.4.0)
#>  DBI           1.2.3   2024-06-02 [1] RSPM (R 4.4.0)
#>  digest        0.6.37  2024-08-19 [1] RSPM (R 4.4.0)
#>  e1071         1.7-14  2023-12-06 [1] CRAN (R 4.4.1)
#>  evaluate      0.24.0  2024-06-10 [1] RSPM (R 4.4.0)
#>  fansi         1.0.6   2023-12-08 [1] RSPM (R 4.4.0)
#>  fastmap       1.2.0   2024-05-15 [1] RSPM (R 4.4.0)
#>  fs            1.6.4   2024-04-25 [1] RSPM (R 4.4.0)
#>  glue          1.7.0   2024-01-09 [1] RSPM (R 4.4.0)
#>  htmltools     0.5.8.1 2024-04-04 [1] RSPM (R 4.4.0)
#>  KernSmooth    2.23-24 2024-05-17 [2] CRAN (R 4.4.1)
#>  knitr         1.48    2024-07-07 [1] RSPM (R 4.4.0)
#>  lifecycle     1.0.4   2023-11-07 [1] RSPM (R 4.4.0)
#>  magrittr      2.0.3   2022-03-30 [1] RSPM (R 4.4.0)
#>  pillar        1.9.0   2023-03-22 [1] RSPM (R 4.4.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] RSPM (R 4.4.0)
#>  proxy         0.4-27  2022-06-09 [1] CRAN (R 4.4.1)
#>  Rcpp          1.0.13  2024-07-17 [1] RSPM (R 4.4.0)
#>  reprex        2.1.0   2024-01-11 [1] RSPM (R 4.4.0)
#>  rlang         1.1.4   2024-06-04 [1] RSPM (R 4.4.0)
#>  rmarkdown     2.28    2024-08-17 [1] RSPM (R 4.4.0)
#>  rstudioapi    0.16.0  2024-03-24 [1] RSPM (R 4.4.0)
#>  s2            1.1.7   2024-07-17 [1] CRAN (R 4.4.1)
#>  sessioninfo   1.2.2   2021-12-06 [1] RSPM (R 4.4.0)
#>  sf          * 1.0-17  2024-08-29 [1] Github (r-spatial/sf@e5bce2e)
#>  tibble        3.2.1   2023-03-20 [1] RSPM (R 4.4.0)
#>  units         0.8-5   2023-11-28 [1] CRAN (R 4.4.1)
#>  utf8          1.2.4   2023-10-22 [1] RSPM (R 4.4.0)
#>  vctrs         0.6.5   2023-12-01 [1] RSPM (R 4.4.0)
#>  withr         3.0.1   2024-07-31 [1] RSPM (R 4.4.0)
#>  wk            0.9.2   2024-07-09 [1] CRAN (R 4.4.1)
#>  xfun          0.47    2024-08-17 [1] RSPM (R 4.4.0)
#>  yaml          2.3.10  2024-07-26 [1] RSPM (R 4.4.0)
#> 
#>  [1] /usr/local/lib/R/site-library
#>  [2] /usr/local/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
goergen95 added a commit to goergen95/sf that referenced this issue Sep 24, 2024
@edzer
Copy link
Member

edzer commented Sep 28, 2024

The PR looks sensible at first sight, but I wonder whether s2 does the simplification feature by feature (that is, does doing the loop in R instead of in s2 always result in the same output), and whether there is a performance loss. I would feel more comfortable to only use the mapply construct when dTolerance has a length > 1 (although the code is somewhat longer, then).

goergen95 added a commit to goergen95/sf that referenced this issue Oct 1, 2024
@goergen95
Copy link
Contributor Author

I adjusted the PR as you suggested. However, both of your hunches turned out to be true:
using multiple values for dTolerance on the s2 code path using mapply is very inefficient and does not return identical results. Not sure if there is much to do here on the R side and if it is worth to open an issue upstream for this feature?
If not, I could make this PR about changing sf documentation to make users aware that supplying multiple values for dTolerance when s2 is switched on is not allowed and maybe issue a warning?

Code Example
remotes::install_github("goergen95/sf", ref = "issue2442")
#> Skipping install of 'sf' from a github remote, the SHA1 (49b5fbd1) has not changed since last install.
#>   Use `force = TRUE` to force installation
library(sf)
#> Linking to GEOS 3.13.0, GDAL 3.9.2, PROJ 9.5.0; sf_use_s2() is TRUE
library(microbenchmark)

nc <- read_sf(system.file("shape/nc.shp", package="sf"))
nc <- st_transform(nc, st_crs("EPSG:4326"))

sf_use_s2(TRUE)  
microbenchmark(
    single_tolerance = {nc1 <- st_simplify(nc, dTolerance = 2500)},
    multi_tolerance = {nc2 <- st_simplify(nc, dTolerance = rep(2500, nrow(nc)))},
    times = 100
)
#> Unit: milliseconds
#>              expr      min        lq      mean    median        uq       max
#>  single_tolerance 18.02828  20.66847  24.17936  23.15384  27.14953  53.99792
#>   multi_tolerance 95.44030 104.83197 111.95288 111.30223 115.69527 162.41727
#>  neval cld
#>    100  a 
#>    100   b
identical(nc1, nc2)
#> [1] FALSE

sf_use_s2(FALSE)
#> Spherical geometry (s2) switched off
suppressWarnings({
    microbenchmark(
        single_tolerance = {nc3 <- st_simplify(nc, dTolerance = 0.025)},
        multi_tolerance = {nc4 <- st_simplify(nc, dTolerance = rep(0.025, nrow(nc)))},
        times = 100
    )})
#> Unit: milliseconds
#>              expr      min       lq     mean   median       uq      max neval
#>  single_tolerance 2.358385 2.825957 3.870791 3.612138 4.845731 6.692318   100
#>   multi_tolerance 2.316221 2.775396 3.899538 3.819797 4.909197 7.329169   100
#>  cld
#>    a
#>    a
identical(nc3, nc4)
#> [1] TRUE

Created on 2024-10-01 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14)
#>  os       Ubuntu 22.04.5 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Etc/UTC
#>  date     2024-10-01
#>  pandoc   3.2 @ /usr/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package        * version  date (UTC) lib source
#>  class            7.3-22   2023-05-03 [2] CRAN (R 4.4.1)
#>  classInt         0.4-10   2023-09-05 [1] CRAN (R 4.4.1)
#>  cli              3.6.3    2024-06-21 [1] RSPM (R 4.4.0)
#>  codetools        0.2-20   2024-03-31 [2] CRAN (R 4.4.1)
#>  curl             5.2.3    2024-09-20 [1] RSPM (R 4.4.0)
#>  DBI              1.2.3    2024-06-02 [1] RSPM (R 4.4.0)
#>  digest           0.6.37   2024-08-19 [1] RSPM (R 4.4.0)
#>  e1071            1.7-16   2024-09-16 [1] CRAN (R 4.4.1)
#>  evaluate         1.0.0    2024-09-17 [1] RSPM (R 4.4.0)
#>  fansi            1.0.6    2023-12-08 [1] RSPM (R 4.4.0)
#>  fastmap          1.2.0    2024-05-15 [1] RSPM (R 4.4.0)
#>  fs               1.6.4    2024-04-25 [1] RSPM (R 4.4.0)
#>  glue             1.7.0    2024-01-09 [1] RSPM (R 4.4.0)
#>  htmltools        0.5.8.1  2024-04-04 [1] RSPM (R 4.4.0)
#>  KernSmooth       2.23-24  2024-05-17 [2] CRAN (R 4.4.1)
#>  knitr            1.48     2024-07-07 [1] RSPM (R 4.4.0)
#>  lattice          0.22-6   2024-03-20 [2] CRAN (R 4.4.1)
#>  lifecycle        1.0.4    2023-11-07 [1] RSPM (R 4.4.0)
#>  magrittr         2.0.3    2022-03-30 [1] RSPM (R 4.4.0)
#>  MASS             7.3-60.2 2024-04-26 [2] CRAN (R 4.4.1)
#>  Matrix           1.7-0    2024-04-26 [2] CRAN (R 4.4.1)
#>  microbenchmark * 1.5.0    2024-09-04 [1] CRAN (R 4.4.1)
#>  multcomp         1.4-26   2024-07-18 [1] RSPM (R 4.4.0)
#>  mvtnorm          1.3-1    2024-09-03 [1] RSPM (R 4.4.0)
#>  pillar           1.9.0    2023-03-22 [1] RSPM (R 4.4.0)
#>  pkgconfig        2.0.3    2019-09-22 [1] RSPM (R 4.4.0)
#>  proxy            0.4-27   2022-06-09 [1] CRAN (R 4.4.1)
#>  purrr            1.0.2    2023-08-10 [1] RSPM (R 4.4.0)
#>  R.cache          0.16.0   2022-07-21 [1] RSPM (R 4.4.0)
#>  R.methodsS3      1.8.2    2022-06-13 [1] RSPM (R 4.4.0)
#>  R.oo             1.26.0   2024-01-24 [1] RSPM (R 4.4.0)
#>  R.utils          2.12.3   2023-11-18 [1] RSPM (R 4.4.0)
#>  Rcpp             1.0.13   2024-07-17 [1] RSPM (R 4.4.0)
#>  remotes          2.5.0    2024-03-17 [1] RSPM (R 4.4.0)
#>  reprex           2.1.0    2024-01-11 [1] RSPM (R 4.4.0)
#>  rlang            1.1.4    2024-06-04 [1] RSPM (R 4.4.0)
#>  rmarkdown        2.28     2024-08-17 [1] RSPM (R 4.4.0)
#>  rstudioapi       0.16.0   2024-03-24 [1] RSPM (R 4.4.0)
#>  s2               1.1.7    2024-07-17 [1] CRAN (R 4.4.1)
#>  sandwich         3.1-1    2024-09-15 [1] CRAN (R 4.4.1)
#>  sessioninfo      1.2.2    2021-12-06 [1] RSPM (R 4.4.0)
#>  sf             * 1.0-18   2024-10-01 [1] Github (goergen95/sf@49b5fbd)
#>  styler           1.10.3   2024-04-07 [1] RSPM (R 4.4.0)
#>  survival         3.6-4    2024-04-24 [2] CRAN (R 4.4.1)
#>  TH.data          1.1-2    2023-04-17 [1] RSPM (R 4.4.0)
#>  tibble           3.2.1    2023-03-20 [1] RSPM (R 4.4.0)
#>  units            0.8-5    2023-11-28 [1] CRAN (R 4.4.1)
#>  utf8             1.2.4    2023-10-22 [1] RSPM (R 4.4.0)
#>  vctrs            0.6.5    2023-12-01 [1] RSPM (R 4.4.0)
#>  withr            3.0.1    2024-07-31 [1] RSPM (R 4.4.0)
#>  wk               0.9.3    2024-09-06 [1] CRAN (R 4.4.1)
#>  xfun             0.47     2024-08-17 [1] RSPM (R 4.4.0)
#>  yaml             2.3.10   2024-07-26 [1] RSPM (R 4.4.0)
#>  zoo              1.8-12   2023-04-13 [1] CRAN (R 4.4.1)
#> 
#>  [1] /usr/local/lib/R/site-library
#>  [2] /usr/local/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

edzer added a commit that referenced this issue Oct 1, 2024
@edzer edzer closed this as completed Oct 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants