Implement promote_to_multi when converting WKB to sfc #2369

Open
wants to merge 12 commits into
base: main

paleolimbot
Contributor

@paleolimbot paleolimbot commented Mar 31, 2024

Apologies for taking so long to circle back here, but this is one approach to ensuring that R CMD check passes when R_SF_ST_READ_USE_STREAM=true. It also has the nice side-effect that use_stream = TRUE no longer uses the wk package to construct sfc objects and that all users of st_as_sfc(<WKB>) can now use promote_to_multi = TRUE to get the same read-ogr-like behaviour.

Closes #2296.

Example:

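library(sf)

# A mixed vector: two points, one empty point, and one multipoint.
vec <- st_sfc(
    st_point(c(0, 1)),
    st_point(),
    st_point(c(2, 3)),
    st_multipoint(matrix(c(4, 5), ncol = 2))
)

# Serialize to well-known binary (WKB).
wkb_vec <- st_as_binary(vec)

# Read the WKB back, first without and then with promotion of
# single-part geometries to their multi-part equivalents.
st_as_sfc(wkb_vec, promote_to_multi = FALSE)
st_as_sfc(wkb_vec, promote_to_multi = TRUE)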
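
As a rough picture of what "promotion" means, the same idea can be shown with sf's existing st_cast(). This is only an illustrative sketch on a small, non-empty input; it is not the code path this PR adds, and the variable names are made up for the example:

```r
library(sf)

# Two single-part POINT geometries (an sfc_POINT vector).
pts <- st_sfc(st_point(c(0, 1)), st_point(c(2, 3)))

# Cast each POINT to a one-point MULTIPOINT. With the default `ids`,
# geometries are not grouped, so the vector keeps its length.
promoted <- st_cast(pts, "MULTIPOINT")

class(promoted)[1]
length(promoted)
```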

I do still see:

Failure (test-tm.R:20:3): st_read and write handle date and time
x[["tm"]] not equal to x2[["tm"]].
Attributes: < Component “tzone”: 1 string mismatch >

Failure (test-tm.R:40:3): st_read and write handle date and time
x[["tm"]] not equal to x2[["tm"]].
Attributes: < Component “tzone”: 1 string mismatch >

...when running testthat::test_local() with R_SF_ST_READ_USE_STREAM=true. I think this happens because of how nanoarrow converts timestamps without an explicit timezone to R objects: nanoarrow assigns UTC as opposed to setting tzone = "" or omitting it. I copied this behaviour from readr because it is more reproducible between systems, but I'm not sure it's any more or less correct.
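
A minimal base-R sketch of why such a comparison can fail even when the instants agree. The two vectors are hypothetical stand-ins for x[["tm"]] and x2[["tm"]], built by hand rather than read with sf or nanoarrow:

```r
# Seconds since the epoch for 2024-03-31 12:00:00 UTC.
secs <- 1711886400

# Same instant, different "tzone" attribute: explicit UTC versus the empty
# string (which means "display in the session's local time zone").
tm_utc   <- structure(secs, class = c("POSIXct", "POSIXt"), tzone = "UTC")
tm_local <- structure(secs, class = c("POSIXct", "POSIXt"), tzone = "")

# The underlying numbers are equal...
same_instant <- as.numeric(tm_utc) == as.numeric(tm_local)

# ...but the attribute differs, so strict comparisons report a mismatch.
same_tzone <- identical(attr(tm_utc, "tzone"), attr(tm_local, "tzone"))

same_instant
same_tzone
```

If the tests only care about the instant, comparing the numeric values (or normalizing the tzone attribute on both sides first) would sidestep the mismatch; whether that is the right fix here is a separate question.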

@paleolimbot paleolimbot changed the title from "Implement promote_multi when converting WKB to sfc" to "Implement promote_to_multi when converting WKB to sfc" Mar 31, 2024
@paleolimbot paleolimbot marked this pull request as ready for review March 31, 2024 20:26
@edzer
Member

edzer commented Jul 23, 2024

There are still several tests failing for me on this PR:

R CMD build sf --no-build-vignettes --resave-data
* checking for file ‘sf/DESCRIPTION’ ... OK
* preparing ‘sf’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* running ‘cleanup’
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘sf_1.0-17.tar.gz’

_R_CHECK_TESTS_NLINES_=0 USER=CRAN R CMD check sf_*gz
* using log directory ‘/home/edzer/git/sf.Rcheck’
* using R version 4.4.0 (2024-04-24)
* using platform: x86_64-pc-linux-gnu
* R was compiled by
    gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
    GNU Fortran (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
* running under: Ubuntu 22.04.4 LTS
* using session charset: UTF-8
* checking for file ‘sf/DESCRIPTION’ ... OK
* this is package ‘sf’ version ‘1.0-17’
* package encoding: UTF-8
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘sf’ can be installed ... OK
* used C++ compiler: ‘g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0’
* checking installed package size ... NOTE
  installed size is 21.6Mb
  sub-directories of 1Mb or more:
    libs    16.8Mb
    sqlite   1.5Mb
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ...Warning: program compiled against libxml 210 using older 209
 OK
* checking code files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking whether startup messages can be suppressed ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking line endings in shell scripts ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking line endings in Makefiles ... OK
* checking compilation flags in Makevars ... OK
* checking for GNU extensions in Makefiles ... OK
* checking for portable use of $(BLAS_LIBS) and $(LAPACK_LIBS) ... OK
* checking use of PKG_*FLAGS in Makefiles ... OK
* checking compiled code ... OK
* checking files in ‘vignettes’ ... WARNING
Files in the 'vignettes' directory but no files in 'inst/doc':
  ‘sf.fig’ ‘sf1.Rmd’ ‘sf2.Rmd’ ‘sf2.png’ ‘sf3.Rmd’ ‘sf4.Rmd’ ‘sf5.Rmd’
  ‘sf6.Rmd’ ‘sf7.Rmd’ ‘sf_xfig.png’
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘aggregate.R’
  Comparing ‘aggregate.Rout’ to ‘aggregate.Rout.save’ ... OK
  Running ‘cast.R’
  Comparing ‘cast.Rout’ to ‘cast.Rout.save’ ... OK
  Running ‘crs.R’
  Comparing ‘crs.Rout’ to ‘crs.Rout.save’ ... OK
  Running ‘dist.R’
  Comparing ‘dist.Rout’ to ‘dist.Rout.save’ ... OK
  Running ‘dplyr.R’
  Comparing ‘dplyr.Rout’ to ‘dplyr.Rout.save’ ... OK
  Running ‘empty.R’
  Comparing ‘empty.Rout’ to ‘empty.Rout.save’ ... OK
  Running ‘gdal_geom.R’
  Comparing ‘gdal_geom.Rout’ to ‘gdal_geom.Rout.save’ ... OK
  Running ‘geos.R’
  Comparing ‘geos.Rout’ to ‘geos.Rout.save’ ... OK
  Running ‘graticule.R’
  Comparing ‘graticule.Rout’ to ‘graticule.Rout.save’ ... OK
  Running ‘grid.R’
  Comparing ‘grid.Rout’ to ‘grid.Rout.save’ ... OK
  Running ‘maps.R’
  Comparing ‘maps.Rout’ to ‘maps.Rout.save’ ... OK
  Running ‘plot.R’
  Comparing ‘plot.Rout’ to ‘plot.Rout.save’ ... OK
  Running ‘read.R’
  Comparing ‘read.Rout’ to ‘read.Rout.save’ ...39c39
< [1] 1028678842
---
> [1] "1028678842"
214a215,216
> Integer64 values larger than 9.0072e+15 lost significance after conversion to double;
> use argument int64_as_string = TRUE to import them lossless, as character
228c230
< 1 1 4.611686e+18 POINT (0 1)
---
> 1 1 4611686018427387904 POINT (0 1)
259c261
< [1] FALSE
---
> [1] TRUE
  Running ‘roundtrip.R’
  Comparing ‘roundtrip.Rout’ to ‘roundtrip.Rout.save’ ... OK
  Running ‘s2.R’
  Comparing ‘s2.Rout’ to ‘s2.Rout.save’ ... OK
  Running ‘sample.R’
  Comparing ‘sample.Rout’ to ‘sample.Rout.save’ ... OK
  Running ‘sfc.R’
  Comparing ‘sfc.Rout’ to ‘sfc.Rout.save’ ...421d420
<   .. ..- attr(*, "class")= chr [1:3] "XY" "POLYGON" "sfg"
469c468
<   User input: GEOGCS["NAD27",DATUM["North_American_Datum_1927",SPHEROID["Clarke 1866",6378206.4,294.978698213898,AUTHORITY["EPSG","7008"]],AUTHORITY["EPSG","6267"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4267"]] 
---
>   User input: NAD27 
478c477
<         AXIS["geodetic latitude (Lat)",north,
---
>         AXIS["latitude",north,
481c480
<         AXIS["geodetic longitude (Lon)",east,
---
>         AXIS["longitude",east,
  Running ‘sfg.R’
  Comparing ‘sfg.Rout’ to ‘sfg.Rout.save’ ... OK
  Running ‘spatstat.R’
  Comparing ‘spatstat.Rout’ to ‘spatstat.Rout.save’ ... OK
  Running ‘stars.R’
  Comparing ‘stars.Rout’ to ‘stars.Rout.save’ ... OK
  Running ‘testthat.R’
 ERROR
Running the tests in ‘tests/testthat.R’ failed.
Complete output:
  > if (require(testthat, quietly = TRUE)) {
  +  suppressPackageStartupMessages(library(sf))
  +  test_check("sf")
  + }
  Coordinate Reference System:
    User input: EPSG:4326 
    wkt:
  GEOGCRS["WGS 84",
      ENSEMBLE["World Geodetic System 1984 ensemble",
          MEMBER["World Geodetic System 1984 (Transit)"],
          MEMBER["World Geodetic System 1984 (G730)"],
          MEMBER["World Geodetic System 1984 (G873)"],
          MEMBER["World Geodetic System 1984 (G1150)"],
          MEMBER["World Geodetic System 1984 (G1674)"],
          MEMBER["World Geodetic System 1984 (G1762)"],
          MEMBER["World Geodetic System 1984 (G2139)"],
          ELLIPSOID["WGS 84",6378137,298.257223563,
              LENGTHUNIT["metre",1]],
          ENSEMBLEACCURACY[2.0]],
      PRIMEM["Greenwich",0,
          ANGLEUNIT["degree",0.0174532925199433]],
      CS[ellipsoidal,2],
          AXIS["geodetic latitude (Lat)",north,
              ORDER[1],
              ANGLEUNIT["degree",0.0174532925199433]],
          AXIS["geodetic longitude (Lon)",east,
              ORDER[2],
              ANGLEUNIT["degree",0.0174532925199433]],
      USAGE[
          SCOPE["Horizontal component of 3D system."],
          AREA["World."],
          BBOX[-90,-180,90,180]],
      ID["EPSG",4326]]
  Coordinate Reference System:
    No user input
    wkt:
  GEOGCRS["WGS 84",
      ENSEMBLE["World Geodetic System 1984 ensemble",
          MEMBER["World Geodetic System 1984 (Transit)"],
          MEMBER["World Geodetic System 1984 (G730)"],
          MEMBER["World Geodetic System 1984 (G873)"],
          MEMBER["World Geodetic System 1984 (G1150)"],
          MEMBER["World Geodetic System 1984 (G1674)"],
          MEMBER["World Geodetic System 1984 (G1762)"],
          MEMBER["World Geodetic System 1984 (G2139)"],
          ELLIPSOID["WGS 84",6378137,298.257223563,
              LENGTHUNIT["metre",1]],
          ENSEMBLEACCURACY[2.0]],
      PRIMEM["Greenwich",0,
          ANGLEUNIT["degree",0.0174532925199433]],
      CS[ellipsoidal,2],
          AXIS["geodetic latitude (Lat)",north,
              ORDER[1],
              ANGLEUNIT["degree",0.0174532925199433]],
          AXIS["geodetic longitude (Lon)",east,
              ORDER[2],
              ANGLEUNIT["degree",0.0174532925199433]],
      USAGE[
          SCOPE["Horizontal component of 3D system."],
          AREA["World."],
          BBOX[-90,-180,90,180]],
      ID["EPSG",4326]]
  Cannot open layer foo
  Reading layer `nospatial' from data source 
    `/home/edzer/git/sf.Rcheck/sf/gpkg/nospatial.gpkg' using driver `GPKG'
  Reading layer `nospatial' from data source 
    `/home/edzer/git/sf.Rcheck/sf/gpkg/nospatial.gpkg' using driver `GPKG'
  Reading layer `nc' from data source `/home/edzer/git/sf.Rcheck/sf/shape/nc.shp' using driver `ESRI Shapefile'
  Simple feature collection with 100 features and 14 fields
  Geometry type: MULTIPOLYGON
  Dimension:     XY
  Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
  Geodetic CRS:  NAD27
  OGR: Unsupported geometry type
  Failed to create feature 1 in x
  Failed to create feature 1 in x
  [ FAIL 4 | WARN 4 | SKIP 43 | PASS 884 ]
  
  ══ Skipped tests (43) ══════════════════════════════════════════════════════════
  • On CRAN (1): 'test-gdal.R:101:5'
  • Sys.getenv("USER") %in% c("edzer", "travis") is not TRUE (3):
    'test-gdal.R:55:3', 'test-write.R:47:3', 'test-write.R:130:3'
  • could not connect to 'empty' database (1): 'test-postgis_RPostgres.R:362:5'
  • could not connect to postgis database (25): 'test-postgis_ODBC.R:33:2',
    'test-postgis_ODBC.R:46:2', 'test-postgis_ODBC.R:68:2',
    'test-postgis_ODBC.R:80:2', 'test-postgis_ODBC.R:103:2',
    'test-postgis_ODBC.R:124:2', 'test-postgis_ODBC.R:137:2',
    'test-postgis_ODBC.R:168:2', 'test-postgis_ODBC.R:191:2',
    'test-postgis_ODBC.R:252:2', 'test-postgis_ODBC.R:267:2',
    'test-postgis_ODBC.R:281:2', 'test-postgis_RPostgreSQL.R:33:2',
    'test-postgis_RPostgreSQL.R:46:2', 'test-postgis_RPostgreSQL.R:69:2',
    'test-postgis_RPostgreSQL.R:84:2', 'test-postgis_RPostgreSQL.R:96:2',
    'test-postgis_RPostgreSQL.R:120:2', 'test-postgis_RPostgreSQL.R:141:2',
    'test-postgis_RPostgreSQL.R:154:2', 'test-postgis_RPostgreSQL.R:194:2',
    'test-postgis_RPostgreSQL.R:217:2', 'test-postgis_RPostgreSQL.R:278:2',
    'test-postgis_RPostgreSQL.R:302:2', 'test-postgis_RPostgreSQL.R:316:2'
  • empty test (7): 'test-crs.R:66:1', 'test-crs.R:99:1',
    'test-normalize.R:29:1', 'test-proj.R:1:1', 'test-read.R:7:1',
    'test-s2.R:20:1', 'test-sf.R:52:1'
  • sf_extSoftVersion()[["GDAL"]] < "2.5.0" && sf_extSoftVersion()[["proj.4"]] <
    (1): 'test-crs.R:61:3'
  • sf_extSoftVersion()[["GDAL"]] >= "2.5.0" is TRUE (1): 'test-gdal.R:46:3'
  • sf_extSoftVersion()[["proj.4"]] >= "6.0.0" is TRUE (3): 'test-crs.R:44:3',
    'test-crs.R:49:3', 'test-crs.R:55:3'
  • sf_use_s2() is TRUE (1): 'test-geos.R:20:5'
  
  ══ Failed tests ════════════════════════════════════════════════════════════════
  ── Failure ('test-postgis_RPostgres.R:87:5'): can handle multiple geom columns ──
  st_crs(x[["geometry.1"]]) == st_crs(multi[["geometry.1"]]) is not TRUE
  
  `actual`:   FALSE
  `expected`: TRUE 
  ── Failure ('test-postgis_RPostgres.R:100:5'): can handle multiple geom columns ──
  st_crs(x[["geometry.1"]]) not equal to st_crs(multi2[["geometry.1"]]).
  Component "input": 'is.NA' value mismatch: 1 in current 0 in target
  Component "wkt": 'is.NA' value mismatch: 1 in current 0 in target
  ── Failure ('test-tm.R:20:3'): st_read and write handle date and time ──────────
  x[["tm"]] not equal to x2[["tm"]].
  Attributes: < Component "tzone": 1 string mismatch >
  ── Failure ('test-tm.R:40:3'): st_read and write handle date and time ──────────
  x[["tm"]] not equal to x2[["tm"]].
  Attributes: < Component "tzone": 1 string mismatch >
  
  [ FAIL 4 | WARN 4 | SKIP 43 | PASS 884 ]
  Error: Test failures
  Execution halted
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes ... WARNING
Directory 'inst/doc' does not exist.
Package vignettes without corresponding single PDF/HTML:
  ‘sf1.Rmd’
  ‘sf2.Rmd’
  ‘sf3.Rmd’
  ‘sf4.Rmd’
  ‘sf5.Rmd’
  ‘sf6.Rmd’
  ‘sf7.Rmd’
* checking re-building of vignette outputs ... OK
* checking PDF version of manual ... OK
* DONE

Status: 1 ERROR, 2 WARNINGs, 1 NOTE

Are these expected?
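
For context on the read.Rout differences in the log: the Integer64 message refers to the 53-bit significand of R doubles, which is where the 9.0072e+15 threshold comes from. A small base-R sketch of that limit (it does not call sf):

```r
# Doubles represent every integer exactly only up to 2^53 (about 9.0072e+15).
limit <- 2^53

# Beyond that, neighbouring integers collapse onto the same double.
collides <- (limit + 1) == limit

format(limit, scientific = FALSE)
collides
```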

@paleolimbot
Contributor Author

paleolimbot commented Jul 23, 2024

> There are still several tests failing for me on this PR:

I'll revisit in a bit! These are failing tests when specifically using the GDAL stream API, correct? If so, I think it is because nanoarrow converts "timezoneless" timestamps to R by assigning UTC as the time zone (for portability between the same code running on two computers; readr does this too). I may be remembering the details incorrectly!

(I'm not sure about the CRS test case!)
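
One way to check whether the failures are specific to the stream reader is to run the test suite with the environment variable set only for the duration of the call. A sketch in base R; with_stream_reads() is a hypothetical helper for illustration, not part of sf:

```r
# Hypothetical helper: evaluate `code` with R_SF_ST_READ_USE_STREAM=true,
# then restore the previous value (or unset it if it was not set before).
with_stream_reads <- function(code) {
  old <- Sys.getenv("R_SF_ST_READ_USE_STREAM", unset = NA)
  Sys.setenv(R_SF_ST_READ_USE_STREAM = "true")
  on.exit(
    if (is.na(old)) {
      Sys.unsetenv("R_SF_ST_READ_USE_STREAM")
    } else {
      Sys.setenv(R_SF_ST_READ_USE_STREAM = old)
    },
    add = TRUE
  )
  # `code` is a lazily evaluated argument, so it runs here, after the
  # variable has been set.
  force(code)
}

# Usage from the package root (requires testthat):
# with_stream_reads(testthat::test_local())
```

Running the same tests once with and once without the wrapper should show which of the four failures depend on the stream API.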

Successfully merging this pull request may close these issues.

R CMD check with R_SF_ST_READ_USE_STREAM=true
2 participants