Deprecation of shapefiles #62

Open
Nowosad opened this issue Nov 17, 2022 · 19 comments

Comments

@Nowosad
Owner

Nowosad commented Nov 17, 2022

@rsbivand As we discussed last month, several shapefiles exist in the package (https://github.com/Nowosad/spData/tree/master/inst/shapes) that we may consider deprecating. However, I do not know how to do it properly (the deprecation process in R is fairly well documented for functions, but not for external data). Do you know how best to do it?

@rsbivand
Contributor

rsbivand commented Nov 24, 2022

I think we'll have to find ways of creating GPKG for:

  • auckland
  • baltim
  • boston_tracts
  • columbus
  • eire
  • sids
  • wheat

where several have no known CRS. Then update the help pages to use the GPKG, maybe adding a note that the shapefiles may be withdrawn at a forthcoming release. A similar message might be added to the package startup messages, but that might wait until the release before they are dropped. We'll need to watch the size of the package.
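
A minimal sketch of the conversion step described above, using GDAL's `ogr2ogr` from the shell. This is not necessarily how the GPKG files in spData were produced; the function name `shp_to_gpkg` and the file names in the usage note are hypothetical, and the optional CRS argument should only be supplied when the correct CRS is actually known.

```shell
# Sketch: convert one shapefile to a GeoPackage with GDAL's ogr2ogr.
# shp_to_gpkg is a hypothetical helper name, not part of spData or GDAL.

shp_to_gpkg() {
  # $1: input .shp, $2: output .gpkg, $3: optional CRS such as an EPSG code
  src=$1
  dst=$2
  crs=${3:-}
  if ! command -v ogr2ogr >/dev/null 2>&1; then
    echo "ogr2ogr not found; install GDAL first" >&2
    return 127
  fi
  if [ -n "$crs" ]; then
    # -a_srs assigns a CRS without reprojecting the coordinates
    ogr2ogr -f GPKG -a_srs "$crs" "$dst" "$src"
  else
    ogr2ogr -f GPKG "$dst" "$src"
  fi
}
```

A call such as `shp_to_gpkg columbus.shp columbus.gpkg` would write the new file next to the old one. Leaving the third argument empty keeps the CRS undefined, which seems the safer default for the datasets with no known CRS.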

@rsbivand
Contributor

See also: r-spatial/sf#2049

@rsbivand
Contributor

@Nowosad may I move on this? Should we retain the shapefiles or replace them? Should I try to instrument revdeps?

@Nowosad
Owner Author

Nowosad commented May 15, 2024

@rsbivand yes, please! I think we should aim to remove the shapefiles and replace them with, e.g., GPKGs. Running revdep checks should be useful for assessing the impact of the change.

Also -- I've noticed that shapefiles are used quite a lot in the docs of sf (see https://github.com/r-spatial/sf/blob/main/R/read.R and https://github.com/r-spatial/sf/tree/main/inst/shape). Do you think it would be worthwhile to deprecate shapefiles there as well?

@rsbivand
Contributor

Re. sf, reasonable, I'll raise an issue there.

@rsbivand
Contributor

rsbivand commented May 31, 2024

@Nowosad GPKG created for all shapefiles but baltim, which is arguably redundant.

I suggest submitting this soon; it passes R CMD check --as-cran with R 4.4.0 and a recent R-devel. Once it is on CRAN, I'll start the reverse dependency checks, which will involve scanning source packages for patterns like .shp", package="spData" to see which packages are obviously affected, then writing to those maintainers. Suggested timeline: the shapefiles move from inst/shapes to inst/shapes/shp 3 months after 2.3.1 is on CRAN, and inst/shapes/shp is removed 3 months after that. If anyone really needs to read a shapefile, we retain only the needed shapefiles in inst/shapes/shp. Comments, please!
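
The scanning step could look roughly like the following shell sketch. It assumes the reverse dependencies have been unpacked into one sub-directory per package; the helper name `scan_shp_refs`, the fixture packages `pkgA` and `pkgB`, and the choice of file extensions are all hypothetical. The regular expression `shp.*spData` is a loose form of the pattern quoted above.

```shell
# Sketch: list lines in unpacked package sources that read a shapefile
# shipped with spData. The directory layout is an assumption.

scan_shp_refs() {
  # $1: directory holding unpacked package sources
  # Prints "file:line:text" for each hit and nothing when there are none.
  grep -rn --include='*.Rd' --include='*.R' --include='*.Rmd' \
    'shp.*spData' "$1" || true
}

# Tiny self-made fixture (hypothetical packages, not real CRAN sources).
demo=$(mktemp -d)
mkdir -p "$demo/pkgA/man" "$demo/pkgB/R"
printf '%s\n' 'x <- st_read(system.file("shapes/columbus.shp", package="spData")[1])' \
  > "$demo/pkgA/man/foo.Rd"
printf '%s\n' 'data(world, package = "spData")' > "$demo/pkgB/R/bar.R"

scan_shp_refs "$demo"
```

Only the `pkgA` file is reported, because `pkgB` uses a packaged dataset rather than a file under `shapes/`. A pattern this loose can over-match, so the hits still need to be read by eye before writing to maintainers.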

@rsbivand
Contributor

rsbivand commented Jun 5, 2024

The possible "most" dependent packages were:

"apsimx" us_states
"bayesTFR" world
"bispdep" (shp below)
"classInt" jenks71, afcon
"echelon" nc.sids
"epiR" (shp below)
"geonetwork" world
"GWmodel" boston
"GWnnegPCA" shapes/boston_tracts.shp (split line) gw_nsprcomp.Rd
"MainExistingDatasets" world
"oceanic" shapes/world.gpkg
"PopGenHelpR" world, us_states
"R2BayesX" (shp below)
"raybevel" us_states
"rayrender" us_states
"rcartocolor" world
"rflexscan" (shp below)
"RPyGeo" nz
"scgwr" boston
"spatialreg" (shp below)
"spdep" (shp below)
"spgwr" (shp below)
"sphet" (shp below)
"sqlhelper" congruent, incongruent
"TeachingDemos" world, state.vbm
"tilemaps" us_states
"varycoef" house

but of these only the following (plus GWnnegPCA above) had hits on system(paste0("grep 'shp.*spData' ", dwn[i,1], "/*/*", sep="")):

[[1]]
character(0)
attr(,"status")
[1] 2

[[2]]
character(0)
attr(,"status")
[1] 2

[[3]]
 [1] "bispdep/man/connectivity.map.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"
 [2] "bispdep/man/correlogram.bi.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"  
 [3] "bispdep/man/getis.cluster.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"   
 [4] "bispdep/man/localmoran.bi.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"   
 [5] "bispdep/man/moranbi.cluster.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r" 
 [6] "bispdep/man/moranbi.plot.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"    
 [7] "bispdep/man/moranbir.test.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"   
 [8] "bispdep/man/moranbi.test.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"    
 [9] "bispdep/man/moran.cluster.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"   
[10] "bispdep/man/spcorrelogram.bi.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"

[[4]]
character(0)
attr(,"status")
[1] 2

[[5]]
character(0)
attr(,"status")
[1] 1

[[6]]
[1] "epiR/vignettes/epiR_descriptive.Rmd:ncsidsll.sf <- st_read(dsn = system.file(\"shapes/sids.shp\", package = \"spData\")[1])\r"
attr(,"status")
[1] 2

[[7]]
character(0)
attr(,"status")
[1] 2

[[8]]
character(0)
attr(,"status")
[1] 2

[[9]]
character(0)
attr(,"status")
[1] 1

[[10]]
character(0)
attr(,"status")
[1] 2

[[11]]
character(0)
attr(,"status")
[1] 1

[[12]]
character(0)
attr(,"status")
[1] 2

[[13]]
[1] "R2BayesX/man/nbAndGraConversion.Rd:  columbus <- readOGR(system.file(\"shapes/columbus.shp\", package=\"spData\")[1])"
attr(,"status")
[1] 2

[[14]]
character(0)
attr(,"status")
[1] 1

[[15]]
character(0)
attr(,"status")
[1] 2

[[16]]
character(0)
attr(,"status")
[1] 2

[[17]]
[1] "rflexscan/man/choropleth.Rd:sids.shp <- read_sf(system.file(\"shapes/sids.shp\", package=\"spData\")[1])\r"
[2] "rflexscan/R/flexscan.R:#' sids.shp <- read_sf(system.file(\"shapes/sids.shp\", package=\"spData\")[1])\r"  

[[18]]
character(0)
attr(,"status")
[1] 2

[[19]]
character(0)
attr(,"status")
[1] 1

[[20]]
 [1] "spatialreg/man/aple.mc.Rd:wheat <- st_read(system.file(\"shapes/wheat.shp\", package=\"spData\")[1], quiet=TRUE)"               
 [2] "spatialreg/man/aple.plot.Rd:wheat <- st_read(system.file(\"shapes/wheat.shp\", package=\"spData\")[1], quiet=TRUE)"             
 [3] "spatialreg/man/aple.Rd:wheat <- st_read(system.file(\"shapes/wheat.shp\", package=\"spData\")[1], quiet=TRUE)"                  
 [4] "spatialreg/man/impacts.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
 [5] "spatialreg/man/ME.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"              
 [6] "spatialreg/man/ME.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"                   
 [7] "spatialreg/man/sarlm_tests.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
 [8] "spatialreg/man/sparse_mat.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"      
 [9] "spatialreg/man/SpatialFiltering.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"
[10] "spatialreg/man/spautolm.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"             
[11] "spatialreg/man/trW.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"             
[12] "spatialreg/vignettes/nb_igraph.Rmd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1])"            
[13] "spatialreg/vignettes/sids_models.Rmd:nc <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"        
[14] "spatialreg/vignettes/SpatialFiltering.Rmd:NY8 <- st_read(system.file(\"shapes/NY8_utm18.shp\", package=\"spData\"))"            
attr(,"status")
[1] 2

[[21]]
 [1] "spdep/man/autocov_dist.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"   
 [2] "spdep/man/choynowski.Rd:auckland <- st_read(system.file(\"shapes/auckland.shp\", package=\"spData\")[1], quiet=TRUE)"     
 [3] "spdep/man/columbus.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"       
 [4] "spdep/man/compon.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
 [5] "spdep/man/diffnb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
 [6] "spdep/man/dnearneigh.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
 [7] "spdep/man/EBest.Rd:auckland <- st_read(system.file(\"shapes/auckland.shp\", package=\"spData\")[1], quiet=TRUE)"          
 [8] "spdep/man/EBImoran.mc.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"         
 [9] "spdep/man/EBlocal.Rd:auckland <- st_read(system.file(\"shapes/auckland.shp\", package=\"spData\")[1], quiet=TRUE)"        
[10] "spdep/man/edit.nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"        
[11] "spdep/man/globalG.test.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"        
[12] "spdep/man/graphneigh.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
[13] "spdep/man/include.self.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"   
[14] "spdep/man/joincount.multi.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"
[15] "spdep/man/joincount.test.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)" 
[16] "spdep/man/knearneigh.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
[17] "spdep/man/knn2nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
[18] "spdep/man/listw2sn.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"       
[19] "spdep/man/lm.morantest.exact.Rd:eire <- st_read(system.file(\"shapes/eire.shp\", package=\"spData\")[1])"                 
[20] "spdep/man/lm.morantest.sad.Rd:eire <- st_read(system.file(\"shapes/eire.shp\", package=\"spData\")[1])"                   
[21] "spdep/man/localGS.Rd:boston.tr <- sf::st_read(system.file(\"shapes/boston_tracts.shp\", package=\"spData\")[1])\r"        
[22] "spdep/man/localmoran_bv.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\"))"                 
[23] "spdep/man/localmoran.exact.Rd:eire <- st_read(system.file(\"shapes/eire.shp\", package=\"spData\")[1])"                   
[24] "spdep/man/localmoran.sad.Rd:eire <- st_read(system.file(\"shapes/eire.shp\", package=\"spData\")[1])"                     
[25] "spdep/man/mat2listw.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"      
[26] "spdep/man/moran.test.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
[27] "spdep/man/nb2lines.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"       
[28] "spdep/man/nb2listw.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"       
[29] "spdep/man/nb2mat.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
[30] "spdep/man/nbdists.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"        
[31] "spdep/man/nblag.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"          
[32] "spdep/man/nboperations.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"   
[33] "spdep/man/plot.nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"        
[34] "spdep/man/poly2nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"        
[35] "spdep/man/poly2nb.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"             
[36] "spdep/man/probmap.Rd:auckland <- st_read(system.file(\"shapes/auckland.shp\", package=\"spData\")[1], quiet=TRUE)"        
[37] "spdep/man/SD.RStests.Rd:columbus <- sf::st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1])"             
[38] "spdep/man/sp.correlogram.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"      
[39] "spdep/man/subset.listw.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"   
[40] "spdep/man/subset.nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"      
[41] "spdep/man/summary.nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
[42] "spdep/man/testnb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
[43] "spdep/man/tri2nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
[44] "spdep/vignettes/CO69.Rmd:eire <- as(sf::st_read(system.file(\"shapes/eire.shp\", package=\"spData\")[1]), \"Spatial\")"   
[45] "spdep/vignettes/nb.Rmd:NY8 <- as(sf::st_read(system.file(\"shapes/NY8_utm18.shp\", package=\"spData\")), \"Spatial\")"    
[46] "spdep/vignettes/nb_sf.Rmd:NY8_sf_old <- st_read(system.file(\"shapes/NY8_utm18.shp\", package=\"spData\"), quiet=TRUE)"   
[47] "spdep/vignettes/sids.Rmd:nc <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"              
attr(,"status")
[1] 2

[[22]]
[1] "spgwr/man/ggwr.Rd:xx <- as(st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1]), \"Spatial\")"         
[2] "spgwr/man/ggwr.sel.Rd:xx <- as(st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1]), \"Spatial\")"     
[3] "spgwr/vignettes/GWR.Rmd:NY8 <- as(st_read(system.file(\"shapes/NY8_utm18.shp\", package=\"spData\")), \"Spatial\")"
attr(,"status")
[1] 2

[[23]]
[1] "sphet/man/impacts.error_sphet.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"
[2] "sphet/man/impacts.ols_sphet.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"  
[3] "sphet/man/impacts.stsls_sphet.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"
[4] "sphet/R/impacts.R:#' columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"            
[5] "sphet/R/impacts.R:#' columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"            
[6] "sphet/R/impacts.R:#' columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"            
attr(,"status")
[1] 2

[[24]]
character(0)
attr(,"status")
[1] 2

[[25]]
character(0)
attr(,"status")
[1] 1

[[26]]
character(0)
attr(,"status")
[1] 2

[[27]]
character(0)
attr(,"status")
[1] 2

so:

bispdep: shapes/columbus.shp
epiR: shapes/sids.shp
GWnnegPCA: shapes/boston_tracts.shp
R2BayesX: shapes/columbus.shp
rflexscan: shapes/sids.shp
spatialreg: shapes/wheat.shp, shapes/columbus.shp, shapes/sids.shp, shapes/NY8_utm18.shp
spdep: shapes/columbus.shp, shapes/auckland.shp, shapes/sids.shp, shapes/eire.shp, shapes/boston_tracts.shp, shapes/NY8_utm18.shp
spgwr: shapes/sids.shp, shapes/NY8_utm18.shp
sphet: shapes/columbus.shp

I'll add issues here as I raise them, or dates of emails sent. None appear to need the shapefile representation.
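
For an affected maintainer, the change is mostly a path substitution. A minimal sketch with `sed`, assuming each `shapes/<name>.shp` has a same-named `shapes/<name>.gpkg` counterpart (the thread says this does not hold for baltim); the helper name `shp_to_gpkg_path` is hypothetical.

```shell
# Sketch: rewrite spData shapefile paths to same-named GeoPackage paths.
# Assumes shapes/<name>.gpkg exists for every shapes/<name>.shp it touches.

shp_to_gpkg_path() {
  # Reads text on stdin and writes the rewritten text to stdout.
  # Only quoted paths under shapes/ are changed.
  sed 's#shapes/\([A-Za-z0-9_]*\)\.shp"#shapes/\1.gpkg"#g'
}

printf '%s\n' \
  'columbus <- st_read(system.file("shapes/columbus.shp", package="spData")[1], quiet=TRUE)' \
  | shp_to_gpkg_path
```

Anchoring the pattern on `shapes/` and the closing quote leaves object names such as `sids.shp` untouched. For roxygen-generated help pages the substitution would need to be applied to the R sources rather than the `.Rd` files.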

@edzer

edzer commented Jun 6, 2024

> I've noticed that shapefiles are used quite a lot in docs of sf

The original reason for that was that at the time of first submission, certain CRAN platforms did not have sqlite linked to gdal, and could not read gpkg. It could well be that this no longer is the case.

@rsbivand
Contributor

rsbivand commented Jun 6, 2024

@rsbivand
Contributor

rsbivand commented Aug 2, 2024

@Nowosad epiR has updated, R2BayesX is being submitted, the remainder have been reminded that they need to update by the end of August.

@Nowosad
Owner Author

Nowosad commented Sep 1, 2024

@rsbivand as it is September already -- should I submit the current version of spData to CRAN now?

@rsbivand
Contributor

rsbivand commented Sep 1, 2024

I'll check tomorrow. Are we going to drop all the shapefiles now?

@Nowosad
Owner Author

Nowosad commented Sep 1, 2024

I think the plan is to submit the current version to CRAN as it is, and drop all the shapefiles afterwards.

@rsbivand
Contributor

rsbivand commented Sep 1, 2024

Ok, submit as main is now, then I'll make a PR to drop the shapefiles.

@rsbivand
Contributor

rsbivand commented Sep 2, 2024

@Nowosad Please wait, I need to check the validity of shapes/NY8_bna_utm18.gpkg and shapes/NY8_utm18.gpkg.

@Nowosad
Owner Author

Nowosad commented Sep 2, 2024

One additional issue, @rsbivand:

Found the following (possibly) invalid URLs:
    URL: http://www.spatial-econometrics.com/html/jplv7.zip
      From: man/elect80.Rd
            man/house.Rd
      Status: Error
      Message: Failed to connect to www.spatial-econometrics.com port 80 after 21145 ms: Couldn't connect to server

@rsbivand
Contributor

rsbivand commented Sep 2, 2024

Will update this shortly - Jim LeSage has retired and the DNS subscription has, I think, lapsed. I'm dropping the URL markup - if anyone needs the original, it may be found in the Wayback Machine: https://web.archive.org/web/20160416132456fw_/http://www.spatial-econometrics.com/html/jplv7.zip

rsbivand added a commit to rsbivand/spData that referenced this issue Sep 2, 2024
Nowosad added a commit that referenced this issue Sep 2, 2024
fixes for 2.3.2 submission for #62
@Nowosad
Owner Author

Nowosad commented Sep 2, 2024

thanks, package spData_2.3.3.tar.gz is on its way to CRAN.

@mdsumner

I saw mention of reading from .zip with GDAL 3.7.0, so here's a quick note on support that's older.

It works in GDAL 3.4.3 when using the actual virtual file system prefixes:

ogrinfo /vsizip/GB_election_2024_sim.gpkg.zip
INFO: Open of `/vsizip/GB_election_2024_sim.gpkg.zip'
      using driver `GPKG' successful.
1: GB_election_2024_sim
GDAL 3.4.3, released 2022/04/22

no need even to download

 ogrinfo /vsizip//vsicurl/https://github.com/user-attachments/files/16865728/GB_election_2024_sim.gpkg.zip

INFO: Open of `/vsizip//vsicurl/https://github.com/user-attachments/files/16865728/GB_election_2024_sim.gpkg.zip'
      using driver `GPKG' successful.
1: GB_election_2024_sim

Rscript -e 'sf::read_sf("/vsizip//vsicurl/https://github.com/user-attachments/files/16865728/GB_election_2024_sim.gpkg.zip", 
   query = "SELECT * FROM GB_election_2024_sim LIMIT 1")'
Simple feature collection with 1 feature and 19 fields
Geometry type: POLYGON
...
