Reviews - edition 2, round 2, part 2 #911

Closed
17 tasks done
Robinlovelace opened this issue Jan 26, 2023 · 2 comments

Comments

@Robinlovelace
Collaborator

Robinlovelace commented Jan 26, 2023

Hot on the heels of #898

  • Foreword needs obviously to be rewritten - much has changed in the geospatial R world since 2018
    (shockingly much) RL

Clearly it does. We have created a placeholder for another foreword for the 2nd edition. We plan to wait until the manuscript is finished, or at least very close, before tackling this.

  • Page xii: "If you are interested in the wider context and motivations behind this book, read on; these
    are covered in Chapter ??." Read on means continue reading here - not jumping to another chapter. RL

We agree this was not well written. Fixed. The relevant section now reads:

The wider context and motivations underlying this book are covered in Chapter 1.

  • Page 11, line 9: "a raster data" should read 'a raster' or 'a raster dataset'. RL

Agreed. The sentence now reads:

The seed_generation tool takes a raster dataset as its first argument (features); optional arguments include band_width, which specifies the size of the initial polygons.

  • Page 15, bottom paragraph: link2GI packages have not been mentioned before. How does this term
    relate to the three packages (qgisprocess, Rsagacmd, rgrass) discussed here? Ditto wrt GDAL and
    link2GI on page 21

The link2GI package just makes it easy to initiate a GRASS session from within R without the need to fully grasp how GRASS works in the background. However, for the interested reader or GRASS power users we have added the link to the GRASS help pages which show step-by-step how to do so. Please note that we have deleted the appendix showing the same instructions in favor of the GRASS help pages.

In the case of SAGA and GDAL, link2GI searches the system for the corresponding command line utilities and adds the corresponding paths for the current R session to the PATH variable.
This is unnecessary in the case of qgisprocess, since qgisprocess itself checks, when it is attached, that a working QGIS installation is available on the system.
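
The idea of locating a command line utility and extending PATH for the current R session can be sketched in base R as follows. This is only an illustration of the mechanism, not link2GI's actual implementation; the helper names `find_cli()` and `add_to_path()` and the directory `/opt/saga/bin` are hypothetical.

```r
# Look up a command line utility on the current PATH; returns "" if not found
find_cli <- function(cmd) unname(Sys.which(cmd))

# Prepend a directory to PATH for the current R session only
add_to_path <- function(dir) {
  old <- Sys.getenv("PATH")
  parts <- strsplit(old, .Platform$path.sep, fixed = TRUE)[[1]]
  # Only modify PATH if the directory is not already listed
  if (!dir %in% parts) {
    Sys.setenv(PATH = paste(dir, old, sep = .Platform$path.sep))
  }
  invisible(Sys.getenv("PATH"))
}

# Hypothetical install location; adjust to wherever SAGA or GDAL lives
saga_dir <- "/opt/saga/bin"
add_to_path(saga_dir)
find_cli("saga_cmd")  # "" if saga_cmd cannot be found on PATH
```

The change made by `Sys.setenv()` lasts only for the running R session, which matches the behavior described above.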

  • Page 16 (middle): the paragraph on rgrass <-> terra is demanding, especially for folks who don't
    know GRASS' internal data organization, which is notoriously difficult to grasp for GIS novices (the
    authors address this by adding the GRASS setup appendix - but this comes only on page 18. I suggest
    moving it to page 16, possibly as a textbox insert).

Please see previous reply and reply after the next.

  • It is also counter-intuitive to use terra (which is
    intended to replace the raster package) to create points and lines in a GIS that is mostly used for
    raster operations.

You are right that terra is predominantly a raster processing package; however, it also supports vector features, and rgrass expects terra vector objects (as created with terra::vect()) as input.

  • The multiple levels of data casting is mind-boggling. The authors seem to
    acknowledge this by pointing to their blog posts and the coerce vignette but this is exactly why this
    example is not suitable for the given audience and in an introduction to GIS-bridging.

We agree that the section in question is demanding and probably more suitable for experienced (GRASS) GIS users. The reasoning behind this is as follows:

  • a former reviewer asked us to provide a more complex GRASS example, specifically one that cannot be solved using R's "native" spatial capabilities.
  • users specifically wanting to use GRASS are probably already familiar with it, and will therefore struggle less with the example than users who don't need the full power of GRASS. The latter, however, can at least use a subset of its functionality through QGIS.

In any case, we now warn the reader before jumping into the code as follows:

Please note that the code instructions in the following paragraphs might be hard to follow when using GRASS for the first time but by running through the code line-by-line and by examining the intermediate results, the reasoning behind it should become even clearer.

  • Page 18, line 2: GRASS' spatial database is not based on SQLite; GRASS has its own native data
    organization. Instead, the default format for connecting GRASS to an external database using
    db.connect is SQLite. The same erroneous description is repeated in the discussion of the GRASS
    database organization.

Thanks for noting this; the description was indeed misleading. We have updated the corresponding sections after thoroughly reviewing what GRASS is actually doing in the background (see also #412).

  • Page 25, bottom: I am happy to see mention of GeoMesa and Sedona but the last sentence is
    grammatically garbled. RL

Agreed. See 97edb68 for the fix.

  • PAGE 26ff: Section 1.7 is a great addition to the second edition of the book!

  • Page 32, 3rd para: The juxtaposition of ML to Bayesian inference is nonsense - the authors are
    misquoting Krainski et al, who use Bayesian techniques for predictions. The omission of the Bayesian
    approach is the one major limitation of the whole volume Geocomputation with R!

My point here was to emphasize that you cannot do statistical inference with ML, but I see why one can misinterpret the sentence. Thinking about it, the inference stuff does not add much value here but is obviously distracting. Therefore, we have removed it.

Secondly, I agree that the Bayesian approach to modeling is quite interesting. However, it is beyond the scope of the book, and there are already books that present it in much greater detail than this book ever could. Still, we have updated the section on including spatial autocorrelation in models as follows:

Here, when making predictions we neglect spatial autocorrelation since we assume that on average the predictive accuracy remains the same with or without spatial autocorrelation structures.
However, it is possible to include spatial autocorrelation structures into models as well as into predictions.
Though this is beyond the scope of this book, we give the interested reader some pointers on where to look it up:

  1. The predictions of regression kriging combine the predictions of a regression with the kriging of the regression's residuals [@goovaerts_geostatistics_1997; @hengl_practical_2007; @bivand_applied_2013].
  2. One can also add a spatial correlation (dependency) structure to a generalized least squares model [nlme::gls(); @zuur_mixed_2009; @zuur_beginners_2017].
  3. One can also use mixed-effect modeling approaches.
    Basically, a random effect imposes a dependency structure on the response variable which in turn allows for observations of one class to be more similar to each other than to those of another class [@zuur_mixed_2009].
    Classes can be, for example, bee hives, owl nests, vegetation transects or an altitudinal stratification.
    This mixed modeling approach assumes normally and independently distributed random intercepts.
    This can even be extended by using a random intercept that is normal and spatially dependent.
    For this, however, you will have to resort most likely to Bayesian modeling approaches since frequentist software tools are rather limited in this respect especially for more complex models [@blangiardo_spatial_2015; @zuur_beginners_2017].
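
As an aside on item 1, the combination performed by regression kriging can be written out as below. This is a sketch in commonly used notation rather than a formula taken from the book: the $q_k$ are the predictors (with $q_0 = 1$ for the intercept), the $\hat{\beta}_k$ the estimated regression coefficients, $e(s_i) = z(s_i) - \hat{m}(s_i)$ the regression residuals at the $n$ observed locations, and the $\lambda_i$ the kriging weights for the prediction location $s_0$.

```latex
\hat{z}(s_0)
  = \underbrace{\sum_{k=0}^{p} \hat{\beta}_k \, q_k(s_0)}_{\text{regression prediction } \hat{m}(s_0)}
  + \underbrace{\sum_{i=1}^{n} \lambda_i(s_0) \, e(s_i)}_{\text{kriged residual } \hat{e}(s_0)}
```
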

  • As for the statistical learning chapter, I would prefer if the authors used a dedicated random forest model such
    as spatialRF or the grf function in the SpatialML package rather than mlr3.

In the statistical learning chapter we focus on performance estimation. The big advantage of using mlr3 is that one can compare dozens or even hundreds of learners, resampling strategies and tasks using the same interface. If the learner in question does not yet exist, it should be fairly easy to implement it in the mlr3extralearners package. Please also refer to our reply to the comment on Pages 40ff.

  • Page 35, caption for Figure 2.2: It depicts the spatial distribution of susceptibility values. The term
    "spatial prediction" is misleading as the GLM is not a spatial model as in spatial regression. Given the
    importance of spatial and geographically weighted regression (as well as the kriging technique
    mentioned in the following paragraph), the way Jannes is using the term spatial prediction is
    unfortunate. JM

I get the point. However, as far as I know, the term "spatial prediction" is not reserved for modeling techniques that incorporate the spatial structure in one form or another into the model itself. In any case, wherever possible we replaced "spatial prediction" with predictive mapping or spatial distribution.

  • Page 37, last paragraph: The First Law of Geography was coined by Tobler in 1970, who should be
    cited here, not the symposium summary by Miller in 2004. RL

  • Pages 40ff: This chapter relies heavily on the mlr3 metapackage, which in turn requires quite a lot of
    understanding of machine learning methodology and terminology. What is actually implemented in
    this chapter does not warrant the use of such heavy machinery. GLM and cross-validation are
    standard tools in R and for support vector machines, there are a dozen individual packages available
    that require less background knowledge. Finally, if the authors really want to go through the effort of
    explaining concepts like hyperparameters, then I urge them to also introduce Bayesian spatial
    models such as the family of CAR models, Stochastic Partial Differential Equations, or (non-)Gaussian
    Markov Random Fields, much of which is covered by Krainski et al.'s INLA method.

At the beginning of the section on spatial CV with mlr3, we explain why we go to the trouble of learning the mlr3 syntax:

There are dozens of packages for statistical learning, as described for example in the CRAN machine learning task view. Getting acquainted with each of these packages, including how to undertake cross-validation and hyperparameter tuning, can be a time-consuming process. Comparing model results from different packages can be even more laborious. The mlr3 package and ecosystem was developed to address these issues.

Secondly, spatial cross-validation is by no means a standard tool in R packages; only random cross-validation is.
Finally, regarding your suggestion to explain Bayesian spatial models, please refer again to our reply to the comment on Page 32, 3rd para.
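
To illustrate the difference between random and spatial partitioning mentioned above, here is a minimal base R sketch. It is not the mlr3 implementation: the coordinates are simulated (hypothetical, not the book's data), and k-means clustering of the coordinates is used as one possible way to build spatially disjoint folds.

```r
set.seed(42)
n <- 200
k <- 5

# Hypothetical point coordinates, simulated for illustration only
coords <- cbind(x = runif(n), y = runif(n))

# Random partitioning: fold membership ignores location
random_folds <- sample(rep(seq_len(k), length.out = n))

# Spatial partitioning: k-means clusters of the coordinates become the folds
spatial_folds <- kmeans(coords, centers = k)$cluster

# Mean distance from each point to its nearest neighbor in a different fold,
# i.e., to the nearest training point when the point's own fold is the test set
nearest_train_dist <- function(folds) {
  d <- as.matrix(dist(coords))
  mean(sapply(seq_len(n), function(i) min(d[i, folds != folds[i]])))
}

nearest_train_dist(random_folds)
nearest_train_dist(spatial_folds)
```

With spatial folds, test points tend to lie farther from their nearest training point, so the performance estimate is less likely to benefit from spatially autocorrelated neighbors.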

  • Transportation Application is fine. I was using this chapter in my graduate spatial analysis class this
    fall 2022 and it worked without a glitch.

  • Ecology Application is mostly fine as well. I would appreciate it if Jannes could remove the personal
    element (just a style issue).

RL, could you please check whether anything is off with the style? The only personal element I could find is the reference to "one of the most fascinating vegetations we have ever encountered", which I rewrote to "Fog oases are fascinating vegetation formations, locally termed lomas, which develop...".

@Robinlovelace
Collaborator Author

  • Foreword needs obviously to be rewritten - much has changed in the geospatial R world since 2018

Taking a look at this one...

Robinlovelace added a commit that referenced this issue Jan 26, 2023
github-actions bot pushed a commit that referenced this issue Feb 13, 2023
jannes-m added a commit that referenced this issue Feb 15, 2023
github-actions bot pushed a commit that referenced this issue Feb 15, 2023
Robinlovelace added a commit that referenced this issue Feb 24, 2023
@Robinlovelace Robinlovelace added this to the 2nd edition Part 2 milestone Feb 26, 2023
@Robinlovelace
Collaborator Author

This seems fixed to me 🎉
