{stars} functions for raster data cube and post object interaction, including time matching #1

Open
loreabad6 opened this issue Sep 18, 2024 · 0 comments

loreabad6 commented Sep 18, 2024

Better interaction between post_array objects and stars raster data cubes would make it much easier to enrich post objects with raster information.
Currently, most of the functions work when given an sf object, which is easy to supply by passing a post_table in long form. Passing a post_array would be the most intuitive way, though, especially if we can work with its dimensions attribute.

Also, enabling time matching for all these functions would be a great way to perform the operations in a "spatio-temporal"-aware manner.

Dummy raster data cube (rdc)

remotes::install_github("loreabad6/stars@time-polygon")
#> Skipping install of 'stars' from a github remote, the SHA1 (f70aca5e) has not changed since last install.
#>   Use `force = TRUE` to force installation
library(stars)
library(sf)
library(post)
library(cubble)
library(tidyverse)

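# 20 x 20 grid of cell centres
x = seq(-0.5, 1, length.out = 20)
y = seq(0, 1.5, length.out = 20)
# five random layers on that grid, one per time step
s1 = st_as_stars(data.frame(expand.grid(x = x, y = y), z = rnorm(400)))
s2 = st_as_stars(data.frame(expand.grid(x = x, y = y), z = rnorm(400)))
s3 = st_as_stars(data.frame(expand.grid(x = x, y = y), z = rnorm(400)))
s4 = st_as_stars(data.frame(expand.grid(x = x, y = y), z = rnorm(400)))
s5 = st_as_stars(data.frame(expand.grid(x = x, y = y), z = rnorm(400)))
# stack the layers along a "datetime" dimension that reuses the dates in `polygons`
# (`polygons` is not created here; presumably the example data attached with post)
rdc = c(s1, s2, s3, s4, s5, along = "datetime") |> 
  st_set_dimensions(
    "datetime",
    values = unique(polygons$datetime)
  ) |> 
  st_set_crs(4326)

# the same polygons as a post_array and as the temporal face of a post_table
arr = polygons |> as_post_array()
tab = polygons |> as_post_table() |> face_temporal()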

1. stars::aggregate.stars()

  • We can pass the polygons from the temporal face of a post_table to aggregate.stars().
  • How the geometries are then organised within the resulting array is unclear and should be looked into.
  • This should also work for post_array geometries.
  • Including time matching already at the aggregation level might be beneficial, to avoid aggregating over every geometry at every time step as st_extract() currently does.
(agg = aggregate(rdc, tab, mean, na.rm = TRUE))
#> stars object with 2 dimensions and 1 attribute
#> attribute(s):
#>         Min.    1st Qu.     Median       Mean   3rd Qu.     Max. NA's
#> z  -1.547645 -0.2601401 0.09971962 0.08011473 0.3912051 1.321302   35
#> dimension(s):
#>          from to     offset  delta refsys point
#> geometry    1 25         NA     NA WGS 84 FALSE
#> datetime    1  5 2020-10-01 1 days   Date    NA
#>                                                                 values
#> geometry POLYGON ((0.5474949 0.808...,...,POLYGON ((0.4750757 0.262...
#> datetime                                                          NULL
st_as_sf(agg)
#> Simple feature collection with 25 features and 5 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -0.2974337 ymin: -0.00297557 xmax: 0.9730806 ymax: 1.153558
#> Geodetic CRS:  WGS 84
#> First 10 features:
#>     2020-10-01   2020-10-02  2020-10-03  2020-10-04 2020-10-05
#> 1  -0.71838423  0.692081576  0.08165758 -0.70646498 -0.2634729
#> 2  -0.30162972 -0.002314025  0.12940001  0.18564799  0.2301074
#> 3   1.32130231  1.016536556  0.26029296  0.87806310  0.4007806
#> 4           NA           NA          NA          NA         NA
#> 5  -1.10219328 -0.250141796  0.24053519  0.40932174  0.2253243
#> 6  -0.07874849  0.153725017 -0.43001741  0.08132226  0.2510174
#> 7   0.14342283 -0.437820714  0.06228408  0.09273263  0.1770414
#> 8           NA           NA          NA          NA         NA
#> 9           NA           NA          NA          NA         NA
#> 10  0.83302258  0.144860774 -0.02748157  0.19337529  0.6889722
#>                          geometry
#> 1  POLYGON ((0.5474949 0.80889...
#> 2  POLYGON ((0.4961102 0.87283...
#> 3  POLYGON ((0.5578801 0.86163...
#> 4  POLYGON ((0.5652241 0.87205...
#> 5  POLYGON ((0.6063791 0.83041...
#> 6  POLYGON ((0.2791708 0.83373...
#> 7  POLYGON ((0.3298312 0.76120...
#> 8  POLYGON ((0.3796448 0.76785...
#> 9  POLYGON ((0.3642467 0.77972...
#> 10 POLYGON ((0.2665368 0.86499...
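
One way to picture the time matching suggested in the last bullet above: after aggregation the values form a geometry by datetime matrix, and only the cell in each geometry's own time column is wanted. The following is a minimal base R sketch, not the stars implementation. It assumes that the rows of the aggregated matrix follow the order of the input geometries, which is exactly what the second bullet flags as unclear. `vals`, `geom_time` and `time_match()` are hypothetical names used only for illustration.

```r
# Hypothetical stand-in for agg$z: 25 geometries (rows) x 5 dates (columns)
times <- as.Date("2020-10-01") + 0:4
set.seed(42)
vals <- matrix(rnorm(25 * 5), nrow = 25, ncol = 5)

# Hypothetical stand-in for tab$datetime: the date each geometry belongs to
geom_time <- rep(times, times = 5)

# Keep, for every geometry, only the value in its own time column
time_match <- function(vals, geom_time, times) {
  j <- match(geom_time, times)            # column index for each geometry
  vals[cbind(seq_along(geom_time), j)]    # one (row, column) pair per geometry
}

z <- time_match(vals, geom_time, times)
length(z)
```

In this toy setup only 25 of the 125 aggregated cells are kept, which is why matching on time before aggregating could save work.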

2. stars::st_extract()

  • An implementation of time matching in st_extract() for polygons has been added to loreabad6/stars@time-polygon.
  • Currently it performs an internal aggregation for every geometry passed and then removes the redundant information.
  • The returned object is an sf. It would be good if passing a post_array to the at argument returned a post_array with the extra attribute.
(ext = st_extract(
  rdc,
  at = tab,
  time_column = "datetime",
  FUN = mean,
  na.rm = TRUE
))
#> Simple feature collection with 25 features and 2 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -0.2974337 ymin: -0.00297557 xmax: 0.9730806 ymax: 1.153558
#> Geodetic CRS:  WGS 84
#> First 10 features:
#>              z   datetime                       geometry
#> 1  -0.71838423 2020-10-01 POLYGON ((0.5474949 0.80889...
#> 2   0.09999119 2020-10-02 POLYGON ((0.4961102 0.87283...
#> 3   0.35655307 2020-10-03 POLYGON ((0.5578801 0.86163...
#> 4   0.04403025 2020-10-04 POLYGON ((0.5652241 0.87205...
#> 5  -0.17282679 2020-10-05 POLYGON ((0.6063791 0.83041...
#> 6  -0.07874849 2020-10-01 POLYGON ((0.2791708 0.83373...
#> 7  -0.03736406 2020-10-02 POLYGON ((0.3298312 0.76120...
#> 8  -0.21414059 2020-10-03 POLYGON ((0.3796448 0.76785...
#> 9   0.22340555 2020-10-04 POLYGON ((0.3642467 0.77972...
#> 10  0.19357414 2020-10-05 POLYGON ((0.2665368 0.86499...

arr |> mutate(z = ext$z)
#> stars object with 2 dimensions and 2 attributes
#> attribute(s):
#>          geometry         z           
#>  POLYGON      :25   Min.   :-0.71838  
#>  epsg:4326    : 0   1st Qu.:-0.21414  
#>  +proj=long...: 0   Median : 0.09999  
#>                     Mean   : 0.08041  
#>                     3rd Qu.: 0.35655  
#>                     Max.   : 0.73558  
#> dimension(s):
#>          from to     offset  delta refsys point
#> geom_sum    1  5         NA     NA WGS 84  TRUE
#> datetime    1  5 2020-10-01 1 days   Date FALSE
#>                                                            values
#> geom_sum POINT (0.647816 0.9018588),...,POINT (0.4690683 0.17772)
#> datetime                                                     NULL
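
Assigning the extracted column straight into the array relies on the rows of the sf result and the cells of the array being in the same order, which has not been verified here. A hedged base R sketch of an order-independent alternative is to place each value by its explicit (group, time) index. `long`, `gid` and `to_array()` are hypothetical names and do not come from post or stars.

```r
# Hypothetical long table: one row per (group, date), dates varying fastest
times  <- as.Date("2020-10-01") + 0:1
groups <- c("a", "b", "c")
long <- data.frame(
  gid      = rep(groups, each = 2),
  datetime = rep(times, times = 3),
  z        = c(1, 2, 3, 4, 5, 6)
)

# Fill a group x time matrix by explicit index rather than by position
to_array <- function(long, groups, times) {
  out <- matrix(NA_real_, nrow = length(groups), ncol = length(times),
                dimnames = list(groups, as.character(times)))
  idx <- cbind(match(long$gid, groups), match(long$datetime, times))
  out[idx] <- long$z
  out
}

m <- to_array(long, groups, times)

# Positional fill (column-major) puts some values in the wrong cells here
naive <- matrix(long$z, nrow = length(groups))
```

Comparing `m` with `naive` for a small case like this is a quick way to check whether a positional assignment is safe for a given row order.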
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14)
#>  os       Ubuntu 22.04.4 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Etc/UTC
#>  date     2024-09-18
#>  pandoc   3.2 @ /usr/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  abind       * 1.4-8      2024-09-12 [1] RSPM (R 4.4.0)
#>  anytime       0.3.9      2020-08-27 [1] RSPM (R 4.4.0)
#>  class         7.3-22     2023-05-03 [2] CRAN (R 4.4.1)
#>  classInt      0.4-10     2023-09-05 [1] RSPM (R 4.4.0)
#>  cli           3.6.3      2024-06-21 [1] RSPM (R 4.4.0)
#>  colorspace    2.1-1      2024-07-26 [1] RSPM (R 4.4.0)
#>  cubble      * 1.0.0      2024-09-16 [1] Github (huizezhang-sherry/cubble@9563dfe)
#>  DBI           1.2.3      2024-06-02 [1] RSPM (R 4.4.0)
#>  digest        0.6.37     2024-08-19 [1] RSPM (R 4.4.0)
#>  dplyr       * 1.1.4      2023-11-17 [1] RSPM (R 4.4.0)
#>  e1071         1.7-16     2024-09-16 [1] RSPM (R 4.4.0)
#>  ellipsis      0.3.2      2021-04-29 [1] RSPM (R 4.4.0)
#>  evaluate      0.24.0     2024-06-10 [1] RSPM (R 4.4.0)
#>  fansi         1.0.6      2023-12-08 [1] RSPM (R 4.4.0)
#>  fastmap       1.2.0      2024-05-15 [1] RSPM (R 4.4.0)
#>  forcats     * 1.0.0      2023-01-29 [1] RSPM (R 4.4.0)
#>  fs            1.6.4      2024-04-25 [1] RSPM (R 4.4.0)
#>  generics      0.1.3      2022-07-05 [1] RSPM (R 4.4.0)
#>  ggplot2     * 3.5.1      2024-04-23 [1] RSPM (R 4.4.0)
#>  glue          1.7.0      2024-01-09 [1] RSPM (R 4.4.0)
#>  gtable        0.3.5      2024-04-22 [1] RSPM (R 4.4.0)
#>  hms           1.1.3      2023-03-21 [1] RSPM (R 4.4.0)
#>  htmltools     0.5.8.1    2024-04-04 [1] RSPM (R 4.4.0)
#>  KernSmooth    2.23-24    2024-05-17 [2] CRAN (R 4.4.1)
#>  knitr         1.48       2024-07-07 [1] RSPM (R 4.4.0)
#>  lifecycle     1.0.4      2023-11-07 [1] RSPM (R 4.4.0)
#>  lubridate   * 1.9.3      2023-09-27 [1] RSPM (R 4.4.0)
#>  magrittr      2.0.3      2022-03-30 [1] RSPM (R 4.4.0)
#>  munsell       0.5.1      2024-04-01 [1] RSPM (R 4.4.0)
#>  ncdf4         1.23       2024-08-17 [1] RSPM (R 4.4.0)
#>  pillar        1.9.0      2023-03-22 [1] RSPM (R 4.4.0)
#>  pkgconfig     2.0.3      2019-09-22 [1] RSPM (R 4.4.0)
#>  post        * 0.0.0.9000 2024-09-16 [1] local
#>  proxy         0.4-27     2022-06-09 [1] RSPM (R 4.4.0)
#>  purrr       * 1.0.2      2023-08-10 [1] RSPM (R 4.4.0)
#>  R6            2.5.1      2021-08-19 [1] RSPM (R 4.4.0)
#>  Rcpp          1.0.13     2024-07-17 [1] RSPM (R 4.4.0)
#>  readr       * 2.1.5      2024-01-10 [1] RSPM (R 4.4.0)
#>  reprex        2.1.0      2024-01-11 [1] RSPM (R 4.4.0)
#>  rlang         1.1.4      2024-06-04 [1] RSPM (R 4.4.0)
#>  rmarkdown     2.28       2024-08-17 [1] RSPM (R 4.4.0)
#>  rstudioapi    0.16.0     2024-03-24 [1] RSPM (R 4.4.0)
#>  scales        1.3.0      2023-11-28 [1] RSPM (R 4.4.0)
#>  sessioninfo   1.2.2      2021-12-06 [1] RSPM (R 4.4.0)
#>  sf          * 1.0-18     2024-09-18 [1] Github (r-spatial/sf@35f5f8b)
#>  stars       * 0.6-7      2024-09-18 [1] Github (loreabad6/stars@f70aca5)
#>  stringi       1.8.4      2024-05-06 [1] RSPM (R 4.4.0)
#>  stringr     * 1.5.1      2023-11-14 [1] RSPM (R 4.4.0)
#>  tibble      * 3.2.1      2023-03-20 [1] RSPM (R 4.4.0)
#>  tidyr       * 1.3.1      2024-01-24 [1] RSPM (R 4.4.0)
#>  tidyselect    1.2.1      2024-03-11 [1] RSPM (R 4.4.0)
#>  tidyverse   * 2.0.0      2023-02-22 [1] RSPM (R 4.4.0)
#>  timechange    0.3.0      2024-01-18 [1] RSPM (R 4.4.0)
#>  tsibble       1.1.5      2024-06-27 [1] RSPM (R 4.4.0)
#>  tzdb          0.4.0      2023-05-12 [1] RSPM (R 4.4.0)
#>  units         0.8-5      2023-11-28 [1] RSPM (R 4.4.0)
#>  utf8          1.2.4      2023-10-22 [1] RSPM (R 4.4.0)
#>  vctrs         0.6.5      2023-12-01 [1] RSPM (R 4.4.0)
#>  withr         3.0.1      2024-07-31 [1] RSPM (R 4.4.0)
#>  xfun          0.47       2024-08-17 [1] RSPM (R 4.4.0)
#>  yaml          2.3.10     2024-07-26 [1] RSPM (R 4.4.0)
#> 
#>  [1] /usr/local/lib/R/site-library
#>  [2] /usr/local/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
@loreabad6 loreabad6 changed the title Let aggregate.stars() accept post_arrays and include time matching {stars} functions for raster data cube and post object interaction, including time matching Sep 18, 2024