Point extractions sentinel 2 #58

VincentVerelst · 2024-06-12T13:32:01Z

Some comments from @kvantricht to be followed up here:

"date" has dtype object (typical for strings in dataframes); isn't it possible to save it directly in datetime format? Especially if we want to filter temporally afterwards, we'd need it like that.
What's the use of the feature_index column?
The bands are currently in float64, while the scaling in gfmap was designed so they can be saved as UINT16; if we don't have the option to tell openEO to save them like that, we should do the conversion in a post-job action to avoid unnecessary storage. In that case we have to make sure the nodata value of 65535 was used correctly.
"geometry" is in WKB format (I think). Can we easily interact with it like this? E.g. I cannot load this parquet with geopandas, as it does not understand the geometry. Actually it should understand WKB for geoparquet, but only with the correct metadata, I think.
Out of curiosity, I'm wondering why samples with extract==False are also extracted. I guess for this exercise you're extracting everything? I don't think it's a necessary column in the output, but ok.
Where do irrigation_label, croptype_label and landcover_label come from now? Directly from the input file? This wasn't rasterized and exported as a STAC-compatible collection, so I'm wondering how we get this info in here. Unfortunately still with the old mappings for crop type (cc Christina Butsko). The irrigation label is not needed in our parquet files anyway.
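
To make the dtype points above concrete, here is a minimal sketch of such a conversion using pandas and NumPy. It is not the code from this PR: the function name `compact_dtypes`, the band column name and the sample values are assumptions for illustration; only the "date" column and the nodata value of 65535 come from the comments above.

```python
import numpy as np
import pandas as pd

NODATA_UINT16 = 65535  # nodata value mentioned in the comment above


def compact_dtypes(df: pd.DataFrame, band_columns: list[str]) -> pd.DataFrame:
    """Return a copy with 'date' as datetime and the given bands as uint16."""
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"])
    for col in band_columns:
        values = out[col].to_numpy(dtype="float64")
        # Missing or out-of-range values are mapped to nodata before casting,
        # so the cast to uint16 can never wrap around or hit NaN.
        invalid = ~np.isfinite(values) | (values < 0) | (values >= NODATA_UINT16)
        values = np.where(invalid, NODATA_UINT16, np.round(values))
        out[col] = values.astype(np.uint16)
    return out


# Hypothetical example data; the band name is an assumption.
df = pd.DataFrame(
    {
        "date": ["2018-01-01", "2018-01-11"],
        "S2-L2A-B04": [1234.0, np.nan],
    }
)
converted = compact_dtypes(df, ["S2-L2A-B04"])
print(converted.dtypes)
```

Mapping invalid values to nodata before the cast is a deliberate choice in this sketch: casting NaN or negative floats straight to uint16 gives unspecified results.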

VincentVerelst · 2024-06-13T13:52:12Z

Updated the script. A new output file can be found at /data/users/Public/vincent.verelst/extraction_test/35NPF/2018_SSD_WFP-field-survey_POLY_110_1674.parquet

As a general remark: the previous extraction was run on an old reference dataset, just as a test. The current extraction has been run on a new reference dataset.

@kvantricht, to address your comments in the order as listed above:

Changed the dtype of the 'date' column to 'datetime' in the post_job_actions.
The feature_index is assigned automatically by openEO. Each geometry to be extracted gets its own index to keep track of it. In the example above, there were 21 points to extract, so 21 different feature_indices.
The output should indeed be uint16. I've contacted the openEO developers to check why this is not the case. As a temporary fix, I've converted manually to uint16 in the post_job_actions.
Could you try reading the new result with geopandas? It works for me.
Extract flags are given by 0 and 1 (instead of False, True) in the newest format of reference datasets. In the newest example, only features with extract==1 are extracted.
All other columns in the geoparquet file are taken over directly from the original reference dataset. These point extractions are run directly on the original reference datasets, not on the S2 patch extractions.
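
For files where geopandas does not recognise the geometry column (for example a parquet file without GeoParquet metadata), one possible fallback is to read the table with pandas and decode the WKB bytes explicitly. The sketch below is not taken from this PR; the in-memory table, the coordinates and the CRS are assumptions for illustration.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

# Hypothetical stand-in for a table loaded with pandas.read_parquet,
# where the "geometry" column holds raw WKB bytes.
df = pd.DataFrame(
    {
        "feature_index": [0, 1],
        "geometry": [Point(30.5, 4.8).wkb, Point(30.6, 4.9).wkb],
    }
)

# Decode the WKB bytes into geometries. The CRS is an assumption here
# and has to match whatever the extraction actually wrote.
geometry = gpd.GeoSeries.from_wkb(df["geometry"], crs="EPSG:4326")
gdf = gpd.GeoDataFrame(df.drop(columns="geometry"), geometry=geometry)
print(gdf.geometry.geom_type.tolist())
```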

kvantricht · 2024-06-18T09:45:10Z

@VincentVerelst thanks for the changes!

the fact that we're having weird and/or duplicate attributes like [xx, sampleID, sample_id, irrigation_status, IRRIGATION, CROP, ID, ARMYWORM, CT, IMPACT, None, ...] is all due to the input file? If yes, @cbutsko, we need to check this and come up with a list of attributes we want to subset on each time, knowing that these make it into the output parquet files.

reading with geopandas works for me too, thanks.

so looking good for me!
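
The attribute subsetting suggested in this comment could look roughly like the sketch below. The allow-list is purely hypothetical, since the actual list of attributes still has to be agreed on, and `subset_attributes` is an illustrative helper, not a function from the repository.

```python
import pandas as pd

# Hypothetical allow-list; the real list still has to be agreed on.
ATTRIBUTES_TO_KEEP = [
    "sample_id",
    "date",
    "croptype_label",
    "landcover_label",
    "geometry",
]


def subset_attributes(df: pd.DataFrame, keep=ATTRIBUTES_TO_KEEP) -> pd.DataFrame:
    """Keep only allow-listed columns that are present, in allow-list order."""
    present = [col for col in keep if col in df.columns]
    return df[present].copy()


# Hypothetical input with duplicate and dataset-specific attributes.
df = pd.DataFrame(
    {
        "sample_id": ["a"],
        "sampleID": ["a"],
        "CROP": ["maize"],
        "croptype_label": [1200],
    }
)
subset = subset_attributes(df)
print(list(subset.columns))
```

Silently skipping allow-listed columns that are missing keeps the sketch tolerant of differing reference datasets; raising an error instead would be the stricter alternative.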

kvantricht · 2024-06-18T09:46:44Z

scripts/extractions/point_extractions/point_extractions.py

+    backend_context = BackendContext(backend)
+
+    # TODO: Adjust this to the desired bands to download
+    bands_to_download = [


will need to expand to S1 and meteo too. Has to reflect the inputs required by the models.

VincentVerelst · 2024-06-19T09:04:25Z

@kvantricht all the attributes are indeed also present in the input file. The reference data used for this example can be found at /data/users/Public/vincent.verelst/extraction_test/08_2018_SSD_WFP-field-survey_POLY_110.geoparquet.

VincentVerelst added 12 commits May 28, 2024 16:03

added an example script for sentinel2 timeseries point extractions

bbe6dc2

formatting of the point_extractions script

10bf8af

adjusted raw_datacube_s2 to handle point extractions as well

53adf25

removed masked_cube function

733dd7f

added an example script for sentinel2 timeseries point extractions

da9e31c

formatting of the point_extractions script

575c377

adjusted raw_datacube_s2 to handle point extractions as well

bf58f68

removed masked_cube function

92fb911

changed point extraction example to extract EVI as example

0ebf058

changed point extractions script to include a time dimension in the results.

28bc5b7

small adjustments to raw_datacube_s2

a5fa384

adjusted raw_datacube_s2 to also handle point extractions

dcb95e9

VincentVerelst requested a review from GriffinBabe June 12, 2024 13:32

VincentVerelst added 2 commits June 12, 2024 15:33

removed unnecessary imports

f20e64e

small changes to point extractions to handle new format of reference data

8d624b5

VincentVerelst removed the request for review from GriffinBabe June 13, 2024 12:29

added post_job_action to the point extractions to change the dtype of some columns

0cddbf1

kvantricht reviewed Jun 18, 2024

jdegerickx and others added 5 commits June 19, 2024 11:29

Merge branch 'vv_point_extractions' of https://github.com/WorldCereal/worldcereal-classification into vv_point_extractions

33eabe3

changed the preprocessing of the point extractions to match the preprocessing of the inference

286bbc0

changed preprossing to also handle point extractions

aa45245

formatting

a3c1f97

removed unnecessary imports

52d8be0

VincentVerelst marked this pull request as ready for review June 20, 2024 14:15

VincentVerelst marked this pull request as draft June 20, 2024 14:20

kvantricht closed this Jul 9, 2024

kvantricht deleted the vv_point_extractions branch July 9, 2024 13:17
