Specify target collection metadata in UDP (& batch jobs) #514

Open
jdries opened this issue Oct 3, 2023 · 2 comments

jdries commented Oct 3, 2023

Two new use cases came up, with a similar solution:

  1. Our UDP users want to treat a UDP basically as a 'virtual collection'. To allow this, they would like to know the STAC (collection) metadata of the data cube that is generated when the UDP is invoked. Of course, there can be some unknowns, depending on the parameters in the UDP. Some UDPs are fairly constrained, while others can output any raster cube. This use case is relevant for the constrained kind, where a UDP wants to communicate constraints on its output. Some examples:
  • produces only data over Europe
  • output from 2017 onwards
  • output resolution is 300m
  • output has 4 bands, with detailed band metadata

Note that this collection metadata also acts as a definition of constraints: if the output AOI is Europe, then the UDP will probably not accept an input AOI in North America. So UDP tools can use this for input field validation, which is very useful for generic wizards like the one in the openEO editor.
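To make the validation idea concrete, here is a minimal sketch in Python. It assumes the target metadata follows the usual STAC collection layout (`extent.spatial.bbox`, `extent.temporal.interval`, `summaries`); the bounding box, band names, and the helper `validate_input` are all hypothetical and only illustrate how a wizard could check user input against the declared extent.

```python
from datetime import datetime, timezone

# Hypothetical STAC-style collection fragment for a constrained UDP.
# All values are illustrative and not taken from a real UDP.
TARGET_COLLECTION = {
    "extent": {
        # Rough box around Europe: [west, south, east, north]
        "spatial": {"bbox": [[-25.0, 34.0, 45.0, 72.0]]},
        # Output from 2017 onwards, open-ended
        "temporal": {"interval": [["2017-01-01T00:00:00Z", None]]},
    },
    "summaries": {
        "gsd": [300],  # output resolution in metres
        "eo:bands": [{"name": f"band_{i}"} for i in range(1, 5)],  # 4 placeholder bands
    },
}


def bboxes_intersect(a, b):
    """True if two [west, south, east, north] boxes overlap (no antimeridian handling)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def parse_time(value):
    """Parse a UTC timestamp of the form YYYY-MM-DDTHH:MM:SSZ into an aware datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def validate_input(collection, bbox, start):
    """Return a list of problems; an empty list means the input looks acceptable."""
    problems = []
    target_bbox = collection["extent"]["spatial"]["bbox"][0]
    if not bboxes_intersect(bbox, target_bbox):
        problems.append("area of interest lies outside the declared spatial extent")
    lower, upper = collection["extent"]["temporal"]["interval"][0]
    t = parse_time(start)
    if lower is not None and t < parse_time(lower):
        problems.append("start date is before the declared temporal extent")
    if upper is not None and t > parse_time(upper):
        problems.append("start date is after the declared temporal extent")
    return problems


# An AOI in North America is rejected, one in Belgium is accepted.
print(validate_input(TARGET_COLLECTION, [-100.0, 30.0, -80.0, 45.0], "2020-06-01T00:00:00Z"))
print(validate_input(TARGET_COLLECTION, [2.5, 49.5, 6.4, 51.5], "2018-01-01T00:00:00Z"))
```

A real tool would probably also want to check the end of the requested time range and the requested bands; the sketch only covers the two constraints mentioned above.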

  2. The second case is perhaps easier to understand: batch jobs try to fill in as much STAC metadata as possible when generating output, but cannot know everything. For instance, a job that generated categorical data cannot really know which colors would be suited for visualization. As a user, I would like to submit a kind of metadata template in STAC format, so that I can immediately generate output with more complete STAC metadata.
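One way to picture the template idea is a recursive merge of the user's template over whatever the job derived by itself. The sketch below is only an assumption about how a back-end might combine the two; `merge_metadata` is a hypothetical helper, the values are made up, and the `classification:classes` / `color_hint` fields are borrowed from the STAC classification extension as a plausible place for class colors.

```python
import copy


def merge_metadata(generated, template):
    """Recursively overlay `template` on `generated`; template values win on conflicts."""
    result = copy.deepcopy(generated)
    for key, value in template.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are objects: merge them key by key.
            result[key] = merge_metadata(result[key], value)
        else:
            # Scalars and lists from the template replace the generated value.
            result[key] = copy.deepcopy(value)
    return result


# Metadata a back-end could plausibly derive on its own (illustrative values).
generated = {
    "type": "Collection",
    "extent": {"spatial": {"bbox": [[2.5, 49.5, 6.4, 51.5]]}},
    "summaries": {"gsd": [10]},
}

# User-supplied template: things the job cannot know, such as class colors.
template = {
    "title": "Land cover classification",
    "summaries": {
        "classification:classes": [
            {"value": 1, "name": "water", "color_hint": "0000FF"},
            {"value": 2, "name": "forest", "color_hint": "006400"},
        ]
    },
}

merged = merge_metadata(generated, template)
print(sorted(merged["summaries"]))
```

Whether the template or the generated values should take precedence is a design choice that this issue leaves open; the sketch lets the template win, which seems the more natural reading of "template" but is not the only option.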

My proposed solution is to simply add a property with the target STAC collection metadata to the UDP and batch job schema:

https://api.openeo.org/#tag/User-Defined-Processes/operation/store-custom-process
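As a rough illustration, a UDP definition carrying such a property might look like the sketch below. The property name `target_collection` is purely hypothetical (no name has been decided here), as are the UDP id, the collection id, and the process graph; the rest follows the general shape of a user-defined process body as I understand it from the openEO API.

```python
import json

# Hypothetical property name; the proposal does not fix one.
TARGET_METADATA_KEY = "target_collection"

udp = {
    "id": "europe_300m_composite",  # illustrative UDP id
    "summary": "Example UDP with declared output metadata",
    "parameters": [
        {
            "name": "bbox",
            "description": "Area of interest.",
            "schema": {"type": "object", "subtype": "bounding-box"},
        }
    ],
    "process_graph": {
        "load": {
            "process_id": "load_collection",
            "arguments": {
                "id": "EXAMPLE_COLLECTION",  # placeholder collection id
                "spatial_extent": {"from_parameter": "bbox"},
                "temporal_extent": ["2017-01-01", None],
            },
            "result": True,
        }
    },
    # Proposed addition: STAC collection metadata describing the output cube.
    TARGET_METADATA_KEY: {
        "extent": {
            "spatial": {"bbox": [[-25.0, 34.0, 45.0, 72.0]]},
            "temporal": {"interval": [["2017-01-01T00:00:00Z", None]]},
        },
        "summaries": {"gsd": [300]},
    },
}

# Serialize the body that a client would store as the user-defined process.
body = json.dumps(udp, indent=2)
print(body)
```

The same property could be accepted in the batch job creation body, so that both use cases share one schema fragment.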

I'll probably experiment with this myself, but also wanted to share the idea. These cases are triggered by user projects.

@jdries jdries self-assigned this Jan 16, 2024

jdries commented Jun 6, 2024

Update for myself: example of metadata that users are asking for.


{
  "geometry" : "τ1{tend=946684800000,tstart=0,ttype=logical}S2(43199,21599){bbox=[-180.0 180.0 -90.0 90.0],proj=EPSG:4326}",

  "metadata" : {
    "im:keywords" : "global, climate, weather, Average temperature",
    "dc:comment" : "This is WorldClim version 2.1 climate data for 1970-2000. This version was released in January 2020.\r\nThere are monthly climate data for average temperature (°C).\r\nThe data is available at 30 seconds (~1 km2).\r\nFor \"time\", the month scope is inside the semantics data annotation",
    "im:notes" : "",
    "dc:title" : "WorldClim Historical climate data version 2.1 data 30s for 1970-2000 average temperature January",
    "dc:url" : "https://worldclim.org/data/worldclim21.html",
    "dc:creator" : "",
    "im:thematic-area" : "Earth",
    "dc:originator" : "Worldclim",
    "im:geographic-area" : "Global",
    "dc:source" : "Fick, S.E. and R.J. Hijmans, 2017. Worldclim 2: New 1-km spatial resolution climate surfaces for global land areas. International Journal of Climatology 37(12):4302-4315."
  }
}


m-mohr commented Sep 20, 2024

With regard to 2: The GEE driver has a parameter for save_result which can contain STAC metadata that is added to the output. Isn't stac_modify also suitable here?
