Temporal extent is null for vectorcube STAC items #852

Closed
VincentVerelst opened this issue Aug 30, 2024 · 7 comments · Fixed by Open-EO/openeo-python-driver#311 or #864

@VincentVerelst

When downloading vectorcube assets like CSV or Parquet from openEO, the temporal interval of the associated STAC items is null. Here is an example (run on the Terrascope backend):
item-example.json

As a result, libraries like pystac don't consider these valid STAC items and will throw errors.

@bossie
Collaborator

bossie commented Sep 6, 2024

From the item spec re: datetime:

null is allowed, but requires start_datetime and end_datetime from common metadata to be set.
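
The rule quoted above can be checked without any STAC library. The sketch below is a minimal illustration, not pystac's actual validation logic; the helper name `has_valid_temporal_properties` and the sample timestamps are made up for this example.

```python
from typing import Any, Dict


def has_valid_temporal_properties(properties: Dict[str, Any]) -> bool:
    """Check the STAC Item rule for `datetime` (hypothetical helper).

    Either `datetime` is set, or it is null and both `start_datetime`
    and `end_datetime` are set.
    """
    if properties.get("datetime") is not None:
        return True
    return (
        properties.get("start_datetime") is not None
        and properties.get("end_datetime") is not None
    )


# Properties as described in this issue: every temporal field is null.
broken = {"datetime": None, "start_datetime": None, "end_datetime": None}

# A null datetime is fine once an interval is given (made-up values).
fixed = {
    "datetime": None,
    "start_datetime": "2018-08-14T00:00:00Z",
    "end_datetime": "2018-08-15T00:00:00Z",
}

print(has_valid_temporal_properties(broken))
print(has_valid_temporal_properties(fixed))
```

Running a check like this on the `properties` object of a downloaded item is a quick way to see whether a stricter library is likely to reject it.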

@bossie
Collaborator

bossie commented Sep 6, 2024

@VincentVerelst just wondering: are you able to open this timeseries.parquet file? I'm getting this error:

shaded.org.apache.avro.SchemaParseException: Illegal character in: S1-SIGMA0-VV

It's probably the two dashes it's complaining about.
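
The dashes are a plausible culprit: the Avro specification requires names to start with `[A-Za-z_]` and to contain only `[A-Za-z0-9_]` after that, so `S1-SIGMA0-VV` does not qualify. A small sketch of that check follows; the helper names are hypothetical, and renaming columns is only one possible way to get a file that Avro-based readers accept, not something decided in this thread.

```python
import re

# Avro name rule: first character [A-Za-z_], then only [A-Za-z0-9_].
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name: str) -> bool:
    """Return True if the whole string is a legal Avro name."""
    return AVRO_NAME.fullmatch(name) is not None


def to_avro_safe(name: str) -> str:
    """Replace each disallowed character with an underscore (hypothetical helper)."""
    safe = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # A leading digit (or an empty string) is still illegal, so prefix it.
    if not re.match(r"[A-Za-z_]", safe):
        safe = "_" + safe
    return safe


column = "S1-SIGMA0-VV"
print(is_valid_avro_name(column))
print(to_avro_safe(column))
```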

@bossie
Collaborator

bossie commented Sep 6, 2024

no temporal bounds e.g. load_stac/temporal_extent anywhere in the process graph,
⬇️
no temporal_extent source constraints in the dry run,
⬇️
start_datetime and end_datetime are both None for these results,
⬇️
datetime is None as well.
💥

This means that whether or not the result is a vector cube is irrelevant.
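
The chain above can be mimicked with a toy model. This is only a sketch under stated assumptions: the function name `temporal_properties` and the dict-based constraints are invented for illustration, and the real driver uses its own data structures.

```python
from typing import Any, Dict, List, Optional


def temporal_properties(source_constraints: List[Dict[str, Any]]) -> Dict[str, Optional[str]]:
    """Toy model: derive Item temporal properties from dry-run source constraints.

    Assumption: each constraint is a plain dict that may hold a
    "temporal_extent" of [start, end].
    """
    for constraint in source_constraints:
        extent = constraint.get("temporal_extent")
        if extent:
            start, end = extent
            return {"start_datetime": start, "end_datetime": end}
    # No temporal bounds anywhere in the process graph: nothing to report.
    return {"start_datetime": None, "end_datetime": None}


# load_stac without temporal_extent: the constraint carries no temporal bounds.
print(temporal_properties([{"process": "load_stac"}]))

# The same source with explicit bounds (made-up dates).
print(temporal_properties([{"process": "load_stac", "temporal_extent": ["2018-08-14", "2018-08-15"]}]))
```

Nothing in this model depends on the output format, which matches the observation that the vector cube is not the cause.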

@VincentVerelst this can be worked around by passing an arbitrarily large temporal_extent to one of those load_stac processes e.g. ["1970-01-01", "2070-01-01"]
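
As a rough sketch of that workaround, the helper below sets a wide `temporal_extent` on the first `load_stac` node of a flat process graph that lacks one. The helper name, the node ids and the `https://example.com/...` URL are placeholders made up for this example.

```python
import copy
from typing import Any, Dict, Sequence

ProcessGraph = Dict[str, Dict[str, Any]]


def with_wide_temporal_extent(
    process_graph: ProcessGraph,
    extent: Sequence[str] = ("1970-01-01", "2070-01-01"),
) -> ProcessGraph:
    """Return a copy where one load_stac node without temporal bounds gets `extent`."""
    patched = copy.deepcopy(process_graph)
    for node in patched.values():
        if node.get("process_id") != "load_stac":
            continue
        arguments = node.setdefault("arguments", {})
        if arguments.get("temporal_extent") is None:
            arguments["temporal_extent"] = list(extent)
            break  # one load_stac with bounds is enough for the workaround
    return patched


# Minimal flat process graph; URL and node ids are placeholders.
graph = {
    "load1": {
        "process_id": "load_stac",
        "arguments": {"url": "https://example.com/stac/collection.json"},
    },
    "save1": {
        "process_id": "save_result",
        "arguments": {"data": {"from_node": "load1"}, "format": "Parquet"},
        "result": True,
    },
}

patched = with_wide_temporal_extent(graph)
print(patched["load1"]["arguments"]["temporal_extent"])
```

With the openEO Python client, the same effect should be achievable by passing `temporal_extent=["1970-01-01", "2070-01-01"]` directly to `load_stac`.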

bossie added a commit to Open-EO/openeo-python-driver that referenced this issue Sep 9, 2024
@VincentVerelst
Author

@VincentVerelst just wondering: are you able to open this timeseries.parquet file? I'm getting this error:

shaded.org.apache.avro.SchemaParseException: Illegal character in: S1-SIGMA0-VV

It's probably the two dashes it's complaining about.

What are you using to open the file? For me it works using GeoPandas (which itself uses PyArrow), but I think it's because it ignores the schema validation.

@bossie
Collaborator

bossie commented Sep 11, 2024

I was using the Java Parquet CLI; works with GeoPandas though 👍

bossie added a commit to Open-EO/openeo-python-driver that referenced this issue Sep 11, 2024
bossie added a commit to Open-EO/openeo-python-driver that referenced this issue Sep 12, 2024
bossie added a commit that referenced this issue Sep 12, 2024
@bossie bossie linked a pull request Sep 12, 2024 that will close this issue
bossie added a commit to Open-EO/openeo-python-driver that referenced this issue Sep 12, 2024
bossie added a commit that referenced this issue Sep 12, 2024
bossie added a commit to Open-EO/openeo-geopyspark-driver-testdata that referenced this issue Sep 13, 2024
bossie added a commit that referenced this issue Sep 13, 2024
bossie added a commit that referenced this issue Sep 13, 2024
bossie added a commit to Open-EO/openeo-geopyspark-driver-testdata that referenced this issue Sep 13, 2024
bossie added a commit that referenced this issue Sep 13, 2024
bossie added a commit that referenced this issue Sep 13, 2024
bossie added a commit to Open-EO/openeo-python-driver that referenced this issue Sep 13, 2024
bossie added a commit to Open-EO/openeo-python-driver that referenced this issue Sep 13, 2024
bossie added a commit that referenced this issue Sep 13, 2024
bossie added a commit that referenced this issue Sep 13, 2024
bossie added a commit that referenced this issue Sep 16, 2024
bossie added a commit that referenced this issue Sep 16, 2024
@bossie
Collaborator

bossie commented Sep 16, 2024

GeoTiff Items already provide a datetime property in their asset metadata regardless of the temporal_extent source constraints in the process graph; in the same way, this fix will populate the start_datetime and end_datetime properties.

bossie added a commit that referenced this issue Sep 16, 2024
* propagate temporal extent from load_stac

#852

* fix tests

#852

* add test

#852

* quick fix for new test crashing the JVM when running all tests

(native C++ stack frame from the JVM crash, truncated; it ends in ...::find(unsigned long const&)+0x11)

#852

* reference fixed test data

#852

* debug KeyError: start_datetime

#852

* reference openeo_driver with fix

#852

* simplify filter_temporal

#852

* DRY: move GDALWarp.deinit() elsewhere

#852

* merge_cubes: merge temporal extents

#852
#861
bossie added a commit that referenced this issue Sep 16, 2024
bossie added a commit that referenced this issue Sep 17, 2024
Traceback (most recent call last):
  File "batch_job.py", line 530, in start_main
    main(sys.argv)
  File "batch_job.py", line 206, in main
    run_driver()
  File "batch_job.py", line 167, in run_driver
    run_job(
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/utils.py", line 56, in memory_logging_wrapper
    return function(*args, **kwargs)
  File "batch_job.py", line 285, in run_job
    result = ProcessGraphDeserializer.evaluate(process_graph, env=env, do_dry_run=tracer)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 428, in evaluate
    result = convert_node(top_level_node, env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 453, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1695, in apply_process
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1695, in <dictcomp>
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 485, in convert_node
    return [convert_node(x, env=env) for x in processGraph]
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 485, in <listcomp>
    return [convert_node(x, env=env) for x in processGraph]
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 453, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1695, in apply_process
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1695, in <dictcomp>
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 471, in convert_node
    return convert_node(processGraph['node'], env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 453, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1676, in apply_process
    the_mask = convert_node(mask_node, env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 471, in convert_node
    return convert_node(processGraph['node'], env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 453, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1695, in apply_process
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1695, in <dictcomp>
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 471, in convert_node
    return convert_node(processGraph['node'], env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 453, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1695, in apply_process
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1695, in <dictcomp>
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 471, in convert_node
    return convert_node(processGraph['node'], env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 453, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1695, in apply_process
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1695, in <dictcomp>
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 471, in convert_node
    return convert_node(processGraph['node'], env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 453, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1729, in apply_process
    return process_function(args=ProcessArgs(args, process_id=process_id), env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1333, in filter_temporal
    return cube.filter_temporal(start=extent[0], end=extent[1])
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/geopysparkdatacube.py", line 77, in run
    return func(*args, **kwargs)
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/geopysparkdatacube.py", line 154, in filter_temporal
    metadata=self.metadata.filter_temporal(start, end)
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/geopysparkcubemetadata.py", line 78, in filter_temporal
    raise ValueError(start, end)
ValueError: ('2018-08-14', '2018-08-14')

eu-cdse/openeo-cdse-infra#261
#852
@bossie
Collaborator

bossie commented Sep 17, 2024

Fixed on https://openeo-dev.vito.be; the workaround is no longer necessary @VincentVerelst.
