load_stac missing data when stitching two tiles #778

@VictorVerhaert

When using load_stac on the following collection: https://stac.openeo.vito.be/collections/tree_cover_density_2018
(job_id: j-2405173083f249c2bcc9c07be6e65416)
I get the following missing data:

image

From the load_stac API call
STAC API GET https://stac.openeo.vito.be/search?limit=20&bbox=11.1427023295687%2C47.22033843316067%2C11.821519349155245%2C47.628952581107114&datetime=1970-01-01T00%3A00%3A00Z%2F2069-12-31T23%3A59%3A59.999000Z&collections=tree_cover_density_2018&fields=
I fetched the two matching TIFF files (shown in different colors for clarity):
image

Here you can see that the data from the red tile does exist but is not loaded correctly. (The top line of the missing rectangle corresponds exactly to the dividing line between the two tiles.)
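To make the search request above easier to read, the following sketch decodes its percent-encoded query string using only the Python standard library. The helper name `decode_search` is an assumption for illustration; the URL is copied from the log line above. It shows that the `bbox` sent to the STAC API is the same extent that is passed as `spatial_extent` to load_stac.

```python
from urllib.parse import parse_qs, urlsplit

# Search URL copied verbatim from the load_stac log line in this issue.
SEARCH_URL = (
    "https://stac.openeo.vito.be/search?limit=20"
    "&bbox=11.1427023295687%2C47.22033843316067%2C11.821519349155245%2C47.628952581107114"
    "&datetime=1970-01-01T00%3A00%3A00Z%2F2069-12-31T23%3A59%3A59.999000Z"
    "&collections=tree_cover_density_2018&fields="
)


def decode_search(url):
    """Return the decoded query parameters of a STAC API search URL.

    Hypothetical helper, not part of openEO or any STAC library.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # Every parameter occurs once in this URL, so keep the first value only.
    params = {name: values[0] for name, values in query.items()}
    # STAC bbox order is west, south, east, north.
    west, south, east, north = (float(v) for v in params["bbox"].split(","))
    params["bbox"] = {"west": west, "south": south, "east": east, "north": north}
    return params


params = decode_search(SEARCH_URL)
print(params["bbox"])
print(params["datetime"])
```

The wide `datetime` interval means the search is effectively unfiltered in time, so both tiles are returned by the STAC API and the gap has to originate later in the loading step.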

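openEO code used on CDSE:

import openeo

# Assumption: the original snippet did not show how `connection` was created;
# this uses the public CDSE openEO endpoint.
connection = openeo.connect("openeo.dataspace.copernicus.eu").authenticate_oidc()

spatial_extent = {'west': 11.1427023295687, 'south': 47.22033843316067, 'east': 11.821519349155245, 'north': 47.628952581107114}
# Load the STAC collection, take the maximum over time, and run it as a batch job.
landsat = connection.load_stac(
    "https://stac.openeo.vito.be/collections/tree_cover_density_2018",
    spatial_extent=spatial_extent,
).max_time().execute_batch("TCD.tif", title="TCD")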
bossie added a commit to Open-EO/openeo-geotrellis-extensions that referenced this issue May 23, 2024
@bossie bossie linked a pull request May 23, 2024 that will close this issue
@bossie
Collaborator

bossie commented May 24, 2024

Pushed a quick fix that circumvents the problem in the case of load_stac; it now looks like this on staging:

quickfix_scaled

bossie added a commit to Open-EO/openeo-geotrellis-extensions that referenced this issue May 27, 2024
@bossie
Collaborator

bossie commented May 27, 2024

The problem occurs in regions where two features meet and SpaceTimeKeys typically overlap both of these features.

In this case, the SpaceTimeKey (purple) is fullyContained within the bbox of the top feature (red), so only this GeoTiff asset is taken into account and the bottom one is discarded. Unfortunately, the bbox does not match the actual footprint of the asset, and the asset does not have data to fully cover the SpaceTimeKey: hence the gap.

initial_spacetimekey_fully_overlaps_bbox

@bossie
Collaborator

bossie commented May 27, 2024

@jdries is this optimization something that we want to have for load_stac as well (I'm assuming yes)? The quick fix I did essentially bypasses it for load_stac.

Otherwise, the real fix is twofold:

  1. load_stac should consider a STAC Item's geometry rather than its bbox; this should not be hard to implement.
  2. the geometries in the STAC Items in this collection do not match their asset's actual footprint so they will have to be fixed (reingested?).

With the footprints fixed, both assets will be considered, which removes the gap:

actual_footprints_spacetimekey_overlaps_both_assets

@jdries
Contributor

jdries commented May 27, 2024

I'm not sure if we need the optimization: most products are generated without any overlap. The huge amount of overlap applied to Sentinel-2 is rather the exception. In addition, we do the optimization for Sentinel-2 because it is such a commonly used collection. For load_stac, it is probably better to be on the safe side and load a bit more data.

I do believe that we should consider fixing the footprints, and using the geometry rather than the bbox is a good idea in general.

bossie added a commit to Open-EO/openeo-geotrellis-extensions that referenced this issue May 27, 2024
@bossie
Collaborator

bossie commented May 27, 2024

@VictorVerhaert is it Stijn C. that is responsible for https://stac.openeo.vito.be/collections/tree_cover_density_2018 or who should I bother?

@VictorVerhaert
Author

@bossie I made that collection myself.
Can you clarify what exactly is wrong? Is it just the bbox that doesn't match?

@bossie
Collaborator

bossie commented May 27, 2024

@VictorVerhaert At least bbox and geometry, haven't checked proj:bbox and proj:geometry.

This item for example: https://stac.openeo.vito.be/collections/tree_cover_density_2018/items/TCD_2018_010m_E44N27_03035_v020

reports a geometry of:

{
  "type": "Polygon",
  "coordinates": [
    [
      [
        11.064548187608006,
        47.38783029804821
      ],
      [
        11.064548187608006,
        48.3083796083107
      ],
      [
        12.36948893966052,
        48.3083796083107
      ],
      [
        12.36948893966052,
        47.38783029804821
      ],
      [
        11.064548187608006,
        47.38783029804821
      ]
    ]
  ]
}

whereas I would expect it to be something like:

{
  "type": "Polygon",
  "coordinates": [
    [
      [
        11.046005504476401,
        47.40858428037738
      ],
      [
        11.707867449704809,
        47.40021736186508
      ],
      [
        12.36948893966052,
        47.38783030409527
      ],
      [
        12.390240820693707,
        47.837566260620925
      ],
      [
        12.411462626880093,
        48.28720072607632
      ],
      [
        11.738134164531402,
        48.29984134090657
      ],
      [
        11.064548187608006,
        48.30837961418922
      ],
      [
        11.055172953154765,
        47.85853023272656
      ],
      [
        11.046005504476401,
        47.40858428037738
      ]
    ]
  ]
}

such that it matches the actual footprint of the GeoTiff asset.
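To illustrate the difference between the two geometries, the sketch below runs a plain ray-casting point-in-polygon test against both rings. The coordinates are copied from the two polygons above; the function `point_in_polygon` and the probe points are assumptions chosen for illustration, and the test treats longitude and latitude as planar coordinates.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test for a closed ring of (lon, lat) pairs (first == last)."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through `lat` can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Geometry currently reported by the STAC Item (an axis-aligned rectangle).
reported = [
    (11.064548187608006, 47.38783029804821),
    (11.064548187608006, 48.3083796083107),
    (12.36948893966052, 48.3083796083107),
    (12.36948893966052, 47.38783029804821),
    (11.064548187608006, 47.38783029804821),
]

# Geometry expected from the actual footprint of the GeoTiff asset.
expected = [
    (11.046005504476401, 47.40858428037738),
    (11.707867449704809, 47.40021736186508),
    (12.36948893966052, 47.38783030409527),
    (12.390240820693707, 47.837566260620925),
    (12.411462626880093, 48.28720072607632),
    (11.738134164531402, 48.29984134090657),
    (11.064548187608006, 48.30837961418922),
    (11.055172953154765, 47.85853023272656),
    (11.046005504476401, 47.40858428037738),
]

# Hypothetical probe points: one near the south-west corner of the reported
# rectangle, one well inside the tile.
corner_probe = (11.07, 47.39)
center_probe = (11.7, 47.9)

print(point_in_polygon(*corner_probe, reported), point_in_polygon(*corner_probe, expected))
print(point_in_polygon(*center_probe, reported), point_in_polygon(*center_probe, expected))
```

The corner probe lies inside the reported rectangle but outside the expected footprint, which is the kind of location where a key can be considered covered by this item although the asset holds no data there.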

bossie added a commit that referenced this issue May 28, 2024
@bossie bossie linked a pull request May 28, 2024 that will close this issue
@bossie
Collaborator

bossie commented May 28, 2024

Disabled the optimization in case of load_stac (quick fix became real fix).

load_stac will take a STAC Item's geometry property into account as well (needs a recent openeo-opensearch-client).

@bossie bossie reopened this May 28, 2024
bossie added a commit to Open-EO/openeo-geotrellis-extensions that referenced this issue May 28, 2024
Open-EO/openeo-geopyspark-driver#778

The quick fix (disable the optimization in case of load_stac) became the real fix so these are essentially cosmetic changes.
bossie added a commit that referenced this issue May 28, 2024