First version of a rdm api interaction module #125 #161

Merged: kvantricht merged 15 commits into main from 125-rdm-api-interaction on Oct 22, 2024

Conversation

@VincentVerelst VincentVerelst commented Sep 19, 2024

This is a first version of how we can construct a GeoParquet file from the RDM API, based on a user-defined AOI.

How it works:

import shapely
from worldcereal.rdm_api import query_rdm

poly = shapely.Polygon([(1.824467, 50.34406), (1.824467, 50.950124), (3.735661, 50.950124), (3.735661, 50.34406), (1.824467, 50.34406)])

gdf = query_rdm(poly)
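
Since the AOI in this example is an axis-aligned rectangle, the same polygon can also be built from its bounds with `shapely.geometry.box`. This is a small sketch that assumes Shapely is installed; the variable names are illustrative and not part of the PR:

```python
from shapely.geometry import Polygon, box

# Bounds of the example AOI: (min_lon, min_lat, max_lon, max_lat).
aoi = box(1.824467, 50.34406, 3.735661, 50.950124)

# The explicit ring used in the example above, for comparison.
explicit = Polygon(
    [
        (1.824467, 50.34406),
        (1.824467, 50.950124),
        (3.735661, 50.950124),
        (3.735661, 50.34406),
        (1.824467, 50.34406),
    ]
)

# Topological equality: vertex order may differ, the covered area does not.
same_area = aoi.equals(explicit)
```

Either geometry could then be passed to `query_rdm` in the same way.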

A visualization of the output:

image

All feedback is welcome!

@VincentVerelst VincentVerelst marked this pull request as ready for review September 19, 2024 12:53
@VincentVerelst VincentVerelst marked this pull request as draft September 19, 2024 14:17
@VincentVerelst VincentVerelst marked this pull request as ready for review September 20, 2024 09:31

@kvantricht kvantricht left a comment

Looks really great! Still to be decided is how we handle authenticated calls to RDM, which is one more addition we probably need at this point.

@kvantricht kvantricht self-requested a review September 25, 2024 13:20
@kvantricht kvantricht self-requested a review September 25, 2024 13:43
@VincentVerelst

@kvantricht, there have been quite a few changes since your last review:

With the new authentication changes, you can now interact with the RDM API as follows:

from worldcereal.rdm_api import RdmInteraction

interaction = RdmInteraction().authenticate()

gdf = interaction.query_rdm(geometry=multi_polygon, temporal_extent=temporal_extent)

Authentication now works similarly to the OpenEO Python Client. You can also leave out authenticate(), in which case the RDM query will only interact with public datasets.
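
To illustrate the difference between authenticated and public-only queries, here is a minimal sketch of how request headers might be assembled. The function name `build_headers` and the bearer-token scheme are assumptions made for illustration; they are not necessarily what `RdmInteraction` does internally:

```python
from typing import Dict, Optional


def build_headers(access_token: Optional[str] = None) -> Dict[str, str]:
    """Return request headers, adding credentials only when a token is given.

    Hypothetical helper: without a token the request is anonymous, so a
    server would be expected to expose public datasets only.
    """
    headers = {"accept": "application/json"}
    if access_token:
        # Assumed scheme: a standard OAuth2-style bearer token.
        headers["Authorization"] = f"Bearer {access_token}"
    return headers


public_headers = build_headers()
private_headers = build_headers("example-token")
```

Keeping the anonymous case as the default means that skipping `authenticate()` needs no special handling at the call site.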

For a STAC collection of patch extractions, you can extract the geometries of all patches into one MultiPolygon as follows (this code will probably become part of the patch-to-points extraction script):

import pystac
from shapely.geometry import shape, MultiPolygon

collection = pystac.read_file('/path/to/stac/collection.json')

polygons = []

for item in collection.get_items():  
    polygons.append(shape(item.geometry).buffer(1e-9))  # Add buffer to avoid TopologyException

multi_polygon = MultiPolygon(polygons)
temporal_extent = [collection.extent.temporal.intervals[0][0], collection.extent.temporal.intervals[0][1]]

The results for a query on this MultiPolygon and temporal_extent look like this (cached 64x64 patches in pink, RDM query results in purple):
image
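
The snippet above does not say what triggers the TopologyException. One situation that can yield invalid input is overlapping patches, because a MultiPolygon whose parts overlap is not a valid geometry. A possible alternative, which is not what this PR does, is to dissolve the footprints with `shapely.ops.unary_union`. The two toy boxes below stand in for patch footprints and are purely illustrative:

```python
from shapely.geometry import MultiPolygon, box
from shapely.ops import unary_union

# Two overlapping 2x2 squares standing in for patch footprints.
patch_a = box(0, 0, 2, 2)
patch_b = box(1, 1, 3, 3)

# Collecting overlapping parts as-is gives an invalid MultiPolygon.
stacked = MultiPolygon([patch_a, patch_b])

# Dissolving them first gives a single valid geometry covering the same area.
merged = unary_union([patch_a, patch_b])
```

Whether dissolving is appropriate depends on whether the per-patch boundaries need to be preserved for the query.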

@kvantricht kvantricht left a comment

I just added two comments, other than that this looks good to merge so we can move on to the next step. Nice job!

pyproject.toml (outdated, resolved)

for id in collection_ids:
    url = f"{self.RDM_ENDPOINT}/collections/{id}/download"
    response = requests.get(url, headers=self._get_headers())

Do we need some resilience here? What's the timeout?

@VincentVerelst

@kvantricht, added some resilience and a timeout to the RDM interaction. That should cover your final comments!
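
For readers wondering what "resilience and a timeout" can look like, here is a generic retry-with-backoff sketch. It is not the code merged in this PR: `call_with_retries`, its parameters, and the 30-second timeout in the usage comment are all hypothetical, chosen only to show the pattern:

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    backoff_seconds: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call `fetch`, retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            time.sleep(backoff_seconds * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


# Possible usage with requests (assumed values, shown as a comment only):
# response = call_with_retries(
#     lambda: requests.get(url, headers=headers, timeout=30),
#     retry_on=(requests.exceptions.RequestException,),
# )
```

Passing the request as a callable keeps the retry logic independent of the HTTP library, and the explicit `timeout` ensures a stalled download fails and can be retried instead of hanging indefinitely.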

@kvantricht

Perfect, thanks!

@kvantricht kvantricht merged commit fb4ebb8 into main Oct 22, 2024
4 checks passed
@kvantricht kvantricht deleted the 125-rdm-api-interaction branch October 22, 2024 16:02