Validation complains about missing products on CDSE #566
Comments
To reproduce (requires openeo client >= 0.24.0):
import openeo
con = openeo.connect("openeo.dataspace.copernicus.eu")
con.authenticate_oidc()
cube = con.load_collection(
    "SENTINEL2_L2A",
    temporal_extent=["2022-06-01", "2022-06-10"],
    spatial_extent={"west": 3, "south": 51, "east": 3.01, "north": 51.01},
    bands=["B02"]
)
cube.download("tmp.nc")
This will show the warning.
To get the validation report directly, without having to wait for the download: print(con.validate_process_graph(cube)) |
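For scripted checks, the validation report can be filtered on error code. A minimal sketch, assuming the report is a list of dicts with "code" and "message" keys, as in the openEO API validation response. `errors_with_code` is a hypothetical helper (not part of the openeo client), and the second sample entry is made up for illustration.

```python
from typing import Dict, List


def errors_with_code(errors: List[Dict[str, str]], code: str) -> List[str]:
    """Return the messages of all validation errors that carry the given code."""
    return [e.get("message", "") for e in errors if e.get("code") == code]


# Sample report shaped like the warnings quoted in this issue.
# The "SomethingElse" entry is hypothetical, only there to show the filtering.
sample_report = [
    {
        "code": "MissingProduct",
        "message": "Tile 'S2B_MSIL2A_20220604T104619_N0400_R051_T31UES_20220604T124954' "
                   "in collection 'SENTINEL2_L2A' is not available.",
    },
    {"code": "SomethingElse", "message": "Unrelated validation error."},
]

missing = errors_with_code(sample_report, "MissingProduct")
for message in missing:
    print(message)
```

With a live connection, the list returned by con.validate_process_graph(cube) could be passed in place of `sample_report`, provided it has the shape assumed here.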
FYI as discussed: I pushed a quick workaround to avoid spamming users with buggy validation reports: extensive collection based validation is disabled on production instances with c63728c |
The creo catalog can itself report that it has 'ARCHIVED' products. As I understand it, the current check logs them when they are encountered. However, something goes wrong, and 'ONLINE' products are logged instead. |
I disabled the missing product check on CDSE for now. We'll need to investigate the above-mentioned issue where 'ONLINE' products are logged as missing. It is also an open question whether this procedure really finds all products for which no L2A product exists. It seems that we only find the ones that have been archived, but not the ones that were never in the catalog in the first place. |
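To illustrate the last point: finding acquisitions that never got an L2A product would need a comparison against something like the L1C listing, not just a scan for archived L2A entries. A rough sketch, assuming both listings are available as plain product IDs following the Sentinel-2 compact naming convention. The helper names and the L1C IDs below are hypothetical, and this is not what the driver currently does.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Sentinel-2 compact naming convention:
# MMM_MSIXXX_<sensing start>_Nxxyy_ROOO_Txxxxx_<product discriminator>
_PRODUCT_RE = re.compile(
    r"^(?P<mission>S2[A-Z])_MSI(?P<level>L1C|L2A)_(?P<sensing>\d{8}T\d{6})"
    r"_N(?P<baseline>\d{4})_R(?P<orbit>\d{3})_T(?P<tile>[0-9A-Z]{5})_\d{8}T\d{6}$"
)


class Acquisition(NamedTuple):
    mission: str
    sensing: str
    orbit: str
    tile: str


def acquisition_key(product_id: str) -> Optional[Acquisition]:
    """Key that is shared by the L1C and L2A products of one acquisition."""
    m = _PRODUCT_RE.match(product_id)
    if m is None:
        return None
    return Acquisition(m["mission"], m["sensing"], m["orbit"], m["tile"])


def l1c_without_l2a(l1c_ids: Iterable[str], l2a_ids: Iterable[str]) -> List[str]:
    """Return the L1C product IDs that have no L2A counterpart at all."""
    l2a_keys = {acquisition_key(p) for p in l2a_ids} - {None}
    result = []
    for product_id in l1c_ids:
        key = acquisition_key(product_id)
        if key is not None and key not in l2a_keys:
            result.append(product_id)
    return result


# Hypothetical L1C listing; only the L2A ID is taken from this issue.
l1c = [
    "S2B_MSIL1C_20220604T104619_N0400_R051_T31UES_20220604T113321",
    "S2B_MSIL1C_20220604T104619_N0400_R051_T31UGS_20220604T113321",
]
l2a = ["S2B_MSIL2A_20220604T104619_N0400_R051_T31UES_20220604T124954"]

print(l1c_without_l2a(l1c, l2a))
```

The key deliberately ignores the processing baseline and the product discriminator, since those differ between the L1C and L2A products of the same acquisition.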
FYI: openeo-geopyspark-driver/openeogeotrellis/layercatalog.py Lines 1042 to 1044 in a9f9ab7
openeo-geopyspark-driver/openeogeotrellis/catalogs/creo.py Lines 202 to 206 in a9f9ab7
Our current implementation parses "AVAILABLE" vs "ORDERABLE" as follows: openeo-geopyspark-driver/openeogeotrellis/catalogs/creo.py Lines 85 to 99 in a9f9ab7
Note that this terminology is different from the "ARCHIVED" and "ONLINE" you are talking about. |
So the offline products were already removed when deduplicating products with the Scala code. @soxofaan Does that sound ok? Otherwise, I can also just remove the check there. |
I don't completely understand what you say here, and I don't know these catalogue details well enough to be honest. I had a look at PR #584 as well and I'm confused that |
Tiles can have multiple versions, typically with a different processingBaseline. It looks like a whole bunch of old processingBaseline tiles have been archived and appear With this fix, I assume that if a tile (a location and date) has an |
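One possible reading of this, as a sketch only: group product versions by tile and sensing date, and report a tile/date as missing only when none of its versions is 'ONLINE'. This assumes the catalog yields (product ID, status) pairs with status 'ONLINE' or 'ARCHIVED'. The helper names, the N0300 product IDs and all statuses below are hypothetical, and the actual deduplication lives in the Scala code mentioned above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def tile_and_date(product_id: str) -> Tuple[str, str]:
    """Extract (tile, sensing date) from a Sentinel-2 compact product ID."""
    parts = product_id.split("_")
    # parts: mission, MSI level, sensing start, baseline, orbit, tile, discriminator
    return parts[5], parts[2][:8]


def missing_tiles(products: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the (tile, date) pairs for which no version is 'ONLINE'."""
    statuses: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for product_id, status in products:
        statuses[tile_and_date(product_id)].add(status)
    return sorted(key for key, seen in statuses.items() if "ONLINE" not in seen)


# Hypothetical catalog listing: statuses and N0300 versions are made up.
catalog = [
    ("S2B_MSIL2A_20220604T104619_N0400_R051_T31UES_20220604T124954", "ONLINE"),
    ("S2B_MSIL2A_20220604T104619_N0300_R051_T31UES_20220604T120000", "ARCHIVED"),
    ("S2B_MSIL2A_20220604T104619_N0300_R051_T31UFS_20220604T120000", "ARCHIVED"),
]

print(missing_tiles(catalog))
```

In this made-up listing, T31UES is not reported because its N0400 version is online, while T31UFS is reported because only an archived version exists.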
Ok, thanks for that explanation. I guess it's worth putting a comment about this in the implementation. What I also find confusing (again, I'm not that familiar with the creodias catalogue API) is that we pre-define these status values "available", "orderable" and "not found" here: openeo-geopyspark-driver/openeogeotrellis/catalogs/base.py Lines 6 to 9 in a9f9ab7
while you talk about "offline" and "online". Is there some documented listing we (can) follow? There is this reference, but that link is dead: openeo-geopyspark-driver/openeogeotrellis/catalogs/creo.py Lines 88 to 93 in a9f9ab7
|
When using SENTINEL2_L2A on CDSE, I now get warnings about missing products:
[MissingProduct] Tile 'S2B_MSIL2A_20220604T104619_N0400_R051_T31UES_20220604T124954' in collection 'SENTINEL2_L2A' is not available.
[MissingProduct] Tile 'S2B_MSIL2A_20220604T104619_N0400_R051_T31UFS_20220604T124954' in collection 'SENTINEL2_L2A' is not available.
This is a bit special because CDSE is the reference archive. I'm not even sure how we could implement a proper missing products check on CDSE. Can we add a config option to disable this check there?