How should metadata be written in a partitioned dataset? #79
Comments
Adding here for discussion from #101 (comment); we should clarify what the bounding box represents. Is it possible for each file's metadata to contain only its bounding box, but for the |
Where do we stand on this in relation to the latest discussion pushing for stac 1.0.0-beta.1 sooner rather than later? Should we attempt to put something in there? Has anyone experimented with this and come up with a good recommendation? |
Do we want to do something here for 1.1? We've now got 'spatial optimizations' ready to go for 1.1, so it seems like a good time to get this to completion too. It sounds like libraries do make use of the
Do we want to explicitly say that the _metadata sidecar sets the extent of the whole dataset? Or just leave things as they are? |
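To make the question concrete, here is a minimal sketch of what "the sidecar sets the extent of the whole dataset" could mean in practice: the dataset-level bbox is the union of the per-file bboxes. The helper names (`union_bbox`, `dataset_geo_metadata`) and the sample `geo` dictionaries are hypothetical, and the sketch assumes `[xmin, ymin, xmax, ymax]` boxes that do not cross the antimeridian.

```python
import copy


def union_bbox(bboxes):
    """Return the smallest [xmin, ymin, xmax, ymax] box covering all inputs."""
    xmins, ymins, xmaxs, ymaxs = zip(*bboxes)
    return [min(xmins), min(ymins), max(xmaxs), max(ymaxs)]


def dataset_geo_metadata(per_file_geo, column="geometry"):
    """Copy the first file's geo metadata and widen its bbox to cover all files."""
    merged = copy.deepcopy(per_file_geo[0])
    merged["columns"][column]["bbox"] = union_bbox(
        [geo["columns"][column]["bbox"] for geo in per_file_geo]
    )
    return merged


def sample_geo(bbox):
    # Illustrative geo metadata only; not a complete GeoParquet document.
    return {
        "version": "1.0.0",
        "primary_column": "geometry",
        "columns": {"geometry": {"encoding": "WKB", "geometry_types": [], "bbox": bbox}},
    }


file_a = sample_geo([0.0, 0.0, 10.0, 5.0])
file_b = sample_geo([5.0, -2.0, 20.0, 3.0])
whole = dataset_geo_metadata([file_a, file_b])
print(whole["columns"]["geometry"]["bbox"])  # [0.0, -2.0, 20.0, 5.0]
```

Each file keeps its own bbox untouched; only the merged copy intended for the sidecar carries the dataset-wide extent.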
I don't think there's an API (at least in pyarrow) to manually set the geoparquet metadata on the |
Discussed on call 6/3/24 - this should be in best practices, and is not dependent on the specification, so moved off the 1.1 release. We hope to do best practices soon, perhaps make a 'milestone' and push on the things we want there soon after 1.1, but not block release on it. |
So far the spec has only covered single-file Parquet data. However, Parquet also supports saving as a "dataset", where there are several Parquet files in a folder structure. In this case, how should geospatial metadata be stored? There's a Parquet best practice that writes `_common_metadata` and `_metadata` sidecar files to the root of the folder structure, but that's not part of the actual Parquet specification.

If I understand correctly, the `geo` metadata would automatically be included in the `_common_metadata` file, and additionally statistics are stored in the `_metadata` file, which is relevant for #13.

Should this be part of the geoparquet spec? Should it be a "best practice" that we document?