GeoParquet 1.0.0 beta.1 and the path to GeoParquet 1.0.0 #122
Replies: 3 comments 2 replies
-
Should the title be |
-
One question about the actual versioning of files: if we "release" a 1.0.0-beta.1 next month, do we expect implementers to actually write this version into the Parquet files? Or do we still release, for example, a 0.5.0 that is identical in specification to 1.0.0-beta.1, so that "beta" is never written into a file?
-
Thanks, Chris, for the effort. Nothing else to add right now.
-
Earlier today a group of GeoParquet contributors met to talk about the path to 1.0.0 and made a few decisions, though we welcome further discussion here.
The main decision was to focus on 'interoperability' for version 1.0.0: to push ahead with a simple specification that gives any existing user of geospatial data in Parquet a clear, stable target to rely upon. We still plan to explore more advanced topics like spatial indexing, alternate geometry representations and 'streaming' use cases, but those will land in either a version 1.1 or a 2.0 (we'll use SemVer, so we hope to do it as a 1.1, but if there are breaking changes it'll be a 2.0).
In line with that we'll keep the encoding field, but for 1.0 it will have only one available option: WKB. This allows us to introduce an alternate encoding in the future without it being a breaking change, but we're not trying to get a new format in for version 1.0. We'll aim to closely track the GeoArrow spec, and when it matures and releases a 1.0 we'll likely make a new GeoParquet release. We also talked about keeping a pull request open in the GeoParquet repo that describes an Arrow-native / columnar geometry format, linking to the GeoArrow description for continued experimentation.
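To make the point about the encoding field concrete, here is a minimal sketch of how a reader could treat it, assuming the metadata is a JSON document with a `columns` map whose entries carry an `encoding` value. The helper name `check_encodings` and the example metadata are illustrative only, not part of the spec text; a real file would carry more fields.

```python
import json

# Encodings this sketch accepts. Per the plan above, 1.0 allows only WKB;
# a later release could add entries here without changing the field itself.
SUPPORTED_ENCODINGS = {"WKB"}


def check_encodings(geo_metadata_json: str) -> dict:
    """Return {column_name: encoding}, raising on an unsupported encoding.

    Hypothetical helper: assumes a JSON object with a "columns" map whose
    values each contain an "encoding" string.
    """
    meta = json.loads(geo_metadata_json)
    result = {}
    for name, column in meta["columns"].items():
        encoding = column["encoding"]
        if encoding not in SUPPORTED_ENCODINGS:
            raise ValueError(
                f"column {name!r}: unsupported encoding {encoding!r}"
            )
        result[name] = encoding
    return result


# Illustrative metadata only; field values here are assumptions.
example = json.dumps({
    "version": "1.0.0-beta.1",
    "primary_column": "geometry",
    "columns": {"geometry": {"encoding": "WKB"}},
})
print(check_encodings(example))
```

Because readers already branch on the field's value, a future encoding would show up as a new allowed value rather than a new or removed field, which is why keeping the field now avoids a breaking change later.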
Practically this is an update to our articulated roadmap, as we won't do a 'spatial index'-focused release, and we'd see work on that coming after 1.0.
The plan is to work to a 1.0.0-beta.1 release within a month. This release will mean 'we don't think we'll be making any changes before 1.0.0, but we want to get a wider set of feedback and will consider any changes that come during that time - including breaking changes'. But we should be able to get a wider group of people implementing, since the changes between 1.0.0-beta.1 and 1.0.0 will hopefully be minimal.
To get to 1.0.0 we'll aim to satisfy a set of criteria (listed below) that represent broad community acceptance and will mean that we've gotten sufficient feedback to feel comfortable that a 1.0.0 release will be stable. Once the criteria are hit we'll issue a 1.0.0-rc.1 - a true 'release candidate' that we'll leave out for review for a few weeks. If no one has any changes then that version will become 1.0.0. If there are any fixes (even minor ones) to the spec itself we'll issue another release candidate and repeat the process.
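For anyone comparing version strings in tooling, the release sequence described here follows SemVer precedence, where a pre-release sorts before the release with the same core version. Below is a rough sketch of that ordering; `semver_key` is a hypothetical helper written for illustration, and it assumes well-formed `MAJOR.MINOR.PATCH[-prerelease]` strings.

```python
def semver_key(version: str):
    """Sort key following SemVer 2.0.0 precedence rules.

    Hypothetical helper; assumes a well-formed version string.
    """
    # Build metadata (anything after "+") does not affect precedence.
    version = version.split("+", 1)[0]
    core, _, pre = version.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    if not pre:
        # A normal release sorts after every pre-release of the same core.
        return (major, minor, patch, 1, ())
    # Numeric identifiers compare as integers and rank below
    # alphanumeric ones, which compare as ASCII strings.
    identifiers = tuple(
        (0, int(part), "") if part.isdigit() else (1, 0, part)
        for part in pre.split(".")
    )
    return (major, minor, patch, 0, identifiers)


versions = ["1.0.0", "2.0.0", "1.0.0-rc.1", "1.1.0", "1.0.0-beta.1"]
print(sorted(versions, key=semver_key))
# ['1.0.0-beta.1', '1.0.0-rc.1', '1.0.0', '1.1.0', '2.0.0']
```

A plain string sort would not give this order, since "1.0.0" sorts before "1.0.0-beta.1" lexically, so tools that gate on version should compare parsed values.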
The group discussed the 'criteria' for a bit, and came up with the following:
That should be the core set of requirements for 1.0.0. And then there are a few other activities that people in the group will likely work on, but that aren't hard requirements to get to 1.0.0:
The very next steps are to organize the issue tracker to represent the milestones, and then to work through the issues for 1.0.0-beta.1. I hope to take on the organization in the next day or two, and everyone is welcome and encouraged to help triage and make PRs for the remaining issues. We'll likely have another synchronous meeting on 10/7 at 9am Pacific time to finish off any needed discussion on the remaining issues, and we hope to cut the release soon after.
Comments and feedback are welcome, especially on the overall plan. If you want to go deep on a particular issue mentioned here, it would likely be better to open a separate discussion or issue and discuss it there.