PyIceberg Near-Term Roadmap #736

Open
18 of 39 tasks
kevinjqliu opened this issue May 14, 2024 · 7 comments

Comments

@kevinjqliu
Contributor

kevinjqliu commented May 14, 2024

Feature Request / Improvement

PyIceberg 0.7.0

The main objective of 0.7.0 is to have partitioned writes (non-exhaustive list :)

PyIceberg 0.8.0

PyIceberg 1.0.0

Long-term goals:

Fokko pinned this issue May 14, 2024
@corleyma

@kevinjqliu @Fokko Where would something like the Iceberg Spark create_changelog_view procedure fit in this roadmap? Is that something that might be tackled as part of the other procedures under table maintenance, or is it likely to come later (1.0.0), or not at all in PyIceberg?

@Fokko
Contributor

Fokko commented May 23, 2024

Sorry for the late reply, I was touching grass.

> @kevinjqliu @Fokko Where would something like the Iceberg Spark create_changelog_view procedure fit in this roadmap? Is that something that might be tackled as part of the other procedures under table maintenance, or is it likely to come later (1.0.0), or not at all in PyIceberg?

Thanks for bringing this up @corleyma 🙌 Some related work is being done in #533 and I think PyIceberg should definitely support something like that.

> @kevinjqliu @Fokko where would something like #402 go?

I've added it to the overview. Once the partial deletes + partitioned writes are in, this is supported automatically. We might want to have some community discussion on the API once those two PRs land.

@tusharchou

@Fokko can we add issues for creating tests and documentation for the new features of 0.7.0 as good first issues?

@MehulBatra
Contributor

MehulBatra commented Jun 2, 2024

> @Fokko can we add issues for creating tests and documentation for the new features of 0.7.0 as good first issues?

@tusharchou: Whenever you create a new feature, you need to add the unit and integration tests and make the necessary changes to the mkdocs documentation as part of that PR. If you feel some parts are missing, please feel free to raise an improvement/issue and we can discuss it in the Python sync-up.

@jaehyeon-kim

jaehyeon-kim commented Oct 14, 2024

It looks like the BigLake metastore is going to be replaced with the BigQuery metastore. Is the version 0.8.0 roadmap still up to date?

trinodb/trino#20031 (comment)

@anoopj

anoopj commented Oct 16, 2024

@jaehyeon-kim That is correct. BigQuery Metastore is the replacement for BigLake Metastore. I recommend adjusting the roadmap to skip BigLake Metastore and add support for BigQuery Metastore. This PR to the Iceberg Java libraries should be a good reference.

@kevinjqliu
Contributor Author

Thanks for the context @anoopj. @jaehyeon-kim it looks like #651 is a feature request. There's currently no committed date to implement it, so I'll readjust the roadmap to reflect that.


7 participants