[FEAT] Support Hudi reader #2011
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2011      +/-   ##
==========================================
+ Coverage   84.72%   84.96%   +0.24%
==========================================
  Files          62       68       +6
  Lines        6840     7271     +431
==========================================
+ Hits         5795     6178     +383
- Misses       1045     1093      +48
Seems pretty good to me overall!
We're happy to keep the code in Daft for now in the daft/hudi/pyhudi folder, but we should probably chat about how you envision the split and API between our libraries.
A couple of important things we should verify:
- Does this work with S3? We should try reading a Hudi table from AWS S3 and make sure that works. We also have some examples of integration tests using minio that you can look at if you want a locally runnable example (see: tests/integration/iceberg).
- Does this work with the full range of Hudi's type system?
- We should also detect any features we don't support (e.g. merge-on-read) and throw an error early.
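To make the last point concrete, here is a rough sketch of what an early check could look like. It is not the code in this PR. It assumes the reader already has the text of the table's `.hoodie/hoodie.properties` file and that the table type is stored under the `hoodie.table.type` key. The names `UnsupportedHudiFeatureError`, `parse_hoodie_properties`, and `check_table_supported` are made up for illustration, as is the choice to treat a missing key as copy-on-write.

```python
from __future__ import annotations

# Hypothetical allow-list: only copy-on-write tables are assumed readable here.
SUPPORTED_TABLE_TYPES = {"COPY_ON_WRITE"}


class UnsupportedHudiFeatureError(NotImplementedError):
    """Hypothetical error raised before any data files are scanned."""


def parse_hoodie_properties(text: str) -> dict[str, str]:
    """Parse the key=value lines of a hoodie.properties file, skipping comments."""
    props: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' separator
            props[key.strip()] = value.strip()
    return props


def check_table_supported(props: dict[str, str]) -> None:
    """Raise early if the table uses a table type the reader cannot handle."""
    # Assumption: a missing key is treated as copy-on-write.
    table_type = props.get("hoodie.table.type", "COPY_ON_WRITE")
    if table_type not in SUPPORTED_TABLE_TYPES:
        raise UnsupportedHudiFeatureError(
            f"Hudi table type {table_type!r} is not supported yet; "
            f"supported types: {sorted(SUPPORTED_TABLE_TYPES)}"
        )
```

Calling `check_table_supported(parse_hoodie_properties(text))` right after loading the table metadata would surface a merge-on-read table as a clear error instead of a confusing failure (or silently incomplete results) later in the scan.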
LGTM!!! 🚀 🚀 🚀 🚀
for #2070