Add partial serde support for ParquetWriterOptions #8627

Merged: 11 commits, Dec 23, 2023

Member

@andygrove andygrove commented Dec 22, 2023

Which issue does this PR close?

Closes #8598
Closes #8619

There is a follow on issue #8632 to support advanced writer options, but this should be enough to unblock Ballista.

If this PR is accepted then I will create a similar PR to do the same for CSV.

Rationale for this change

Required by Ballista so that we can save query results to Parquet.

What changes are included in this PR?

  • Implement protobuf serde (serialization and deserialization) for a subset of ParquetWriterOptions
  • Add tests

Are these changes tested?

Are there any user-facing changes?

@andygrove andygrove changed the title WIP: Add serde support for ParquetWriterOptions Add partial serde support for ParquetWriterOptions Dec 22, 2023
@andygrove andygrove marked this pull request as ready for review December 22, 2023 18:41
Contributor

@alamb alamb left a comment

Thank you @andygrove

}

message WriterProperties {
  int32 data_page_size_limit = 1;
Contributor

Given that the Rust fields are usize, is there any reason to use int32 in the protobuf encoding?

Maybe it would make more sense to use i64 or u64 here instead 🤔

https://protobuf.dev/programming-guides/proto2/#scalar

Member Author

Good point. I have updated to u32.

Member Author

@andygrove andygrove Dec 23, 2023

Actually, I misread / misunderstood your comment. I changed to u32 on the basis that these should not need to support negative numbers, but I may need to research this more to make sure that is the case.

Member Author

Never mind, in Rust they are usize, so they can't be negative.

Member Author

Now changed to u64 to better match usize.
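
For readers following the type discussion above, here is a minimal sketch of why a 64-bit unsigned field pairs naturally with `usize`. It is not the code from this PR: `WriterPropertiesProto` and `WriterSettings` are hypothetical stand-ins for the generated protobuf struct and the Rust-side options, and only the `data_page_size_limit` field name is taken from the diff. The assumption is that the protobuf field is declared as `uint64`, which Rust protobuf code generators typically map to `u64`.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the struct generated from the protobuf message.
#[derive(Debug, Clone, PartialEq)]
struct WriterPropertiesProto {
    data_page_size_limit: u64,
}

/// Hypothetical stand-in for the Rust-side writer settings, which use usize.
#[derive(Debug, Clone, PartialEq)]
struct WriterSettings {
    data_page_size_limit: usize,
}

/// usize -> u64 is lossless on targets where usize is at most 64 bits wide,
/// which covers all targets Rust currently supports.
fn to_proto(settings: &WriterSettings) -> WriterPropertiesProto {
    WriterPropertiesProto {
        data_page_size_limit: settings.data_page_size_limit as u64,
    }
}

/// u64 -> usize can overflow on 32-bit targets, so the conversion is checked
/// and reported as an error instead of being silently truncated.
fn from_proto(proto: &WriterPropertiesProto) -> Result<WriterSettings, String> {
    let limit = usize::try_from(proto.data_page_size_limit)
        .map_err(|e| format!("data_page_size_limit does not fit in usize: {}", e))?;
    Ok(WriterSettings {
        data_page_size_limit: limit,
    })
}
```

Calling `to_proto` and then `from_proto` should give back the original settings. With a signed 32-bit field the encode step would need a range check as well, since a `usize` can exceed `i32::MAX`.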

@andygrove andygrove merged commit bf43bb2 into apache:main Dec 23, 2023
23 checks passed
@andygrove andygrove deleted the parquet-writer-props-serde branch December 23, 2023 16:25
Development

Successfully merging this pull request may close these issues.

Add serde support for CopyTo with WriterOptions
Add serde support for Parquet FileTypeWriterOptions
2 participants