This repository has been archived by the owner on Feb 18, 2024. It is now read-only.
Maybe an API similar to how we pass encodings into RowGroupIterator.
This would allow a different compression config for each column. It would be very useful when there is a sizeable column of random binary data, such as hashes, which barely compresses. Likewise, if we are using RLE/dictionary encoding, there might not be much point in compressing/decompressing.
This would give a significant performance boost for my use case: when I look at timings for querying Parquet, between 1/4 and 1/2 of the time is spent decompressing.
I would like to work on this if I can get guidance on how I should modify the public API.
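To make the idea concrete, here is a rough sketch of what per-column compression options could look like. Everything in it is hypothetical: `Compression`, `ColumnCompression`, `with_column` and `for_column` are illustrative names I made up, not the crate's existing types, and the sketch assumes columns are addressed by position in the same way a per-column list of encodings would be.

```rust
/// Hypothetical stand-in for a codec setting; not the library's actual type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Compression {
    Uncompressed,
    Snappy,
    Zstd,
}

/// Hypothetical per-column compression options: a file-wide default plus
/// optional overrides indexed by column position.
#[derive(Debug, Clone)]
pub struct ColumnCompression {
    default: Compression,
    per_column: Vec<Option<Compression>>,
}

impl ColumnCompression {
    /// Start with every column using `default`.
    pub fn new(default: Compression, num_columns: usize) -> Self {
        Self {
            default,
            per_column: vec![None; num_columns],
        }
    }

    /// Override the codec for one column. Out-of-range indices are ignored
    /// in this sketch; a real API would probably return an error instead.
    pub fn with_column(mut self, index: usize, compression: Compression) -> Self {
        if let Some(slot) = self.per_column.get_mut(index) {
            *slot = Some(compression);
        }
        self
    }

    /// Codec to use for the column at `index`, falling back to the default
    /// when there is no override (or the index is out of range).
    pub fn for_column(&self, index: usize) -> Compression {
        self.per_column
            .get(index)
            .copied()
            .flatten()
            .unwrap_or(self.default)
    }
}
```

With something like this, a writer could call `for_column(i)` when compressing the pages of column `i`, so a hash column could be left as `Compression::Uncompressed` while the remaining columns keep the default codec.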