Adds support for ZSTD encoding in schema-ddl #262

Closed
miike wants to merge 2 commits


@miike miike commented Aug 24, 2017

This PR adds support for ZSTD encoding (#237)

  • The first commit changes the default encoding for VARCHAR (and TIMESTAMP) columns to prefer ZSTD over LZO. It doesn't yet override suggestions for other data types, though this would likely be useful. I'm not sure whether this is something we'd like to test before going down this route (e.g., how well ZSTD performs on INTs that are uniformly distributed, normally distributed, etc.).

  • The second commit changes the encoding on the self and parent columns from RUNLENGTH to ZSTD. For columns that mostly contain a single value (such as schema_vendor) this change makes little difference to space on disk, but for other columns, such as schema_version, it can make a significant difference. In a small sample of ~10 million rows, if two schemas are in use simultaneously and we assume that the schema used is independent and identically distributed (over time), this column takes approximately 2/3 of the size on disk with ZSTD compared to RUNLENGTH.
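
A rough way to see why RUNLENGTH struggles on a column like schema_version is to count runs of repeated values under the i.i.d. assumption above. The following is only a minimal sketch: the version strings, vendor name, row count and helper names are made up for illustration, and it says nothing about actual Redshift block sizes or the 2/3 figure, which came from a real sample.

```python
import random
from itertools import groupby


def count_runs(values):
    """Count maximal runs of equal consecutive values (what run-length encoding stores)."""
    return sum(1 for _ in groupby(values))


def simulate_schema_versions(n_rows, versions=("1-0-0", "1-0-1"), seed=0):
    """Hypothetical column: each row picks one of two versions independently."""
    rng = random.Random(seed)
    return [rng.choice(versions) for _ in range(n_rows)]


n_rows = 100_000

# Two schema versions in use at once, i.i.d. over time (assumption from the PR text).
mixed = simulate_schema_versions(n_rows)
# A column that only ever contains one value, e.g. a single vendor (made-up value).
constant = ["com.acme"] * n_rows

mixed_runs = count_runs(mixed)
constant_runs = count_runs(constant)
avg_run_length = n_rows / mixed_runs

print(f"constant column: {constant_runs} run(s)")
print(f"mixed column: {mixed_runs} runs, average run length {avg_run_length:.2f}")
```

With two equally likely values the average run is only about two rows long, so run-length encoding has to store roughly one (value, count) pair per two rows, whereas the single-valued column collapses to one run. That is consistent with RUNLENGTH being fine for schema_vendor but a poor fit for schema_version.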

@BenFradet BenFradet requested a review from chuwy August 25, 2017 09:11
@alexanderdean alexanderdean requested review from chuwy and removed request for chuwy August 25, 2017 11:07
@alexanderdean
Member

Thanks @miike! To @chuwy for review...

@snowplowcla

@miike has signed the Software Grant and Corporate Contributor License Agreement

Contributor

@chuwy chuwy left a comment

Looks great!

@miike
Author

miike commented Sep 6, 2017

Thanks @chuwy!

@chuwy chuwy added this to the Release 8 Stamp TBC milestone Dec 15, 2017
@oguzhanunlu oguzhanunlu removed this from the Release 8 Basel Dove milestone Dec 18, 2017
@oguzhanunlu
Member

Cherry-picked to #309, closing.

5 participants