util/parquet: add compression options #102978

Merged
merged 1 commit into from
May 10, 2023

@jayshrivastava jayshrivastava commented May 9, 2023

This change updates the parquet writer to be able to use
GZIP, ZSTD, SNAPPY, and BROTLI compression codecs. By
default, no compression is used. LZO and LZ4 are unsupported
by the library.

Epic: https://cockroachlabs.atlassian.net/browse/CRDB-15071
Informs: #99028
Release note: None
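
For readers skimming the thread, here is a minimal sketch of the kind of lookup the description implies: a writer-level codec option whose zero value means "no compression", mapped onto a library codec, with an error for anything unmapped. `libraryCodec`, `codecToLibrary`, `lookupCodec`, `CompressionSnappy`, and `CompressionBrotli` are hypothetical names used only for illustration; the actual change maps onto `compress.Compression` values, and its option names may differ.

```go
package main

import "fmt"

// CompressionCodec stands in for the writer-level option type.
type CompressionCodec int

const (
	// CompressionNone is the zero value, so an unset option means no compression.
	CompressionNone CompressionCodec = iota
	CompressionGZIP
	CompressionZSTD
	CompressionSnappy // assumed name, not taken from the PR
	CompressionBrotli // assumed name, not taken from the PR
)

// libraryCodec is a hypothetical placeholder for the library's codec type.
type libraryCodec string

// codecToLibrary lists only the codecs the description says are usable.
// LZO and LZ4 are deliberately absent.
var codecToLibrary = map[CompressionCodec]libraryCodec{
	CompressionNone:   "UNCOMPRESSED",
	CompressionGZIP:   "GZIP",
	CompressionZSTD:   "ZSTD",
	CompressionSnappy: "SNAPPY",
	CompressionBrotli: "BROTLI",
}

// lookupCodec returns the library codec for c, or an error if c is unmapped.
func lookupCodec(c CompressionCodec) (libraryCodec, error) {
	lc, ok := codecToLibrary[c]
	if !ok {
		return "", fmt.Errorf("unsupported compression codec: %d", int(c))
	}
	return lc, nil
}

func main() {
	var unset CompressionCodec // zero value, i.e. CompressionNone
	for _, c := range []CompressionCodec{unset, CompressionZSTD, CompressionCodec(42)} {
		lc, err := lookupCodec(c)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(lc)
	}
}
```

Making the "none" option the zero value is one way to get the default-off behavior without extra configuration; whether the PR does it this way is not shown in this thread.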

@jayshrivastava jayshrivastava marked this pull request as ready for review May 9, 2023 20:49
var compressionCodecToParquet = map[CompressionCodec]compress.Compression{
	CompressionNone: compress.Codecs.Uncompressed,
	CompressionGZIP: compress.Codecs.Gzip,
	CompressionZSTD: compress.Codecs.Zstd,

Are these the only two codecs Parquet supports? No Snappy? No LZ4?

	Uncompressed: Compression(parquet.CompressionCodec_UNCOMPRESSED),
	Snappy:       Compression(parquet.CompressionCodec_SNAPPY),
	Gzip:         Compression(parquet.CompressionCodec_GZIP),
	Lzo:          Compression(parquet.CompressionCodec_LZO),
	Brotli:       Compression(parquet.CompressionCodec_BROTLI),
	Lz4:          Compression(parquet.CompressionCodec_LZ4),
	Zstd:         Compression(parquet.CompressionCodec_ZSTD),

@jayshrivastava

bors r=miretskiy

craig bot commented May 10, 2023

Build succeeded:

@craig craig bot merged commit a833450 into cockroachdb:master May 10, 2023