Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-32240: [C#] Add new Apache.Arrow.Compression package to implement IPC decompression #33893
GH-32240: [C#] Add new Apache.Arrow.Compression package to implement IPC decompression #33893
Changes from 4 commits
c563062
0621d6a
72f1c16
6b5145e
16e944f
177dc43
48a9e9d
77f5feb
a80c855
fccfbe0
6798721
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
CommunityToolkit.HighPerformance
dependency is only needed for theAsStream
extension methods onMemory<byte>
andReadOnlyMemory<byte>
which could be re-implemented if we wanted to minimize extra dependencies.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that the LZ4 library offers APIs that takes Spans. We should just use those APIs instead, and then we can get rid of this dependency.
https://github.com/MiloszKrajewski/K4os.Compression.LZ4/blob/fa8b8e038b500d565efe12769db097852a28ddf7/src/K4os.Compression.LZ4/LZ4Codec.cs#L139-L144
Note that there are 2 LZ4 libraries - one for streams (which this pr is using now) and one that doesn't use streams - https://www.nuget.org/packages/K4os.Compression.LZ4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two libraries actually implement different formats. The
K4os.Compression.LZ4.Streams
library implements the LZ4 frame format, and theK4os.Compression.LZ4
library implements the LZ4 block format. Arrow IPC uses the frame format (https://github.com/apache/arrow/blob/apache-arrow-11.0.0/format/Message.fbs#L46-L48), so we need to use the streams library.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened MiloszKrajewski/K4os.Compression.LZ4#79 to request this is added to the
K4os.Compression.LZ4
library.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Eric, your issue prompted me to dig more into the
K4os
library though and I realised that since I initially started working on this, a new API has been added that allows using the frame format with more types than justStream
, so I've switched to this and removed theCommunityToolkit.HighPerformance
dependency.