-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[C#] Add LargeBinary, LargeString and LargeList array types #43266
Comments
take |
This was referenced Jul 15, 2024
CurtHagenlocher
pushed a commit
that referenced
this issue
Jul 19, 2024
…#43269) ### Rationale for this change See #43266. Note that LargeBinary and LargeString are still limited to 2 GiB buffers, and LargeList is limited to offsets that can be represented as int32. ### What changes are included in this PR? * Add new Array subtypes: LargeBinaryArray, LargeStringArray and LargeListArray * Support round-tripping these array types via the IPC format * Support round-tripping these array types via the C Data Interface * Improve error messages when importing arrays that are too large via IPC or C Data Interface * Enable integration tests for the new types * Update documentation ### Are these changes tested? Yes, I've added some basic tests specifically for the new array types, and added these to the test data generator so they're covered by the existing tests for round tripping using IPC and C Data Interface. ### Are there any user-facing changes? Yes, this is a new user facing feature. ### Implementation notes * I haven't added builders for these new array types. Given they're added to help with interoperability with other libraries, I wouldn't expect .NET users to build arrays of these types as they provide no other benefit over the non-large types until we have proper large memory support. But I'm happy to add this if it would be useful. * The new array types share a lot of logic with the non-large types. I considered trying to consolidate this logic by adding a new `BinaryArrayBase<TOffset>` class for example, but I think this would require generic math support to work nicely, and would still complicate the code quite a bit and add extra virtual method call overhead. So I think it's fine to keep these new Array subtypes independent from the non-large types. * I haven't included support for materializing a LargeStringArray (see #41048). I'm not sure whether there would be a use for this, but it could be added later if needed. * GitHub Issue: #43266 Authored-by: Adam Reeve <[email protected]> Signed-off-by: Curt Hagenlocher <[email protected]>
Issue resolved by pull request 43269 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the enhancement requested
This is a subtask of #34736
These arrays will not be able to support value buffers > 2 GB until #23776 is addressed, but there's some value in supporting these types anyway, to simplify integration with other Arrow implementations.
See #34736 (comment) for some more context.
Component(s)
C#
The text was updated successfully, but these errors were encountered: