forked from apache/arrow
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
apacheGH-37829: [Java] Avoid resizing data buffer twice when appendin…
…g variable length vectors (apache#37844) ### Rationale for this change This change prevents avoidable `OversizedAllocationException`s when appending a variable-length vector with many small elements to a variable-length vector with a few large elements. When appending variable-length vectors, `VectorAppender` iteratively doubles the offset and validity buffers until they can accommodate the combined elements. In the previous implementation, each iteration would also double the data buffer's capacity. This behavior is appropriate for vectors of fixed-size types but can result in an oversized data buffers when appending many small elements to a variable length vector with a large data buffer. ### What changes are included in this PR? The new behavior only resizes the offset and validity buffers when resizing the target vector's buffers to ensure they can hold the total number of combined elements. The data buffer is resized based on the total required data size of the combined elements. ### Are these changes tested? Yes. I added a unit test that results in an `OversizedAllocationException` when run against the previous version of the code. ### Are there any user-facing changes? No. * Closes: apache#37829 Authored-by: hrishisd <[email protected]> Signed-off-by: David Li <[email protected]>
- Loading branch information
1 parent
35940fc
commit 9bb7d7d
Showing
3 changed files
with
29 additions
and
4 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters