Java: Make ColumnVector.fromViewWithContiguousAllocation public #16784

jlowe · 2024-09-10T16:29:14Z

Description

Exposes ColumnVector's fromViewWithContiguousAllocation method so code outside of cudf that builds contiguous table views can expose those columns in Java.

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

ttnghia · 2024-09-10T17:07:39Z

java/src/main/java/ai/rapids/cudf/ColumnVector.java

+   * Creates a ColumnVector from a native column_view using a contiguous device allocation.
+   *
+   * @param columnViewAddress address of the native column_view
+   * @param buffer device buffer containing the data referenced by the column view
+   */
+  public static ColumnVector fromViewWithContiguousAllocation(long columnViewAddress, DeviceMemoryBuffer buffer) {


What role does a "contiguous allocation" play here? As far as I can see, this just creates a ColumnVector from a device buffer and nothing else, so I'm not sure what I'm missing.

Contiguous allocation means that all of the underlying buffers for this column view (i.e. validity, offsets, data, and recursively those of child columns) come from this single buffer. Normally these are allocated separately and thus do not come from a single buffer. The way contiguous tables work in Java is that we build up ColumnVector instances from views (zero-copy, so not like column vs. column_view in native) that reference the underlying buffer (i.e. via incRefCount). The buffer is then closed, decrementing the refcount, but it stays allocated because of all the columns still referencing it. When the last column referencing the buffer is finally closed, the buffer's refcount goes to zero and the buffer is freed.

So we need the buffer in this method for the reference semantics, and that buffer is the contiguous allocation holding this column (and probably others).
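The reference semantics described above can be illustrated with a small self-contained sketch. This is not cudf code: `SharedBuffer`, `ColumnHandle`, and `fromSharedBuffer` are hypothetical stand-ins for `DeviceMemoryBuffer`, `ColumnVector`, and `fromViewWithContiguousAllocation`, and the sketch only models the refcounting, assuming the creator of the buffer holds the first reference.

```java
import java.util.ArrayList;
import java.util.List;

public class ContiguousRefCountSketch {

  /** Hypothetical stand-in for a reference-counted device buffer. */
  static final class SharedBuffer implements AutoCloseable {
    private int refCount = 1;       // the creator holds the first reference
    private boolean freed = false;

    synchronized void incRefCount() {
      if (freed) throw new IllegalStateException("buffer already freed");
      refCount++;
    }

    /** Drops one reference; the memory is released only at zero. */
    @Override
    public synchronized void close() {
      if (freed) throw new IllegalStateException("buffer already freed");
      if (--refCount == 0) freed = true;
    }

    synchronized boolean isFreed() { return freed; }
    synchronized int getRefCount() { return refCount; }
  }

  /** Hypothetical stand-in for a column built zero-copy over a shared buffer. */
  static final class ColumnHandle implements AutoCloseable {
    private final SharedBuffer buffer;
    private boolean closed = false;

    private ColumnHandle(SharedBuffer buffer) { this.buffer = buffer; }

    /** Takes a reference on the buffer instead of copying any data. */
    static ColumnHandle fromSharedBuffer(SharedBuffer buffer) {
      buffer.incRefCount();
      return new ColumnHandle(buffer);
    }

    /** Releases this column's reference exactly once. */
    @Override
    public void close() {
      if (!closed) {
        closed = true;
        buffer.close();
      }
    }
  }

  public static void main(String[] args) {
    SharedBuffer buffer = new SharedBuffer();
    List<ColumnHandle> columns = new ArrayList<>();
    for (int i = 0; i < 3; i++) {
      columns.add(ColumnHandle.fromSharedBuffer(buffer));
    }

    // The creator drops its reference; the columns keep the buffer alive.
    buffer.close();
    System.out.println("freed after buffer.close(): " + buffer.isFreed());

    // Closing the last column drops the refcount to zero and frees the buffer.
    for (ColumnHandle c : columns) {
      c.close();
    }
    System.out.println("freed after last column closed: " + buffer.isFreed());
  }
}
```

In this model, closing the buffer right after building the columns is safe because each column took its own reference first, which mirrors the close ordering described in the comment above.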

ttnghia · 2024-09-16T22:04:43Z

/merge

Java: Make ColumnVector.fromViewWithContiguousAllocation public

2fe0438

jlowe added the Java (Affects Java cuDF API), Spark (Functionality that helps Spark RAPIDS), improvement (Improvement / enhancement to an existing function), and non-breaking (Non-breaking change) labels Sep 10, 2024

jlowe self-assigned this Sep 10, 2024

jlowe requested a review from a team as a code owner September 10, 2024 16:29

abellina approved these changes Sep 10, 2024


ttnghia reviewed Sep 10, 2024


jlowe mentioned this pull request Sep 10, 2024

Add HostTable interface to allow wielding of host tables in native code NVIDIA/spark-rapids-jni#2393

Open

rapids-bot bot merged commit 4033385 into rapidsai:branch-24.10 Sep 16, 2024
81 checks passed
