[JavaScript] Allow index access for `Table`, `RecordBatch`, and `Vector` #34936

shuhaowu · 2023-04-06T16:29:32Z

Describe the enhancement requested

Codebases that previously accessed data in a row-oriented way may wish to migrate to Arrow to save serialization and deserialization costs and to gain access to fast column-oriented operations. As it stands, Arrow is sort of a drop-in replacement for row-oriented data such as a JavaScript Array of objects. This is great for incrementally migrating legacy codebases to Arrow, as it is frequently infeasible to rewrite an application to use column-oriented data access patterns. For most data, JavaScript-object-compatible, row-oriented access is already provided via the StructRowProxy. However, if the structs themselves include a Vector, existing code will break, because it assumes the Vector object behaves like a JavaScript array, which it does not due to the lack of index access. An example of such a data structure is as follows:

[
  {x: 1, y: [1, 2]},
  {x: 2, y: [2, 3]},
]

In this case, with the Arrow JS library as it is, the API consumer is unable to get an individual element of the y array via table[i].y[j]. Instead, the API consumer must use table.get(i).y.get(j). When migrating a legacy codebase to Arrow, this requires a large refactor of the entire codebase, which is infeasible in a short time. This negates the advantage of using Arrow as a drop-in replacement and prevents incremental migration of code to Arrow.

Arrow should provide index access for Vector, as well as for Table and RecordBatch, for backward compatibility with JavaScript arrays.
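
To make the request concrete, here is a minimal sketch of how index access could be layered on top of a get(i)-style API using a JavaScript Proxy. It is an illustration only, not the Arrow implementation: `FakeVector` and `withIndexAccess` are hypothetical names, and `FakeVector` is a stand-in that assumes nothing about Arrow beyond a `length` property and a `get(i)` method.

```javascript
// Hypothetical stand-in for an Arrow-like container (NOT the real Arrow Vector):
// it exposes only `length` and `get(i)`, with no bracket indexing.
class FakeVector {
  constructor(values) {
    this.values = values;
    this.length = values.length;
  }
  get(i) {
    return this.values[i];
  }
}

// Hypothetical helper: wrap any object that has `get(i)` so that
// `wrapped[i]` forwards to `wrapped.get(i)`.
function withIndexAccess(target) {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      // Property keys arrive as strings or symbols; only canonical
      // non-negative integer strings are treated as indices.
      if (typeof prop === 'string' && /^(0|[1-9]\d*)$/.test(prop)) {
        return obj.get(Number(prop));
      }
      // Everything else (length, get, ...) behaves as before.
      return Reflect.get(obj, prop, receiver);
    },
  });
}

// Same shape as the example data above: rows with a nested list column `y`.
const rows = withIndexAccess(new FakeVector([
  { x: 1, y: withIndexAccess(new FakeVector([1, 2])) },
  { x: 2, y: withIndexAccess(new FakeVector([2, 3])) },
]));

console.log(rows[1].y[0]);         // array-style access now works
console.log(rows.get(1).y.get(0)); // the existing get(i) API still works
```

One trade-off of this approach is that every property read goes through the Proxy trap, so code on a hot path may still prefer calling get(i) directly.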

Component(s)

JavaScript

shuhaowu added the Type: enhancement label Apr 6, 2023

github-actions bot added the Component: JavaScript label Apr 6, 2023

github-actions bot linked a pull request Apr 6, 2023 that will close this issue

GH-34936: [JavaScript] Added Proxy for Table, RecordBatch, and Vector #34939

Draft

2 tasks

github-actions bot assigned shuhaowu Apr 6, 2023
