Cast SparseTensor and FixedShapeSparseTensor to Python #3009

Closed
sagiahrac opened this issue Oct 7, 2024 · 1 comment

@sagiahrac
Contributor

sagiahrac commented Oct 7, 2024

Current Behavior
Currently, reading a sparse tensor requires either converting it to a dense representation (e.g., Tensor or FixedShapeTensor) or falling back to the default to_arrow implementation for physical struct arrays. There is no way to iterate directly over a Daft DataFrame without first converting its sparse columns to a dense representation. This feature would enable native casting to Python, decoupled from future changes to the sparse datatypes.

Proposed Solution
The desired solution is to allow iteration over a Daft DataFrame containing sparse tensor columns without first converting those columns to a dense representation.

Alternative Approaches
An alternative is to rely on the default to_arrow method, which casts physical struct arrays to Python as-is. This assumes that the inner struct fields of these types are meaningful to the end user.
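To make the "struct inner fields" concrete, here is a minimal pure-Python sketch of what a sparse tensor row might look like when exposed as a struct. It assumes the struct carries `values`, `indices` (positions in the row-major flattened tensor), and `shape`; those field names and the helpers `to_sparse` and `to_dense` are assumptions for illustration, not Daft's API.

```python
from math import prod
from typing import Dict, List


def to_sparse(dense_flat: List[float], shape: List[int]) -> Dict[str, List]:
    """Keep only the non-zero entries of a row-major flattened tensor.

    Hypothetical helper: the field names below are assumed, not taken from Daft.
    """
    if len(dense_flat) != prod(shape):
        raise ValueError("flattened length does not match shape")
    indices = [i for i, v in enumerate(dense_flat) if v != 0]
    values = [dense_flat[i] for i in indices]
    return {"values": values, "indices": indices, "shape": list(shape)}


def to_dense(sparse: Dict[str, List]) -> List[float]:
    """Rebuild the flattened dense tensor from the sparse struct."""
    dense = [0] * prod(sparse["shape"])
    for i, v in zip(sparse["indices"], sparse["values"]):
        dense[i] = v
    return dense


# Two "rows" of a sparse column; the second one is all zeros.
rows = [
    to_sparse([0, 0, 3, 0, 5, 0], [2, 3]),
    to_sparse([0, 0, 0, 0], [2, 2]),
]

# Iterating over the rows reads the struct fields directly,
# with no dense tensor materialized along the way.
for row in rows:
    print(row["indices"], row["values"], row["shape"])
```

Whether end users should see these fields directly, or a dedicated Python object instead, is exactly the trade-off between this alternative and the proposed solution.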

Additional context
A PR implementing this feature is attached.

colin-ho pushed a commit that referenced this issue Oct 8, 2024
Addresses: #3009

This PR enables casting of SparseTensor and FixedShapeSparseTensor to
Python, allowing iteration over Daft DataFrames with sparse tensor
columns without converting to dense formats.
@colin-ho
Contributor

colin-ho commented Oct 9, 2024

Closed with #3010

@colin-ho colin-ho closed this as completed Oct 9, 2024