Current Behavior
Currently, reading a sparse tensor requires either converting it to a dense representation (e.g., Tensor or FixedShapeTensor) or relying on the default to_arrow implementation for physical struct arrays. There is no way to iterate directly over a Daft DataFrame without first converting its sparse columns to a dense representation. This feature would enable native casting to Python, decoupled from future changes to sparse datatypes.
Proposed Solution
The desired solution is to allow iteration over a Daft DataFrame containing sparse tensor columns without the need for conversion.
Alternative Approaches
An alternative is to use the to_arrow method for casting to Python by default for physical struct arrays. This assumes that the struct inner fields of these types are relevant for the end user.
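To make the trade-off concrete, here is a minimal plain-Python sketch of the two views of a sparse cell. It does not use Daft's API: the struct field names (`values`, `indices`, `shape`), the flat row-major indexing, and the helpers `to_dense` and `iter_rows` are all assumptions made for illustration, not the library's actual layout or functions.

```python
# Hypothetical physical struct for one sparse tensor cell. The field names
# "values", "indices", and "shape" are assumptions for illustration only.
sparse_cell = {
    "values": [3.0, 7.0],  # non-zero entries
    "indices": [1, 4],     # assumed flat (row-major) positions of those entries
    "shape": [2, 3],       # logical tensor shape
}


def to_dense(cell):
    """Expand a 2-D sparse cell into nested lists (the conversion the issue wants to avoid)."""
    rows, cols = cell["shape"]
    flat = [0.0] * (rows * cols)
    for idx, val in zip(cell["indices"], cell["values"]):
        flat[idx] = val
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def iter_rows(columns):
    """Yield one dict per row from a column-oriented mapping, leaving each cell untouched."""
    n_rows = len(next(iter(columns.values())))
    for i in range(n_rows):
        yield {name: col[i] for name, col in columns.items()}


table = {"id": [0], "tensor": [sparse_cell]}
rows = list(iter_rows(table))    # sparse cell is passed through as-is
dense = to_dense(rows[0]["tensor"])  # explicit, opt-in densification
```

In this sketch, iteration hands back the raw struct fields, which is roughly what the to_arrow fallback would expose. Whether those inner fields are meaningful to the end user is exactly the assumption this alternative depends on.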
Additional context
A PR implementing this feature is attached.
Addresses: #3009
This PR enables casting of SparseTensor and FixedShapeSparseTensor to
Python, allowing iteration over Daft DataFrames with sparse tensor
columns without converting to dense formats.