[Datasets] [Pandas Block] Implement PandasBlockAccessor in pandas-native ways #21296
Closed
2 tasks done
Labels
data
Ray Data-related issues
enhancement
Request for new feature and/or capability
P1
Issue that should be fixed within a few weeks
Milestone
Search before asking
Description
#20988 introduces Pandas block format support in Ray Datasets. However, some methods of PandasBlockAccessor are implemented by converting to and from the Arrow format, which may perform worse than a pandas-native implementation. We need to re-implement them.
Interfaces to be implemented:
sort_and_partition
combine
merge_sorted_blocks
aggregate_combined_blocks
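To illustrate what "pandas-native" could mean for two of the interfaces above, here is a minimal sketch that sorts and partitions a DataFrame, then merges sorted pieces, without any Arrow round trip. The function names, signatures, and the `key`/`boundaries` parameters are hypothetical and chosen for illustration only; they are not the actual PandasBlockAccessor API, and the sketch assumes a single sort column, ascending order, and boundaries given in ascending order.

```python
import numpy as np
import pandas as pd


def sort_and_partition_native(df, key, boundaries):
    """Hypothetical helper: sort `df` by `key`, then split it at `boundaries`.

    Returns len(boundaries) + 1 DataFrames; rows with key < boundaries[0]
    land in the first partition, and so on.
    """
    sorted_df = df.sort_values(by=key, kind="stable").reset_index(drop=True)
    # On the sorted column, searchsorted gives the row offset of each boundary.
    cuts = np.searchsorted(sorted_df[key].to_numpy(), boundaries, side="left")
    edges = [0, *cuts.tolist(), len(sorted_df)]
    return [sorted_df.iloc[lo:hi] for lo, hi in zip(edges[:-1], edges[1:])]


def merge_sorted_blocks_native(blocks, key):
    """Hypothetical helper: combine sorted DataFrames into one sorted DataFrame.

    This concatenates and re-sorts for simplicity; it is not a true k-way merge.
    """
    non_empty = [b for b in blocks if len(b) > 0]
    if not non_empty:
        return pd.DataFrame()
    merged = pd.concat(non_empty, ignore_index=True)
    return merged.sort_values(by=key, kind="stable").reset_index(drop=True)


df = pd.DataFrame({"k": [5, 1, 3, 9, 7], "v": list("abcde")})
parts = sort_and_partition_native(df, "k", [4, 8])
merged = merge_sorted_blocks_native(parts, "k")
```

Concatenating and re-sorting is the simplest pandas-only approach; whether it beats the Arrow-based path would need to be measured on real blocks.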
Use case
No response
Related issues
#20719
Are you willing to submit a PR?