[data] [streaming] Support async/thread-pool batch generation for actor pool map and iter_batches() #31911
Labels: data, enhancement, P1
Milestone: Ray 2.4
#31576 is only implemented in the old backend.
For the new backend, we should make sure to pipeline/asynchronously compute the batches within the actor workers in a separate thread.
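To make the idea concrete, here is a minimal sketch of computing batches in a separate thread while the consumer works on the previous batch. It is not Ray code: `prefetch_in_thread` and `max_prefetch` are hypothetical names chosen for illustration, and the bounded queue is just one assumed way to cap how far the producer runs ahead.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def prefetch_in_thread(batches: Iterable[T], max_prefetch: int = 2) -> Iterator[T]:
    """Yield items from `batches`, producing them in a background thread.

    Hypothetical helper: at most `max_prefetch` finished batches wait in the
    queue, so the producer cannot run arbitrarily far ahead of the consumer.
    """
    q: queue.Queue = queue.Queue(maxsize=max_prefetch)

    def producer() -> None:
        try:
            for batch in batches:
                q.put(("item", batch))
            q.put(("done", None))
        except BaseException as exc:  # forward errors to the consumer
            q.put(("error", exc))

    # Daemon thread: if the consumer stops early, a blocked producer
    # does not keep the process alive.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, payload = q.get()
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload
```

With this shape, the loop `for batch in prefetch_in_thread(make_batches()): fn(batch)` overlaps batch generation with the call to `fn`, where `make_batches` and `fn` stand in for whatever the actor worker actually runs.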
We should also enable this optimization for iter_batches(), and in particular use a thread-pool to accelerate batch conversions with additional parallelism.

cc @amogkam
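For the thread-pool part, a rough sketch of what ordered, bounded parallel batch conversion could look like. Again this is illustrative and not Ray's implementation: `convert_in_thread_pool`, `num_threads`, and `max_in_flight` are hypothetical names, and `convert` stands in for whatever format conversion is applied to each batch. Threads only add parallelism when the conversion releases the GIL, which is typically the case for NumPy- or Arrow-heavy work.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def convert_in_thread_pool(
    convert: Callable[[T], U],
    batches: Iterable[T],
    num_threads: int = 4,
    max_in_flight: int = 8,
) -> Iterator[U]:
    """Apply `convert` to each batch on a thread pool, yielding in input order.

    Hypothetical helper: at most `max_in_flight` conversions are submitted
    ahead of the consumer, which bounds memory held by finished batches.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        pending: deque = deque()
        for batch in batches:
            pending.append(pool.submit(convert, batch))
            if len(pending) >= max_in_flight:
                # Wait on the oldest future so output order matches input order.
                yield pending.popleft().result()
        # Drain the remaining conversions, still in submission order.
        while pending:
            yield pending.popleft().result()
```

Waiting on the oldest future keeps output order deterministic at the cost of some head-of-line blocking; an unordered variant would need `concurrent.futures.as_completed` instead.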