
[Data] Async batch fetching for map_batches #31576

Merged
merged 23 commits into ray-project:master from the async-batch-fetching branch on Jan 21, 2023

Conversation

@amogkam amogkam commented Jan 10, 2023

Signed-off-by: Amog Kamsetty [email protected]

Implements batch fetching in a separate thread for GPU UDFs in map_batches. This allows CPU-based batch fetching to be overlapped with the UDF computation.

prefetch_batches is added as an argument to map_batches. By default, it is set to 0.

We do not add it to DatasetContext because this functionality needs to be configured for each map_batches call independently, not globally for the entire dataset: a Dataset workflow might contain some transformations that run on GPU and others that run on CPU.
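As a rough illustration, a call might look like the following. This is a hedged sketch, not code from the PR: the UDF, dataset, batch size, and actor pool bounds are all made up, and only the prefetch_batches argument is what this PR adds.

import ray
from ray.data import ActorPoolStrategy

class Predictor:
    # Hypothetical GPU UDF; in practice this would run a model on the batch.
    def __call__(self, batch):
        return batch

ds = ray.data.range(10_000)
predictions = ds.map_batches(
    Predictor,
    batch_size=256,
    compute=ActorPoolStrategy(2, 4),
    num_gpus=1,
    # Fetch the next batch on a background thread while the UDF runs on the
    # current one (0, the default, disables prefetching).
    prefetch_batches=1,
)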

We see GPU prediction throughput increase from ~260 images/sec to ~300 images/sec:

No prefetching:

Total images 16232
Times for each stage:  {'read': 15.336565732955933, 'preprocess': 6.303653955459595, 'predict': 62.256098985672}
Throughput for each stage:  {'read': '1058.3855787948612 (img/sec)', 'preprocess': '2575.0144463341717 (img/sec)', 'predict': '260.72947493442746 (img/sec)'}
Total time:  83.89631867408752
Throughput 193.47690407080358 (img/sec)

With prefetching:

Total images 16232
Times for each stage:  {'read': 16.441548347473145, 'preprocess': 5.674700975418091, 'predict': 54.01595449447632}
Throughput for each stage:  {'read': '987.2549505043818 (img/sec)', 'preprocess': '2860.415036900528 (img/sec)', 'predict': '300.5038076603809 (img/sec)'}
Total time:  76.13220381736755
Throughput 213.20806683776962 (img/sec)

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: amogkam <[email protected]>
Signed-off-by: amogkam <[email protected]>
Signed-off-by: amogkam <[email protected]>
block_bundles = [((b,), (m,)) for b, m in blocks_in]
block_bundles: List[
Tuple[Tuple[ObjectRef[Block]], Tuple[BlockMetadata]]
] = [((b,), (m,)) for b, m in blocks_in]
amogkam (PR author) commented:

This change is necessary to get the performance improvements for batch prediction.

Before, we would only bundle blocks up to batch size and submit each bundle as a separate actor task. This means we cannot do prefetching when batch size is greater than block size since each bundle is a separate task.

Instead, if the max actor pool size is set, then we bundle up to min(batch size, max actor pool size).

amogkam (PR author) commented:

Hopefully, once we switch to a fully iterator-based implementation, these types of special cases will no longer be necessary.

Signed-off-by: amogkam <[email protected]>
Resolved review threads on python/ray/data/_internal/block_batching.py.
# always be less than this max_size.
# Otherwise, it leads to inefficiencies with creating extra actor tasks and
# prevents the actor task from doing optimizations such as batch or block prefetching.
if self.max_size and len(block_bundles) > self.max_size:
A contributor commented:

This code will become deprecated with new executor backend, cc @clarkzinzow.

@@ -121,6 +123,96 @@ def test_format_batches(batch_format):
assert isinstance(batch["foo"], np.ndarray)


def test_async_batch_fetching():
A contributor commented:
shall we add a test for map_batches as well?

amogkam (PR author) replied:
I tried it but there's too much time variance for a deterministic small-scale map_batches CI test. I'll confirm the performance improvements via running the batch inference release tests.

Resolved review thread on python/ray/train/batch_predictor.py.
Signed-off-by: amogkam <[email protected]>
Signed-off-by: amogkam <[email protected]>
Signed-off-by: amogkam <[email protected]>
@amogkam amogkam requested a review from c21 January 11, 2023 21:21
c21 commented Jan 11, 2023

LGTM except one comment: #31576 (comment). cc @clarkzinzow.

@clarkzinzow clarkzinzow (Contributor) left a review:

LGTM overall, one big thing that we need to resolve is that the actor pool rebundling will break block ordering, which I don't think we'll want to do.

Resolved review threads on python/ray/data/_internal/block_batching.py and python/ray/data/_internal/compute.py.
if self.max_size and len(block_bundles) > self.max_size:

def chunkify(bundles: List, num_chunks: int):
    # Stratified slicing: chunk i takes every num_chunks-th bundle, starting at i.
    return [bundles[i::num_chunks] for i in range(num_chunks)]
@clarkzinzow clarkzinzow commented on Jan 13, 2023:
Just to make sure that I understand the motivation: this is giving us stratified chunking, where a given chunk consists of an equal number of blocks from each of the original bundles (modulo the number of chunks), right? Might be worth leaving a comment to that effect for those who are less familiar with this pattern.

Two potential issues with this chunking scheme:

  1. This breaks block and therefore row ordering; the previous block bundling and actor compute strategy made sure to preserve it. This doesn't matter for batch prediction workloads but may matter for other workloads that use the actor compute strategy.
  2. There are pathological cases of skewed blocks/bundles that could pop up. E.g. suppose we had bundles = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] (pretend the numbers are block IDs) and num_chunks = 2, and suppose that blocks with odd IDs are much larger than blocks with even IDs; this chunking would produce bundles [[1, 3, 5, 7, 9], [2, 4, 6, 8]], where the first bundle is way, way larger than the second bundle (see the small illustration after this list).
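As a small, self-contained illustration of the interleaving described in (2), treating the nine block IDs as individual single-block bundles (the numbers are made up):

def chunkify(bundles, num_chunks):
    return [bundles[i::num_chunks] for i in range(num_chunks)]

print(chunkify([1, 2, 3, 4, 5, 6, 7, 8, 9], 2))
# -> [[1, 3, 5, 7, 9], [2, 4, 6, 8]]  -- block order is interleaved, not preserved.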

Could solve (1) by changing the rechunking to merge adjacent chunks without breaking ordering, but (2) would require rebundling while taking the block sizes into account. I think that (1) is probably a blocker but (2) is not, cc @matthewdeng @c21 for more opinions.

If we are only wanting to solve (1) for now, we could do the simple thing of merging adjacent bundles until we either (1) are at the specified number of chunks (max pool size), or (2) all would-be merged bundles exceed the max target block size threshold (currently 512 MiB by default).

Could do something like the following progressive merging of adjacent bundles, which should preserve block/row order:

from ray.data.context import DatasetContext

def rebundle_to_size(bundles: list, num_bundles: int):
    if len(bundles) <= num_bundles:
        # Already done.
        return bundles
    max_bundle_size = DatasetContext.get_current().target_max_block_size
    # Carry out multiple rounds of merging adjacent bundles, until we have scaled down
    # to num_bundles bundles, or we've stopped making merging progress.
    while len(bundles) > num_bundles:
        new_bundles = []
        num_merges = 0
        for i in range(len(bundles) // 2):
            left, right = bundles[2 * i], bundles[2 * i + 1]
            left_size = sum(meta.size_bytes for _, meta in left)
            right_size = sum(meta.size_bytes for _, meta in right)
            if left_size + right_size <= max_bundle_size:
                # Merging these bundles stays under the max bundle size, so we merge them.
                new_bundles.append(left + right)
                num_merges += 1
                if len(bundles) - num_merges == num_bundles:
                    # This merging round has already brought us to the requisite number
                    # of bundles, so we short-circuit.
                    break
            else:
                new_bundles.extend([left, right])
        if num_merges == 0:
            break
        # Add leftover bundles (due to an odd number of bundles or short-circuiting).
        for j in range(2 * i + 2, len(bundles)):
            new_bundles.append(bundles[j])
        bundles = new_bundles
    return bundles
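A quick way to sanity-check the sketch above, using stand-in blocks and a minimal metadata object carrying only size_bytes rather than real ObjectRef[Block] / BlockMetadata values (the names and sizes are made up):

from collections import namedtuple

Meta = namedtuple("Meta", ["size_bytes"])
# Nine single-block bundles, each a list of (block, metadata) pairs.
bundles = [[(f"block_{i}", Meta(size_bytes=1))] for i in range(9)]
merged = rebundle_to_size(bundles, num_bundles=2)
# Adjacent bundles are merged round by round, so the blocks come out in their
# original order: block_0, block_1, ..., block_8.
assert [b for bundle in merged for b, _ in bundle] == [f"block_{i}" for i in range(9)]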

Resolved review thread on python/ray/data/_internal/block_batching.py.
# always be less than this max_size.
# Otherwise, it leads to inefficiencies with creating extra actor tasks and
# prevents the actor task from doing optimizations
# such as batch or block prefetching.
@clarkzinzow clarkzinzow commented on Jan 14, 2023:

A more orthogonal, future-looking thought: a target bundle size that might serve better than the user-provided batch_size is probably something like the following:

target_size = min(
    max((prefetch_batches + 1) * batch_size_in_bytes, ctx.target_min_block_size),
    ctx.target_max_block_size,
)

I.e., we bundle up to at least ctx.target_min_block_size (default is 1 MiB), since that's what we consider to be the smallest "reasonable" block to make the task overhead worth it; if the data needed for the desired number of concurrent batches is larger than this (e.g. the batch size is larger than this and/or aggressive prefetching is specified), then we use that as the bundling target. And all of this is capped by ctx.target_max_block_size.
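A hedged worked example of the formula above, with made-up inputs (only the two context defaults, 1 MiB and 512 MiB, come from the comments themselves):

target_min_block_size = 1 * 1024**2      # 1 MiB (stated default)
target_max_block_size = 512 * 1024**2    # 512 MiB (stated default)
prefetch_batches = 1                     # hypothetical
batch_size_in_bytes = 4 * 1024**2        # hypothetical: 4 MiB per batch

target_size = min(
    max((prefetch_batches + 1) * batch_size_in_bytes, target_min_block_size),
    target_max_block_size,
)
# -> 8 MiB: enough data for two concurrent batches, above the 1 MiB floor and
#    well under the 512 MiB cap.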

We'd probably still have the max actor pool size serve as a cap on the number of block bundles as well, but I'd imagine that the initial actor pool size (i.e. actors started at the beginning of execution) and the scale-up rate would be influenced by the number of block bundles.

We should experiment with a few of these hints/policies in the new execution model, and try to ensure good performance with the default configuration. cc @ericl @c21

Signed-off-by: amogkam <[email protected]>
c21 commented Jan 19, 2023

@amogkam - can you rebase to latest master? It should fix the CI failures.

Signed-off-by: amogkam <[email protected]>
Signed-off-by: amogkam <[email protected]>
Signed-off-by: amogkam <[email protected]>
@@ -480,6 +481,9 @@ def map_batches(
``pandas.DataFrame``, "pyarrow" to select ``pyarrow.Table``, or
``"numpy"`` to select ``numpy.ndarray`` for tensor datasets and
``Dict[str, numpy.ndarray]`` for tabular datasets. Default is "default".
prefetch_batches: The number of batches to fetch ahead of the current batch
A contributor commented:

When porting this to the new executor, we should try to consolidate prefetch_batches and prefetch_blocks into a single prefetch_batches argument, where we always prefetch enough blocks to satisfy prefetch_batches, which should be simple enough to implement since we have the size for each to-be-fetched block on hand.
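A rough sketch of how a batch-level prefetch budget could be translated into a block-level one when per-block row counts are known. This is not the PR's implementation; the function name, signature, and numbers are hypothetical.

from typing import List

def num_blocks_to_prefetch(block_num_rows: List[int], batch_size: int, prefetch_batches: int) -> int:
    # Walk the upcoming blocks until they cover prefetch_batches worth of rows.
    rows_needed = batch_size * prefetch_batches
    rows_covered = 0
    for i, num_rows in enumerate(block_num_rows):
        if rows_covered >= rows_needed:
            return i
        rows_covered += num_rows
    return len(block_num_rows)

# e.g. with five blocks of 100 rows each, batch_size=256, prefetch_batches=1:
print(num_blocks_to_prefetch([100] * 5, 256, 1))  # -> 3 blocks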

amogkam (PR author) replied:

yep +1!

Resolved review thread on python/ray/data/_internal/block_batching.py.
block_bundles = _bundle_blocks_up_to_size(
blocks_in, target_block_size, name
)
total_size = sum(metadata.num_rows for _, metadata in blocks_in)
A contributor commented:

metadata.num_rows could technically be None, but shouldn't happen in practice.

amogkam (PR author) replied:

updated to handle None the same way as _bundle_blocks_up_to_size

Resolved review threads on python/ray/data/_internal/compute.py.
Signed-off-by: amogkam <[email protected]>
Signed-off-by: amogkam <[email protected]>
Signed-off-by: amogkam <[email protected]>
@clarkzinzow clarkzinzow (Contributor) left a comment:

LGTM! Nice work! 👏

Signed-off-by: amogkam <[email protected]>
@amogkam amogkam merged commit 789232e into ray-project:master Jan 21, 2023
@amogkam amogkam deleted the async-batch-fetching branch January 21, 2023 00:40
andreapiso pushed a commit to andreapiso/ray that referenced this pull request Jan 22, 2023
Signed-off-by: Amog Kamsetty [email protected]

Implements batch fetching in a separate thread for GPU UDFs in map_batches. This allows CPU-based batch fetching to be overlapped with the UDF computation.

prefetch_batches is added as an argument to map_batches. By default, it is set to 0.

We do not add it to DatasetContext because this functionality needs to be configured for each map_batches call independently, not globally for the entire dataset: a Dataset workflow might contain some transformations that run on GPU and others that run on CPU.

We see GPU prediction throughput increase from ~260 images/sec to ~300 images/sec.

Signed-off-by: Andrea Pisoni <[email protected]>