[AIR] Execute GPU inference in a separate stage in BatchPredictor #26616
Conversation
@@ -231,6 +253,9 @@ def predict_pipelined(
            max_scoring_workers: If set, specify the maximum number of scoring actors.
            num_cpus_per_worker: Number of CPUs to allocate per scoring worker.
            num_gpus_per_worker: Number of GPUs to allocate per scoring worker.
            separate_gpu_stage: If using GPUs, specifies whether to execute GPU
Is there a case when we would not want this enabled?
If the preprocessor is very lightweight, then enabling this could hurt more than it helps, due to excess disk spilling (in non-pipelined mode).
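For context, a hedged usage sketch of the new flag (the checkpoint, predictor class, and dataset below are placeholders, not from this PR): with a heavyweight preprocessor you would leave the flag at its default, and disable it only when preprocessing is trivial.

```python
from ray.train.batch_predictor import BatchPredictor
from ray.train.torch import TorchPredictor

# Placeholders: any AIR checkpoint with an attached preprocessor and any
# ray.data.Dataset would do here.
batch_predictor = BatchPredictor.from_checkpoint(checkpoint, TorchPredictor)

results = batch_predictor.predict_pipelined(
    dataset,
    bytes_per_window=1024**3,
    num_gpus_per_worker=1,
    # Assumed flag from this PR: disable the separate stage when the
    # preprocessor is very lightweight, to avoid the extra inter-stage
    # materialization discussed above.
    separate_gpu_stage=False,
)
```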
if num_gpus_per_worker is None:
    num_gpus_per_worker = 0
if num_cpus_per_worker is None:
    if num_gpus_per_worker > 0:
Should we be doing this across all of our library code? 😅
Yeah, it's probably a good idea.
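As a standalone sketch of the defaulting pattern in the snippet above (the helper name and the 1-CPU fallback are assumptions for illustration, not the actual library defaults):

```python
from typing import Optional, Tuple

def resolve_worker_resources(
    num_cpus_per_worker: Optional[int],
    num_gpus_per_worker: Optional[int],
) -> Tuple[int, int]:
    # Hypothetical helper mirroring the diff quoted above.
    if num_gpus_per_worker is None:
        num_gpus_per_worker = 0
    if num_cpus_per_worker is None:
        if num_gpus_per_worker > 0:
            # GPU workers: request 0 CPUs so the GPU is the only
            # scheduling constraint and CPUs aren't double-reserved.
            num_cpus_per_worker = 0
        else:
            # CPU-only workers: assume a default of one CPU each.
            num_cpus_per_worker = 1
    return num_cpus_per_worker, num_gpus_per_worker
```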
if separate_gpu_stage and num_gpus_per_worker > 0:
    preprocessor = self.get_preprocessor()
    if preprocessor:
        override_prep = BatchMapper(lambda x: x)
Doesn't seem like override_prep is being used anywhere?
Oh, that's weird that lint didn't catch it.
Oh, it's actually a local variable from the previous PR.
I think there might be a set_preprocessor line missing to disable the preprocessor for the predictor.
It's actually correct; I modified the unit test. If you remove this line, the test will fail. I also added a clarifying comment.
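A minimal sketch of what this path appears to do, assuming standard Ray AIR/Datasets semantics (the surrounding names like self, dataset, and the flags are contextual assumptions paraphrased from the diff above, not the exact source):

```python
from ray.data.preprocessors import BatchMapper

# Sketch: `self` is the BatchPredictor and `dataset` a ray.data.Dataset.
if separate_gpu_stage and num_gpus_per_worker > 0:
    preprocessor = self.get_preprocessor()
    if preprocessor:
        # Run the real preprocessor as its own CPU stage up front...
        dataset = preprocessor.transform(dataset)
        # ...then hand the predictor an identity mapper so the GPU stage
        # does inference only, without re-running preprocessing. This is
        # the line the unit test exercises: without it, the preprocessor
        # would run a second time inside the GPU actors.
        override_prep = BatchMapper(lambda batch: batch)
```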
Why are these changes needed?
This PR is stacked on #26600