[RLlib] 2 fixes for 2.10 release: 1) DDP + GPU fix; 2) py3.8 fix for uniting two dicts #44000

sven1977 · 2024-03-14T13:16:49Z

2 fixes for 2.10 release: 1) DDP + GPU fix; 2) py3.8 fix for uniting two dicts

Why are these changes needed?

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: sven1977 <[email protected]>

sven1977 · 2024-03-14T13:23:54Z

rllib/algorithms/algorithm.py

@@ -871,7 +871,7 @@ def step(self) -> ResultDict:
        results = self._compile_iteration_results(
            episodes_this_iter=episodes_this_iter,
            step_ctx=train_iter_ctx,
-            iteration_results=train_results | eval_results,
+            iteration_results={**train_results, **eval_results},


Python 3.8 does not support dict | dict; the dict union operator was only added in Python 3.9 (PEP 584).
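For reference, a minimal sketch of the two merge spellings touched by this hunk. The helper name `merge_results` and the result keys are hypothetical and only used for illustration; they are not taken from `algorithm.py`.

```python
import sys


def merge_results(train_results: dict, eval_results: dict) -> dict:
    """Return a new dict; on duplicate keys the eval_results value wins."""
    # Dict unpacking is available on Python 3.5+, whereas `a | b` on dicts
    # requires Python 3.9+ (PEP 584) and raises TypeError on 3.8.
    return {**train_results, **eval_results}


# Hypothetical result dicts, for illustration only.
train_results = {"info": {"loss": 0.1}, "num_steps": 100}
eval_results = {"evaluation": {"return_mean": 1.5}, "num_steps": 120}

merged = merge_results(train_results, eval_results)

# On 3.9+ the union operator yields the same mapping, so the change
# should be behavior-preserving there.
if sys.version_info >= (3, 9):
    assert merged == (train_results | eval_results)
```

Both forms build a new dict and leave the inputs untouched, and both resolve key conflicts in favor of the right-hand operand.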

sven1977 · 2024-03-14T13:24:17Z

rllib/core/rl_module/torch/torch_rl_module.py

@@ -183,6 +183,13 @@ def _save_module_metadata(self, *args, **kwargs):
    def _module_metadata(self, *args, **kwargs):
        return self.unwrapped()._module_metadata(*args, **kwargs)

+    # TODO (sven): Figure out a better way to avoid having to method-spam this wrapper


We need to find a better solution for this, but this fixes PPO + multi-GPU on the new stack for now.

simonsays1980

LGTM

simonsays1980 · 2024-03-14T13:33:03Z

rllib/algorithms/algorithm.py

@@ -871,7 +871,7 @@ def step(self) -> ResultDict:
        results = self._compile_iteration_results(
            episodes_this_iter=episodes_this_iter,
            step_ctx=train_iter_ctx,
-            iteration_results=train_results | eval_results,
+            iteration_results={**train_results, **eval_results},


Isn't py3.8 deprecated by the next Ray release version?

simonsays1980 · 2024-03-14T13:41:41Z

rllib/core/rl_module/torch/torch_rl_module.py

+    #  class, whenever we add a new API to any wrapped RLModule here. We could try
+    #  auto generating the wrapper methods, but this will bring its own challenge
+    #  (e.g. recursive calls due to __getattr__ checks, etc..).
+    def _compute_values(self, *args, **kwargs):


Btw: Can't we also compile this method?
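As a rough illustration of the trade-off described in the TODO, here is a framework-free sketch. All class names are hypothetical stand-ins and do not use torch or RLlib. It contrasts forwarding each method by hand with a generic `__getattr__` fallback, and shows one way such a fallback can recurse; whether this is the exact failure mode the TODO has in mind is an assumption.

```python
class InnerModule:
    """Hypothetical stand-in for the wrapped RLModule."""

    def _compute_values(self, batch):
        return [2 * x for x in batch]


class ExplicitWrapper:
    """Hypothetical stand-in for a DDP-style wrapper: one forwarder per API."""

    def __init__(self, module):
        self.module = module

    def unwrapped(self):
        return self.module

    def _compute_values(self, *args, **kwargs):
        # Every new method on the wrapped class needs a copy of this pattern.
        return self.unwrapped()._compute_values(*args, **kwargs)


class AutoWrapper:
    """Forwards any attribute that normal lookup does not find."""

    def __init__(self, module):
        self.module = module

    def __getattr__(self, name):
        # __getattr__ only runs after normal lookup fails. If "module" itself
        # is not set yet (e.g. an instance created without __init__), reading
        # `self.module` here would re-enter __getattr__ without end, so the
        # instance __dict__ is consulted directly instead.
        try:
            module = self.__dict__["module"]
        except KeyError:
            raise AttributeError(name) from None
        return getattr(module, name)


inner = InnerModule()
explicit_out = ExplicitWrapper(inner)._compute_values([1, 2, 3])
auto_out = AutoWrapper(inner)._compute_values([1, 2, 3])
```

The explicit variant is verbose but predictable; the generic variant removes the per-method boilerplate at the cost of the guard shown above and of hiding which methods are actually forwarded.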

Cherry pick #44000 Signed-off-by: sven1977 <[email protected]>

wip

fed0798

Signed-off-by: sven1977 <[email protected]>

sven1977 requested review from avnishn, ArturNiederfahrenhorst, maxpumperla, kouroshHakha and simonsays1980 as code owners March 14, 2024 13:16

sven1977 commented Mar 14, 2024

simonsays1980 approved these changes Mar 14, 2024

sven1977 merged commit 8cefa97 into ray-project:master Mar 14, 2024
9 checks passed

sven1977 mentioned this pull request Mar 14, 2024

[RLlib] Release 2.10 fix for PPO on new API stack with multi-GPU. #44001

Merged

8 tasks

khluu pushed a commit that referenced this pull request Mar 14, 2024

[RLlib] Release 2.10 fix for PPO on new API stack with multi-GPU. #44001

b493018

Cherry pick #44000 Signed-off-by: sven1977 <[email protected]>
