[RLlib] - Fix APPO RLModule inference-only problems. #45111

simonsays1980 · 2024-05-02T12:08:31Z

Why are these changes needed?

The APPO algorithm could not yet handle inference-only RLModules. This change enables inference-only modules for APPO. In addition to dropping the _inference_only_state_dict_keys inherited from PPO, APPO also needs to remove its target networks when running in inference-only mode.

For this to work properly, a parent RLModule is needed that triggers the building of the _inference_only_state_dict_keys. Because the APPO modules inherited directly from the PPO modules, polymorphism did not work out of the box: this parent was missing. This PR adds that parent to APPO as well, named APPORLModule.
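The mechanism described above can be sketched in plain Python. This is an illustrative stand-in, not RLlib code: `BaseModule`, `PPOLikeModule`, `APPOLikeModule`, `inference_state`, and all state keys are hypothetical names, and treating the value head as learner-only is an assumption made for the example. Only the attribute name `_inference_only_state_dict_keys` comes from the description.

```python
# Hypothetical sketch, not RLlib code: a shared parent triggers the collection
# of keys that must be dropped from an inference-only state.


class BaseModule:
    """Stand-in parent that triggers building the inference-only key list."""

    def __init__(self):
        self.state = self._build_state()
        # Built once here, so every subclass gets it through inheritance.
        self._inference_only_state_dict_keys = self._collect_learner_only_keys()

    def _build_state(self):
        return {"encoder.weight": 1.0, "pi.weight": 2.0}

    def _collect_learner_only_keys(self):
        # Subclasses extend this list via super().
        return []

    def inference_state(self):
        """Return the state without learner-only entries."""
        drop = set(self._inference_only_state_dict_keys)
        return {k: v for k, v in self.state.items() if k not in drop}


class PPOLikeModule(BaseModule):
    def _build_state(self):
        state = super()._build_state()
        state["vf.weight"] = 3.0  # assumed to be needed for training only
        return state

    def _collect_learner_only_keys(self):
        return super()._collect_learner_only_keys() + ["vf.weight"]


class APPOLikeModule(PPOLikeModule):
    def _build_state(self):
        state = super()._build_state()
        state["target_pi.weight"] = 2.0  # target network, training only
        return state

    def _collect_learner_only_keys(self):
        # Keep the PPO-like keys and additionally drop the target network.
        return super()._collect_learner_only_keys() + ["target_pi.weight"]


module = APPOLikeModule()
print(sorted(module.inference_state()))  # ['encoder.weight', 'pi.weight']
```

Because the key list is assembled in the shared parent's constructor, each subclass only has to declare its own learner-only keys; the APPO-like class does not need to repeat the PPO-like ones.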

Related issue number

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

Signed-off-by: Simon Zehnder <[email protected]>

sven1977 · 2024-05-03T09:51:07Z

rllib/algorithms/appo/tf/appo_tf_rl_module.py

@@ -15,19 +16,7 @@
 _, tf, _ = try_import_tf()


-class APPOTfRLModule(PPOTfRLModule, RLModuleWithTargetNetworksInterface):
-    def setup(self):


sven1977 · 2024-05-03T09:51:11Z

rllib/algorithms/appo/torch/appo_torch_rl_module.py

@@ -14,20 +15,9 @@
 from ray.rllib.utils.nested_dict import NestedDict


-class APPOTorchRLModule(PPOTorchRLModule, RLModuleWithTargetNetworksInterface):
-    @override(PPOTorchRLModule)


sven1977 · 2024-05-03T09:52:00Z

rllib/algorithms/appo/torch/appo_torch_rl_module.py

@@ -42,8 +32,26 @@ def output_specs_train(self) -> List[str]:

    @override(PPOTorchRLModule)
    def _forward_train(self, batch: NestedDict):
+        if self.inference_only:


nit: Maybe we should move this error into the parent RLModule class's main method, i.e. def forward_train(self, ...)?

sven1977

LGTM! Just one nit/question about moving an error further up the inheritance chain to have all future sub-classes benefit from this logic.
Thanks @simonsays1980 ! :)
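The suggestion above, placing the guard in the public method of the parent so that subclasses only implement the underscore-prefixed hook, can be sketched as follows. This is a minimal illustration under stated assumptions: `ParentModule`, `ChildModule`, the error message, and the returned dictionary are hypothetical and do not reproduce RLlib's RLModule implementation.

```python
# Hypothetical sketch, not RLlib code: the public method owns the guard,
# subclasses only implement the underscore-prefixed hook.


class ParentModule:
    def __init__(self, inference_only=False):
        self.inference_only = inference_only

    def forward_train(self, batch):
        # The guard lives in one place, so all subclasses inherit it.
        if self.inference_only:
            raise RuntimeError(
                "forward_train() is not available on an inference-only module."
            )
        return self._forward_train(batch)

    def _forward_train(self, batch):
        raise NotImplementedError


class ChildModule(ParentModule):
    def _forward_train(self, batch):
        # No inference-only check is needed here.
        return {"loss_inputs": batch["obs"]}


print(ChildModule().forward_train({"obs": [1, 2]}))

try:
    ChildModule(inference_only=True).forward_train({"obs": [1, 2]})
except RuntimeError as err:
    print("rejected:", err)
```

With this layout, a future subclass gets the check for free and cannot forget it, which is the benefit the review comment points at.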

Made APPO RLModule always a learner module.

a76863e

Signed-off-by: Simon Zehnder <[email protected]>

simonsays1980 added rllib RLlib related issues rllib-newstack labels May 2, 2024

simonsays1980 self-assigned this May 2, 2024

sven1977 reviewed May 2, 2024

rllib/algorithms/appo/torch/appo_torch_rl_module.py (outdated, resolved)

Added a base module for APPO to enable straight inheritance of '_inference_only_state_dict_keys'. Modified 'PPOTorchRLModule' to not register keys already registered by APPO.

c2b5db9

Signed-off-by: Simon Zehnder <[email protected]>

simonsays1980 marked this pull request as ready for review May 2, 2024 14:38

simonsays1980 requested review from avnishn, ArturNiederfahrenhorst, maxpumperla and kouroshHakha as code owners May 2, 2024 14:38

simonsays1980 changed the title ~~Made APPO RLModule always a learner module.~~ [RLlib] - Fix APPO RLModule inference-only problems. May 2, 2024

anyscalesam requested a review from sven1977 May 2, 2024 19:42

Removed setup from 'APPOTfRLModule' and inherited from 'APPORLModule' instead.

52e0a27

Signed-off-by: Simon Zehnder <[email protected]>

sven1977 assigned sven1977 and unassigned simonsays1980 May 3, 2024

sven1977 reviewed May 3, 2024

sven1977 approved these changes May 3, 2024

simonsays1980 added 2 commits May 3, 2024 11:53

Removed weight updates from 'APPORLModule' to framework-specific modules.

95d1934

Signed-off-by: Simon Zehnder <[email protected]>

Removed checking for 'inference-only' from 'forward_train' as the parent already does it.

a650368

Signed-off-by: Simon Zehnder <[email protected]>

sven1977 merged commit 45d5640 into ray-project:master May 3, 2024
5 checks passed
