[RLlib] New API stack: (Multi)RLModule overhaul vol 05 (deprecate Specs, SpecDict, TensorSpec). #47915

sven1977 · 2024-10-07T11:34:07Z

New API stack: (Multi)RLModule overhaul vol 05 (deprecate Specs, SpecDict, TensorSpec).
Arguments for this move are:

- Simplicity! The fewer classes users have to learn (or merely think they have to learn) to get started with RLlib, the better.
- Specs are only useful for default RLlib modules and model components. Users know much better what their models and model components expect, and should not have to learn how to correctly tag their components with specs or override a dozen or so methods to achieve this kind of tagging: get_input_specs, get_output_specs, input_specs_inference, output_specs_inference, input_specs_exploration, output_specs_exploration, input_specs_train, output_specs_train.
- Those default RLlib models can all be reduced to very simple input/output structures: [B, ...] for non-RNN model components and [B, T, ...] for RNN-based model components. We don't need specs for this.
- On the RLModule side:
  - All _forward_inference and _forward_exploration implementations should return the ACTION_DIST_INPUTS and/or ACTIONS keys. How RLlib samples actions in the EnvRunners is already well described in the docs.
  - _forward_train must always be properly aligned with the loss function of the algo's Learner, so this one remains tricky, with or without specs. The long-term solution should be what we have already started with the RLModule APIs: the algo's Learner determines which APIs are hard requirements for any RLModule (custom or RLlib-default) to implement in order to be trained by the algo. For example, PPO requires all RLModules to implement the ValueFunctionAPI. We should expand this algo-agnostic and spec-independent pattern to all the other algos as well (DQN, SAC, DreamerV3, etc.).
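
The Learner-driven requirement described in the last point could look roughly like the following. This is a minimal sketch, not RLlib's implementation: `ValueFunctionAPISketch`, `SketchPPOLearner`, `check_module`, and the two example modules are hypothetical stand-ins defined here so the example runs without Ray installed.

```python
import abc
from typing import Any, Dict


class ValueFunctionAPISketch(abc.ABC):
    """Hypothetical stand-in for an API that a value-based loss relies on."""

    @abc.abstractmethod
    def compute_values(self, batch: Dict[str, Any]) -> Any:
        """Return one value estimate per item in the batch."""


class SketchPPOLearner:
    """Hypothetical learner that declares which APIs a module must implement."""

    required_apis = (ValueFunctionAPISketch,)

    def check_module(self, module: object) -> None:
        # Fail early, with a message naming the missing API, instead of
        # failing later somewhere inside the loss computation.
        for api in self.required_apis:
            if not isinstance(module, api):
                raise TypeError(
                    f"{type(module).__name__} must implement {api.__name__} "
                    f"to be trained by {type(self).__name__}."
                )


class ModuleWithValues(ValueFunctionAPISketch):
    def compute_values(self, batch: Dict[str, Any]) -> Any:
        # Dummy value estimates: one zero per observation in the batch.
        return [0.0 for _ in batch["obs"]]


class ModuleWithoutValues:
    """Deliberately does not implement the required API."""


learner = SketchPPOLearner()
learner.check_module(ModuleWithValues())  # passes silently
```

Calling `learner.check_module(ModuleWithoutValues())` raises a `TypeError` that names both the module class and the missing API, which is the kind of up-front check that no longer depends on specs.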

Why are these changes needed?

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

simonsays1980

LGTM. A lot of code will be removed, which means lower maintenance costs. This is good! I have some concerns regarding error identifiability. We might want to test this.

simonsays1980 · 2024-10-07T12:49:44Z

rllib/algorithms/dreamerv3/dreamerv3_rl_module.py

-
-    @override(RLModule)
-    def output_specs_train(self) -> SpecDict:
-        return [


Can we keep these as a comment or docstring in the forward method? Maybe we could do this for all default algos, so that we, new RLlib team members, and users always know what is returned (and received).
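
A docstring of the kind suggested here might look like the sketch below. It is illustrative only: `SketchModule` is a hypothetical class, the key names `"obs"` and `"action_dist_inputs"` are assumptions chosen for the example, and it does not reproduce the actual DreamerV3 outputs.

```python
from typing import Any, Dict


class SketchModule:
    """Hypothetical module; key names below are illustrative only."""

    def _forward_train(self, batch: Dict[str, Any]) -> Dict[str, Any]:
        """Compute the outputs the loss function consumes.

        Expects in `batch`:
            "obs": observations, shape [B, ...] (or [B, T, ...] for RNNs).

        Returns a dict with:
            "action_dist_inputs": one entry per batch item, used to
                parameterize the action distribution.
        """
        # Placeholder computation: one dummy value per observation.
        return {"action_dist_inputs": [0.0 for _ in batch["obs"]]}


out = SketchModule()._forward_train({"obs": [[0.1], [0.2]]})
print(sorted(out))
```

Keeping the expected and returned keys in the docstring preserves the documentation value of the removed spec methods without any runtime machinery.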

simonsays1980 · 2024-10-07T12:51:41Z

rllib/core/models/torch/heads.py

@@ -203,15 +116,6 @@ def __init__(self, config: FreeLogStdMLPHeadConfig) -> None:
        self.register_buffer("log_std_clip_param_const", self.log_std_clip_param)

    @override(Model)
-    def get_input_specs(self) -> Optional[Spec]:


Let us check with tests whether a missing required input or output is easy to trace and whether the resulting error is easy to understand.
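
One way such a test could be shaped is sketched below. The helper `check_forward_output` and the test function are hypothetical and not part of RLlib, and the string values of `ACTIONS` and `ACTION_DIST_INPUTS` are assumptions made so the example is self-contained.

```python
from typing import Any, Dict

# String values are assumptions for this sketch; RLlib defines its own constants.
ACTIONS = "actions"
ACTION_DIST_INPUTS = "action_dist_inputs"


def check_forward_output(output: Dict[str, Any], method_name: str) -> None:
    """Raise a descriptive error if neither expected key is present."""
    if ACTIONS not in output and ACTION_DIST_INPUTS not in output:
        raise ValueError(
            f"`{method_name}` must return '{ACTIONS}' and/or "
            f"'{ACTION_DIST_INPUTS}', but returned keys {sorted(output)}."
        )


def test_missing_output_key_is_traceable() -> None:
    bad_output = {"logits": [0.1, 0.9]}  # wrong key on purpose
    try:
        check_forward_output(bad_output, "_forward_inference")
    except ValueError as err:
        message = str(err)
    else:
        raise AssertionError("expected a ValueError for the missing key")
    # The message should name the method, the expected key, and what was returned.
    assert "_forward_inference" in message
    assert "action_dist_inputs" in message
    assert "logits" in message


test_missing_output_key_is_traceable()
```

The test asserts on the error text rather than only on the exception type, since the concern raised here is whether the failure can be understood, not merely whether it occurs.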

sven1977 added 4 commits October 6, 2024 23:04

wip

4d7d553

Signed-off-by: sven1977 <[email protected]>

wip

fc4934a

Signed-off-by: sven1977 <[email protected]>

wip

24819e2

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into rl_m…

a3b9529

…odule_do_over_bc_default_module_05_deprecate_specs

sven1977 requested review from ArturNiederfahrenhorst, maxpumperla, simonsays1980 and a team as code owners October 7, 2024 11:34

sven1977 assigned simonsays1980 Oct 7, 2024

fixes

3569e3f

Signed-off-by: sven1977 <[email protected]>

simonsays1980 approved these changes Oct 7, 2024

View reviewed changes

sven1977 added 2 commits October 9, 2024 10:26

Merge branch 'master' of https://github.com/ray-project/ray into rl_m…

afb6985

…odule_do_over_bc_default_module_05_deprecate_specs

wip

25baf37

Signed-off-by: sven1977 <[email protected]>

sven1977 enabled auto-merge (squash) October 9, 2024 08:27

github-actions bot added the go add ONLY when ready to merge, run all tests label Oct 9, 2024

fixes

1c6ad96

Signed-off-by: sven1977 <[email protected]>

github-actions bot disabled auto-merge October 9, 2024 09:55

sven1977 enabled auto-merge (squash) October 9, 2024 10:41

sven1977 merged commit 616eef8 into ray-project:master Oct 9, 2024
6 checks passed

sven1977 deleted the rl_module_do_over_bc_default_module_05_deprecate_specs branch October 9, 2024 12:09

sven1977 added tests-ok The tagger certifies test failures are unrelated and assumes personal liability. rllib RLlib related issues rllib-models An issue related to RLlib (default or custom) Models. rllib-newstack labels Oct 9, 2024

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024

[RLlib] New API stack: (Multi)RLModule overhaul vol 05 (deprecate Spe…

b83f01b

…cs, SpecDict, TensorSpec). (ray-project#47915) Signed-off-by: ujjawal-khare <[email protected]>

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024

[RLlib] New API stack: (Multi)RLModule overhaul vol 05 (deprecate Spe…

1fa0016

…cs, SpecDict, TensorSpec). (ray-project#47915) Signed-off-by: ujjawal-khare <[email protected]>

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024

[RLlib] New API stack: (Multi)RLModule overhaul vol 05 (deprecate Spe…

ce265dd

…cs, SpecDict, TensorSpec). (ray-project#47915) Signed-off-by: ujjawal-khare <[email protected]>

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024

[RLlib] New API stack: (Multi)RLModule overhaul vol 05 (deprecate Spe…

74822e0

…cs, SpecDict, TensorSpec). (ray-project#47915) Signed-off-by: ujjawal-khare <[email protected]>

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024

[RLlib] New API stack: (Multi)RLModule overhaul vol 05 (deprecate Spe…

8920d00

…cs, SpecDict, TensorSpec). (ray-project#47915) Signed-off-by: ujjawal-khare <[email protected]>

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024

[RLlib] New API stack: (Multi)RLModule overhaul vol 05 (deprecate Spe…

f8ba193

…cs, SpecDict, TensorSpec). (ray-project#47915) Signed-off-by: ujjawal-khare <[email protected]>

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024

[RLlib] New API stack: (Multi)RLModule overhaul vol 05 (deprecate Spe…

ab2d719

…cs, SpecDict, TensorSpec). (ray-project#47915) Signed-off-by: ujjawal-khare <[email protected]>

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024

[RLlib] New API stack: (Multi)RLModule overhaul vol 05 (deprecate Spe…

b2a8acf

…cs, SpecDict, TensorSpec). (ray-project#47915) Signed-off-by: ujjawal-khare <[email protected]>

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024

[RLlib] New API stack: (Multi)RLModule overhaul vol 05 (deprecate Spe…

c4d884b

…cs, SpecDict, TensorSpec). (ray-project#47915) Signed-off-by: ujjawal-khare <[email protected]>
