[RLlib] Add 2D box example for PPO RL Modules #33840

ArturNiederfahrenhorst · 2023-03-29T01:32:00Z

Why are these changes needed?

Historically, RLlib silently flattened 2D observations.
This often makes it hard for algorithms to learn in such spaces.
The Catalog introduced with RLModules instead raises an error when it encounters such a space and tells users to transform their observations into a 3D space.
This example shows how to do that on a PettingZoo env.

ArturNiederfahrenhorst · 2023-03-29T05:52:21Z

Executing the example leads to the following learning curve:

https://tensorboard.dev/experiment/xlxnhV0jQBqK13iRJvRzTA/#scalars

kouroshHakha

Is this not the same as what we have in the unittest?

ArturNiederfahrenhorst · 2023-04-14T21:30:19Z

@kouroshHakha Yes, we have unit tests that cover these inputs.
But we don't have an example that shows how, now that RLModules exist, users should interface with environments that have a 2D obs space. The solution here is simply to wrap the environment and transform the observation space into something that RLlib accepts. This lets users write explicit Python code to solve their problem, instead of RLlib implicitly transforming their inputs, which has caused a lot of confusion in the past.
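For readers skimming the thread, here is a rough sketch of the wrapping idea described above. It is not the code from this PR: `ToyGreyscaleEnv`, `AddChannelDim`, and the `obs_shape` attribute are hypothetical names made up for illustration, and the sketch uses only NumPy with a duck-typed env. In practice one would more likely subclass `gymnasium.ObservationWrapper` (or the PettingZoo equivalent) and also update the declared observation space.

```python
import numpy as np


class ToyGreyscaleEnv:
    """Hypothetical stand-in for an env with a 2D (H, W) observation."""

    obs_shape = (4, 4)

    def reset(self):
        # Returns (observation, info), following the Gymnasium convention.
        return np.zeros(self.obs_shape, dtype=np.float32), {}

    def step(self, action):
        obs = np.ones(self.obs_shape, dtype=np.float32)
        # (observation, reward, terminated, truncated, info)
        return obs, 0.0, False, False, {}


class AddChannelDim:
    """Hypothetical wrapper: turns (H, W) observations into (H, W, 1)."""

    def __init__(self, env):
        self.env = env
        # Advertise the new 3D shape so downstream code sees it.
        self.obs_shape = (*env.obs_shape, 1)

    def _expand(self, obs):
        # Append a trailing channel axis; the data itself is unchanged.
        return np.expand_dims(obs, axis=-1)

    def reset(self):
        obs, info = self.env.reset()
        return self._expand(obs), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return self._expand(obs), reward, terminated, truncated, info


env = AddChannelDim(ToyGreyscaleEnv())
obs, _ = env.reset()
next_obs, reward, terminated, truncated, _ = env.step(0)
print(obs.shape, next_obs.shape)
```

The point of doing this in a wrapper is that the transformation is explicit and visible in user code, rather than happening implicitly inside the library.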

kouroshHakha and others added 30 commits February 23, 2023 22:08

enabled

3be8248

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

lint

a3949d3

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip

a9fc2dd

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

fxied the enabling

6118373

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

Update way we enable the rl_module api by default

5bba827

Signed-off-by: Avnish <[email protected]>

Merge branch 'master' into enable-ppo-learner

afd4aef

Merge branch 'enable-ppo-learner' of github.com:kouroshHakha/ray into…

07c5035

… enable-ppo-learner

wip

c42a9d7

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip

be4e2fe

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip

06d2bba

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip

125d1a5

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip

c4c05ba

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

Merge branch 'master' into self-play-learner

b623e13

wip

6e3a2c1

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

fixed the get_marl_module_spec and added a very comprehensive unittest

5968ac5

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip

1476461

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

Merge branch 'master' into self-play-learner

e1d092c

wip

bc036cd

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip

29955f6

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip

7620227

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

revert the auto-enable

d446421

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

fixed a bug in tf_learner: The batch sent to the update_fn was not th…

ce862af

…e minibatch, it was the entire batch. Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

fixed the failed tests

1b6b793

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

fixed a bug in add_policy: we used to not set the state of the policy…

96221b1

… that was just added Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

update specs in get_marl_module_specs with default only if default is…

2c0ea17

… SingleAgentRLModule Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

fixed the update methodology at the end of get_marl_module_spec in al…

349f109

…gorithm_config Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

small lint

d6e6c8f

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

docstring

13ae6b9

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip in addressing feedback

9474b2d

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

wip addressing feedback

245abcc

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

kouroshHakha and others added 4 commits March 27, 2023 17:11

Merge branch 'master' into enable-ppo-learner

3a3d9dd

Merge branch 'master' into enable-ppo-learner

cff718b

Merge branch 'master' into enable-ppo-learner

8c4c8a3

Add box_2d_env.py

9fde200

Signed-off-by: Artur Niederfahrenhorst <[email protected]>

ArturNiederfahrenhorst assigned kouroshHakha Mar 29, 2023

ArturNiederfahrenhorst requested review from sven1977, gjoliver, avnishn, smorad, maxpumperla, kouroshHakha and krfricke as code owners March 29, 2023 01:32

ArturNiederfahrenhorst added 2 commits March 28, 2023 23:13

Add example as test to BUILD

d114acc

Signed-off-by: Artur Niederfahrenhorst <[email protected]>

Add comment with expected reward

9f68613

Signed-off-by: Artur Niederfahrenhorst <[email protected]>

ArturNiederfahrenhorst changed the title ~~[RLlib] Add 2d box example~~ [RLlib] Add 2D box example for PPO RL Modules Mar 29, 2023

ArturNiederfahrenhorst added 7 commits March 29, 2023 09:22

Better docstrings

6738fae

Signed-off-by: Artur Niederfahrenhorst <[email protected]>

better docstrings

5fae384

Signed-off-by: Artur Niederfahrenhorst <[email protected]>

Better as-test arg

5f47b50

Signed-off-by: Artur Niederfahrenhorst <[email protected]>

rename to greyscale env

c07013b

Signed-off-by: Artur Niederfahrenhorst <[email protected]>

merge master

ea8c533

Signed-off-by: Artur Niederfahrenhorst <[email protected]>

Merge branch 'master' into box2drlmodule

17aba7e

lynt

f8bcc0d

Signed-off-by: Artur Niederfahrenhorst <[email protected]>

kouroshHakha reviewed Apr 14, 2023

kouroshHakha approved these changes Apr 14, 2023

gjoliver merged commit a333017 into ray-project:master Apr 14, 2023

vitsai pushed a commit to vitsai/ray that referenced this pull request Apr 17, 2023

[RLlib] Add 2D box example for PPO RL Modules (ray-project#33840)

a659e23

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

elliottower pushed a commit to elliottower/ray that referenced this pull request Apr 22, 2023

[RLlib] Add 2D box example for PPO RL Modules (ray-project#33840)

451150d

Signed-off-by: Kourosh Hakhamaneshi <[email protected]> Signed-off-by: elliottower <[email protected]>

ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request May 4, 2023

[RLlib] Add 2D box example for PPO RL Modules (ray-project#33840)

8a290b7

Signed-off-by: Kourosh Hakhamaneshi <[email protected]> Signed-off-by: Jack He <[email protected]>
