[RLlib] AlgorithmConfigs: Broad rollout; Example scripts #29700
Conversation
@@ -322,8 +322,10 @@ def __init__(
             **kwargs: Arguments passed to the Trainable base class.
         """

-        # Resolve possible dict into an AlgorithmConfig object.
+        # TODO: In the future, only support AlgorithmConfig objects here.
+        # Resolve possible dict into an AlgorithmConfig object as well as
According to the type descriptors in the function signature, we don't accept dicts here anymore!
(If we still do, we should send a deprecation warning?)
True! Good point.
I still want to support it for a while, but yes, we should warn.
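For illustration, a minimal sketch of how such a resolution-plus-warning could look (the from_dict classmethod is an assumption here, not confirmed by this PR; deprecation_warning is RLlib's existing helper):

import copy

from ray.rllib.algorithms.algorithm_config import AlgorithmConfig
from ray.rllib.utils.deprecation import deprecation_warning

def resolve_config(config):
    # Resolve a legacy config dict into an AlgorithmConfig object.
    # Plain dicts keep working for now, but emit a deprecation warning
    # so users migrate to AlgorithmConfig objects.
    if isinstance(config, dict):
        deprecation_warning(
            old="Algorithm(config=[some dict])",
            new="Algorithm(config=[AlgorithmConfig object])",
            error=False,
        )
        config = AlgorithmConfig.from_dict(copy.deepcopy(config))
    return config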
- algo = ppo.PPO(config=ppo_config, env=CorrelatedActionsEnv)
+ # Have to specify this here as we are working with a generic AlgorithmConfig
+ # object, not a specific one (e.g. PPOConfig).
+ config.algo_class = args.run
cool!
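For reference, a sketch of the pattern under discussion (illustrative only; the env name and --run value are placeholders, and whether a bare AlgorithmConfig trains well depends on the chosen algorithm's defaults):

from ray.rllib.algorithms.algorithm_config import AlgorithmConfig

# A generic config (not PPOConfig, DQNConfig, etc.), e.g. when the
# algorithm is only chosen at runtime via a CLI arg like --run=PPO.
config = (
    AlgorithmConfig()
    .environment(env="CartPole-v1")
    .framework("tf2")
)
# Have to specify this here as we are working with a generic
# AlgorithmConfig object, not a specific one (e.g. PPOConfig).
config.algo_class = "PPO"  # e.g. args.run

algo = config.build()  # build() resolves the string via the registry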
rllib/examples/custom_logger.py
- # Run with tracing enabled for tfe/tf2.
- "eager_tracing": args.framework in ["tfe", "tf2"],
+ .framework(args.framework, eager_tracing=args.framework in ["tfe", "tf2"])
Maybe we can start removing tfe from examples?
Done in a different PR: #29755
rllib/examples/fractional_gpus.py
- # Set this to > 1 for multi-GPU learning.
- "num_gpus": args.num_gpus,
+ .environment(
+     GPURequiringEnv if args.num_gpus_per_worker > 0.0 else "CartPole-v0"
Can we upgrade to v1? If I'm not mistaken, CartPole-v0 doesn't exist anymore in recent gym releases.
Done in a separate PR. Feel like this shouldn't be in here. CartPole-v1 might indeed behave slightly differently, so we've got to be careful not to break any tuned examples.
I think it just has another reward scale (v0 caps episodes at 200 steps, v1 at 500), so we need to adjust tests that depend on it.
looks amazing! (couple of optional ideas/questions to consider)
if isinstance(self.algo_class, str):
    algo_class = get_algorithm_class(self.algo_class)

return algo_class(
I feel like this would be a good place to always create a deepcopy and freeze it, right?
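A rough sketch of that suggestion inside build() (assuming AlgorithmConfig has a freeze() method; illustrative only, not the exact PR code):

import copy

from ray.rllib.algorithms.registry import get_algorithm_class

def build(self, env=None, logger_creator=None):
    # Resolve a string algo_class via the registry (as in the diff above).
    algo_class = self.algo_class
    if isinstance(algo_class, str):
        algo_class = get_algorithm_class(algo_class)

    # Reviewer suggestion: work on a frozen deep copy, so the running
    # Algorithm cannot be mutated via the original config object.
    config_copy = copy.deepcopy(self)
    config_copy.freeze()

    return algo_class(config=config_copy, env=env, logger_creator=logger_creator)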
        evaluation_duration_unit="episodes",
    )
)
config.simple_optimizer = True
Why do we need to set this?
"model": {"custom_model": "eager_model"}, | ||
"framework": "tf2", | ||
} | ||
.resources(num_gpus=int(os.environ.get("RLLIB_NUM_GPUS", "0"))) |
I think, since we have config objects now, we should default to num_gpus=None, check for RLLIB_NUM_GPUS when freezing the config object, and set num_gpus=0 if it is still None. That would make this not-very-pretty and super redundant line unnecessary.
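A hedged sketch of that idea (the helper name and its placement are made up; the point is resolving None only at freeze/validation time):

import os

def _resolve_num_gpus(num_gpus):
    # Precedence: explicit user value > RLLIB_NUM_GPUS env var > 0.
    if num_gpus is not None:
        return num_gpus  # User set it explicitly; leave as-is.
    return int(os.environ.get("RLLIB_NUM_GPUS", "0"))

With that in place, example scripts could drop the redundant .resources(num_gpus=...) boilerplate line entirely.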
lgtm! What a huge PR, and I found nothing that could justify a request for changes! I can approve again when tests are green :)
This PR introduces:
- Algorithm.get_default_config() methods.

Why are these changes needed?
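Per the PR title, this rolls the typed AlgorithmConfig API out broadly across RLlib's example scripts. A sketch of the before/after shape (illustrative; exact chainable method names per the RLlib version this PR targets):

from ray.rllib.algorithms.ppo import PPO, PPOConfig

# Old style: a raw config dict passed to the Algorithm/Trainer.
# config = {"env": "CartPole-v1", "framework": "tf2", "num_gpus": 0}

# New style: a typed, chainable AlgorithmConfig object.
config = (
    PPOConfig()
    .environment(env="CartPole-v1")
    .framework("tf2")
    .resources(num_gpus=0)
)
algo = config.build()
print(algo.train()["episode_reward_mean"])

# Per-algorithm defaults are now also available via classmethod:
# default_config = PPO.get_default_config()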
Related issue number
Checks
- I've signed off every commit (git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.