[RLlib] Cleanup examples folder 18: Add example script for offline RL (BC) training on single-agent, while evaluating w/ multi-agent setup. #46251
Conversation
LGTM. Some comments.
```python
train_batch = train_batch.as_multi_agent()
self._counters[NUM_AGENT_STEPS_SAMPLED] += train_batch.agent_steps()
self._counters[NUM_ENV_STEPS_SAMPLED] += train_batch.env_steps()
# TODO (sven): Use metrics API as soon as we moved to new API stack
```
Is this one not using the `MetricsLogger` yet? I use it in my overhaul of offline RL.
Yes, you should. All good there. :)
But the hybrid API stack still goes through the `summarize_episodes` utility inside `algorithm.py`, which operates on the old `RolloutMetrics` objects returned by RolloutWorkers.
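For reference, here is a rough sketch of how these counters might look once this code moves to the new API stack's MetricsLogger. The `self.metrics` attribute and the `reduce="sum"` choice are assumptions about that migration, not code from this PR:

```python
from ray.rllib.utils.metrics import (
    NUM_AGENT_STEPS_SAMPLED,
    NUM_ENV_STEPS_SAMPLED,
)

# Inside a training_step() override on the new API stack, the legacy
# self._counters dict could be replaced by MetricsLogger calls
# (assumes the Algorithm exposes its MetricsLogger as `self.metrics`).
# A "sum" reduction accumulates the step counts across iterations.
self.metrics.log_value(
    NUM_AGENT_STEPS_SAMPLED, train_batch.agent_steps(), reduce="sum"
)
self.metrics.log_value(
    NUM_ENV_STEPS_SAMPLED, train_batch.env_steps(), reduce="sum"
)
```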
```
Here, SA=single-agent and MA=multi-agent.

Note that the BC Algorithm - by default - runs on the hybrid API stack, using RLModules,
but not EnvRunners or ConnectorV2s yet.
```
Probably worth adding here that it specifically does not use SingleAgentEpisode/MultiAgentEpisode?
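To make the setup described in that docstring more concrete, here is a hypothetical sketch of single-agent BC training combined with multi-agent evaluation. The env name, data path, and policy ID are made up for illustration; the PR's actual example script will differ:

```python
from ray.rllib.algorithms.bc import BCConfig

# Illustrative sketch only: train BC on single-agent offline data, then
# evaluate the learned policy in a multi-agent env by mapping every
# agent ID to the one shared (trained) policy.
config = (
    BCConfig()
    .environment("my_multi_agent_env")  # assumed: a registered multi-agent env
    .offline_data(input_="/path/to/offline/data")  # placeholder data path
    .multi_agent(
        policies={"shared_policy"},
        policy_mapping_fn=lambda agent_id, episode, **kwargs: "shared_policy",
    )
    .evaluation(
        evaluation_interval=1,
        evaluation_duration=10,
        evaluation_duration_unit="episodes",
    )
)
```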
```python
base_config = (
    BCConfig()
    .environment(
        observation_space=dummy_env.observation_space,
```
Can we add a quick note on why the user needs to provide the spaces in this case?
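(Presumably because pure offline training never samples from a live env, so RLlib cannot infer the spaces automatically. A minimal sketch of the pattern, with an assumed CartPole dummy env purely for illustration:)

```python
import gymnasium as gym
from ray.rllib.algorithms.bc import BCConfig

# With offline BC there is no live env to step through during training, so the
# observation/action spaces cannot be inferred from sampling. A throwaway env
# instance is created only to read the spaces off of it (assumed env name; the
# example script may use a different one).
dummy_env = gym.make("CartPole-v1")

base_config = (
    BCConfig()
    .environment(
        observation_space=dummy_env.observation_space,
        action_space=dummy_env.action_space,
    )
)
```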
Cleanup examples folder 18: Add example script for offline RL (BC) training on single-agent, while evaluating w/ multi-agent setup.
Why are these changes needed?
Related issue number
Checks
- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.