[RLlib] Cleanup examples folder 04: Curriculum and checkpoint-by-custom-criteria examples moved to new API stack. #44706
Conversation
LGTM. Very happy about the curriculum example.
For debugging, use the following additional command line options
`--no-tune --num-env-runners=0`
which should allow you to set breakpoints anywhere in the RLlib code and
Works with Tune as well, but with `--local-mode`. :)
Absolutely! I'm always afraid we are going to get rid of Ray local mode at some point. Also, for any number of Learner workers > 0, local mode doesn't work (not sure why, actually).
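For reference, a minimal sketch of the two debugging setups discussed here. The script name is illustrative, the `--no-tune` and `--num-env-runners=0` flags come from the example's own argument parser, and `local_mode` is the standard `ray.init()` flag:

```python
# Option 1 (assumed script name, flags as discussed above): bypass Tune and run
# everything in the driver process, so breakpoints anywhere in RLlib are hit:
#
#   python checkpoint_by_custom_criteria.py --no-tune --num-env-runners=0
#
# Option 2: keep Tune in the loop but force Ray local mode, which also runs all
# tasks and actors serially in the driver process.
import ray

ray.init(local_mode=True)
```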
ckpt = results.get_best_result(metric=policy_loss_key, mode="min").checkpoint
print("Lowest pol-loss: {}".format(ckpt))
best_result = results.get_best_result(metric=policy_loss_key, mode="min")
ckpt = best_result.checkpoint
We could also ask here for the best checkpoint along the training path: `best_result.get_best_checkpoint(metric=policy_loss_key, mode="min")`.
Ah, cool, so `ckpt = best_result.checkpoint` returns the very last checkpoint only? And if the last one is not the best, it's better to do `best_result.get_best_checkpoint(metric=policy_loss_key, mode="min")`?
This actually doesn't seem to work well with nested keys. If I do `best_result.get_best_checkpoint(policy_loss_key, mode="min")`, I get:
RuntimeError: Invalid metric name ('info', 'learner', 'default_policy', 'learner_stats', 'policy_loss')! You may choose from the following metrics: dict_keys(['custom_metrics', 'episode_media', 'info', 'sampler_results', 'episode_reward_max', 'episode_reward_min', 'episode_reward_mean', 'episode_len_mean', 'episodes_this_iter', 'episodes_timesteps_total', 'policy_reward_min', 'policy_reward_max', 'policy_reward_mean', 'hist_stats', 'sampler_perf', 'num_faulty_episodes', 'connector_metrics', 'num_healthy_workers', 'num_in_flight_async_reqs', 'num_remote_worker_restarts', 'num_agent_steps_sampled', 'num_agent_steps_trained', 'num_env_steps_sampled', 'num_env_steps_trained', 'num_env_steps_sampled_this_iter', 'num_env_steps_trained_this_iter', 'num_env_steps_sampled_throughput_per_sec', 'num_env_steps_trained_throughput_per_sec', 'timesteps_total', 'num_steps_trained_this_iter', 'agent_timesteps_total', 'timers', 'counters', 'done', 'episodes_total', 'training_iteration', 'trial_id', 'date', 'timestamp', 'time_this_iter_s', 'time_total_s', 'pid', 'hostname', 'node_ip', 'config', 'time_since_restore', 'iterations_since_restore', 'perf', 'experiment_tag']).
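A possible workaround (not part of this PR): pick the checkpoint manually from `Result.best_checkpoints`, which is a list of `(Checkpoint, metrics)` tuples, and walk the nested metric key yourself. A hedged sketch, reusing the `best_result` object from the snippet above:

```python
# Manual selection over the best trial's saved checkpoints, for nested metric
# keys that Result.get_best_checkpoint() rejects.
def best_checkpoint_by_nested_metric(result, key_path, mode="min"):
    """Return the checkpoint of `result` whose metrics have the best value
    under the nested `key_path` (a tuple of dict keys)."""

    def lookup(metrics):
        for key in key_path:
            metrics = metrics[key]
        return metrics

    pick = min if mode == "min" else max
    # Result.best_checkpoints is a list of (Checkpoint, metrics_dict) tuples.
    ckpt, _ = pick(result.best_checkpoints, key=lambda cm: lookup(cm[1]))
    return ckpt


# Usage with the nested key from the error above:
# ckpt = best_checkpoint_by_nested_metric(
#     best_result,
#     ("info", "learner", "default_policy", "learner_stats", "policy_loss"),
#     mode="min",
# )
```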
ray.shutdown()
best_result = results.get_best_result(metric=vf_loss_key, mode="max")
ckpt = best_result.checkpoint
Here as well
param_space=config.to_dict(),
run_config=air.RunConfig(stop=stop, verbose=2),
run_rllib_example_script_experiment(
    base_config, args, stop=stop, success_metric={"task_solved": 1.0}
Very nice example 👍
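For context, a rough sketch of the new-stack example pattern this diff switches to. The helper names are assumed to match the other cleaned-up examples in `ray.rllib.utils.test_utils`, and the environment and stop values are illustrative only:

```python
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.utils.test_utils import (
    add_rllib_example_script_args,
    run_rllib_example_script_experiment,
)

parser = add_rllib_example_script_args()
args = parser.parse_args()

# Illustrative config; the real example builds its curriculum-capable env here.
base_config = PPOConfig().environment("CartPole-v1")
stop = {"training_iteration": 100}

# The helper wires up Tune (or a plain training loop with --no-tune), applies
# the stop criteria, and checks the custom "task_solved" metric for success.
run_rllib_example_script_experiment(
    base_config, args, stop=stop, success_metric={"task_solved": 1.0}
)
```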
Why are these changes needed?

Cleanup examples folder 04: Curriculum and checkpoint-by-custom-criteria examples moved to the new API stack.

Related issue number

Checks
- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I've added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.