[RLlib] Example script `custom_metrics_and_callbacks.py` should work for `batch_mode=complete_episodes`. #22684

simonsays1980 · 2022-02-27T12:32:33Z

Why are these changes needed?

To improve traceability and to give users a guideline for producing custom metrics that works with the different batch_mode settings.
So far, batch_mode="complete_episodes" produces an error in custom_metrics_and_callbacks.py. The reason is:

When batch_mode="complete_episodes", the SimpleListCollector is called to build a MultiAgentBatch.
That in turn calls the build() method of the PolicyCollector, which empties its batches attribute.
Because the PolicyCollector's batches are now empty, episode.batch_builder.policy_collectors["default_policy"].batches[-1] does not exist when the assert expression is evaluated.
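
The failure can be reproduced without RLlib using a toy stand-in. The sketch below is not RLlib's implementation: `ToyPolicyCollector` is a hypothetical class that only mimics the behaviour described above (`build()` hands out the collected batches and empties the `batches` attribute), and the `"dones"` key is an assumption made for illustration.

```python
class ToyPolicyCollector:
    """Hypothetical stand-in for the collector; not RLlib's class."""

    def __init__(self):
        self.batches = []

    def add_batch(self, batch):
        self.batches.append(batch)

    def build(self):
        # Hand out everything collected so far and empty `batches`,
        # mirroring the behaviour described in this PR.
        built, self.batches = self.batches, []
        return built


collector = ToyPolicyCollector()
collector.add_batch({"dones": [False, False, True]})  # "dones" is an assumed key

# Before build(): the last batch is still reachable.
last_done_before = collector.batches[-1]["dones"][-1]

# If the batch is built before the callback runs, the list is empty
# by the time the callback tries to index it.
collector.build()
try:
    collector.batches[-1]
    error = None
except IndexError as exc:
    error = exc

print(last_done_before, type(error).__name__)
```

In this toy model the indexing error comes from the emptied list, not from the episode being unfinished, which is why the message in the original assert would not help a user find the cause.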

Most end users who run their algorithms with batch_mode="complete_episodes" and implement their own metrics by following this example will probably not be able to trace this error back and find a solution for their own code.

Related issue number

#22683

Checks

I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

gjoliver

Can you please rebase master into your branch? There are a lot of strange numpy random number generator errors.
Thanks.

gjoliver · 2022-02-27T17:43:06Z

rllib/examples/custom_metrics_and_callbacks.py

-        ][-1], (
-            "ERROR: `on_episode_end()` should only be called " "after episode is done!"
-        )
+        # Check if there are multiple episodes in a batch, i.e.


Thanks a ton for noticing this.
Any suggestion on how we can assert this for "complete_episodes" mode as well?

Trainer class

I think that with complete_episodes we do not have to check whether episodes are really done, because they are always done (complete). But for a callback that can run with both batch_modes, we need a safeguard before the assertion. With truncate_episodes we do have to check that the episode is indeed done.

Custom usage

This is different if a user implements the callback in their own code and calls it before an episode is done. I don't know whether (1) this case is rare and would happen only between a rollout and a batch completion, or (2) it can happen at literally any point of sampling. If we want to ensure that the assertion below does not error out, we could look at the batches attribute of the PolicyCollector, as this gets emptied when the MultiAgentBatch is created by the simple_list_collector. I have also not yet figured out whether this attribute (batches) is emptied only when the episode is done.

Actually, even in "complete_episodes" mode, there could be more than one episode in the final train batch (but not here in this callback).

We should probably rename multiple_episodes_in_batch to collect_complete_episodes (bool).

I think we should rather use worker.policy_config["batch_mode"] here in this if block.
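
A sketch of how that suggestion might look. The `worker` and `episode` objects here are `SimpleNamespace` stand-ins built only for illustration; the helper names `check_episode_done` and `make_episode` and the `"dones"` key are assumptions, while `worker.policy_config["batch_mode"]` and the `policy_collectors["default_policy"].batches[-1]` path are taken from the discussion above.

```python
from types import SimpleNamespace


def check_episode_done(worker, episode):
    """Hypothetical helper; returns True if the assertion was evaluated."""
    if worker.policy_config["batch_mode"] != "truncate_episodes":
        # "complete_episodes": the collector was already built and emptied,
        # so there is nothing left to index.
        return False
    collector = episode.batch_builder.policy_collectors["default_policy"]
    assert collector.batches[-1]["dones"][-1], (
        "ERROR: `on_episode_end()` should only be called after episode is done!"
    )
    return True


def make_episode(batches):
    """Build a stand-in episode whose default policy collector holds `batches`."""
    collector = SimpleNamespace(batches=batches)
    builder = SimpleNamespace(policy_collectors={"default_policy": collector})
    return SimpleNamespace(batch_builder=builder)


truncating = SimpleNamespace(policy_config={"batch_mode": "truncate_episodes"})
completing = SimpleNamespace(policy_config={"batch_mode": "complete_episodes"})

checked = check_episode_done(truncating, make_episode([{"dones": [False, True]}]))
skipped = check_episode_done(completing, make_episode([]))
print(checked, skipped)
```

Branching on the configured batch mode keeps the check readable for users who copy the example, since the condition names the setting they chose themselves.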

Hey @simonsays1980, thanks for this fix! Let me know if you want to do a follow-up PR with the two changes suggested above. I think this would help clarify these batch-generating rules for the different settings even more.

Hi @sven1977, yes, I can do a follow-up PR. Should I do this in the same branch, or would it be better to create a new one?

simonsays1980 · 2022-02-27T22:54:19Z

@gjoliver I rebased. Can we rerun the tests somehow?

gjoliver

ok, the tests look reasonable.
@sven1977, can you help merge? :)

simonsays1980 added 2 commits February 27, 2022 12:53

Added a safeguard to check for multiple episodes in a batch when 'batch_mode':'complete_episodes'.

9e49857

Removed the 'local_mode=True' parameter from 'ray.init()' that was used during debugging.

9d0f2eb

simonsays1980 requested review from sven1977, gjoliver and avnishn as code owners February 27, 2022 12:32

gjoliver reviewed Feb 27, 2022

simonsays1980 force-pushed the custom-metrics-and-callbacks-complete_episodes branch from a820140 to 9d0f2eb Compare February 27, 2022 22:51

Merge branch 'ray-project:master' into custom-metrics-and-callbacks-complete_episodes

ea1e07b

gjoliver approved these changes Feb 28, 2022

sven1977 reviewed Mar 1, 2022

Apply suggestions from code review

23c2f5e

sven1977 changed the title ~~Custom metrics and callbacks complete episodes~~ [RLlib] Example script custom_metrics_and_callbacks.py should work for batch_mode=complete_episodes. Mar 1, 2022

sven1977 merged commit 568cf28 into ray-project:master Mar 1, 2022

simonsays1980 mentioned this pull request Mar 2, 2022

Custom metrics and callbacks complete episodes #22779

Closed

6 tasks

simonsays1980 added a commit to simonsays1980/ray that referenced this pull request Mar 2, 2022

Changed if-block to use the worker's 'policy_config' as suggested by @sven1977 in PR ray-project#22684.

2a3666e

simonsays1980 mentioned this pull request Mar 9, 2022

complete_episodes breaks custom callback example [Bug] #22683

Closed

2 tasks
