Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] Add correct terminated and truncated batch sizes on zero-length episodes #46721

Merged
merged 2 commits into from
Sep 25, 2024

Conversation

Mark2000
Copy link
Contributor

@Mark2000 Mark2000 commented Jul 20, 2024

Why are these changes needed?

In AddColumnsFromEpisodesToTrainBatch, it is assumed that each sa_episode has length ≥ 1. When a zero-length episode is passed to the connector, the data added to the terminateds and truncateds columns are incorrectly sized; they should just be an empty list. This case generally doesn't come up, but when adding custom connectors that modify episode lengths (e.g. for semi-MDP type problems), zero-length episodes can be produced.

Related issue number

Didn't open issue, but I can if necessary.

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@Mark2000 Mark2000 changed the title Add correct terminated and truncated batch sizes on zero-length episodes [RLlib] Add correct terminated and truncated batch sizes on zero-length episodes Jul 24, 2024
@Mark2000
Copy link
Contributor Author

Bumping this!

@Mark2000
Copy link
Contributor Author

@sven1977 Could you review when you're able to?

@anyscalesam anyscalesam added triage Needs triage (eg: priority, bug/not-bug, and owning component) rllib RLlib related issues labels Aug 12, 2024
@Mark2000
Copy link
Contributor Author

Mark2000 commented Sep 3, 2024

Sorry to keep bugging you @sven1977 @simonsays1980 , but any way this could get merged in?

@@ -100,6 +100,8 @@ def __call__(
Columns.TERMINATEDS,
items_to_add=(
[False] * (len(sa_episode) - 1) + [sa_episode.is_terminated]
if len(sa_episode) > 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably remove any zero-length episodes in the loop here. They cannot be used for learning anyways.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave this to @sven1977 to decide.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sven1977 could we give this a go? I think this is fine like this. The other option is to remove zero-length epiosdes entirely as they offer nothing to learn from. This could also happen if an agent in a multi-agent scenario received only the initial observation and nothing else.

Copy link
Contributor

@sven1977 sven1977 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix @Mark2000 . LGTM.

A slightly more defensive fix would probably be to filter out empty episodes already in the connector pipeline, but maybe that would confuse some code that does draw information from those.

Our built-in EnvRunners do not return empty episodes ever, which is why this problem never came up.

@sven1977 sven1977 enabled auto-merge (squash) September 25, 2024 13:29
@sven1977
Copy link
Contributor

auto-merge enabled ...

@github-actions github-actions bot added the go add ONLY when ready to merge, run all tests label Sep 25, 2024
@sven1977 sven1977 merged commit 4a5207d into ray-project:master Sep 25, 2024
7 checks passed
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests rllib RLlib related issues rllib-connectorv2 Connector related issues triage Needs triage (eg: priority, bug/not-bug, and owning component)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants