Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] [Release tests] Make rollout fragment length auto for release tests #30079

Merged

Conversation

avnishn
Copy link
Member

@avnishn avnishn commented Nov 8, 2022

SAC, DDPG, and CQL release tests were broken since we started enforcing that the rollout_fragment_length must be greater than or equal to n_step when computing n step returns.

This error was triggered I think by #29854 and eventually propagated to the release tests. Its likely that validation wasn't being run properly before #29854 .

Signed-off-by: Avnish [email protected]

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@@ -10,7 +10,7 @@ ddpg-hopperbulletenv-v0:
config:
actor_hiddens: [256, 256]
critic_hiddens: [256, 256]
n_step: 3
n_step: 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, too, remove rollout_fragment_length (or set to "auto") further below entirely.

@@ -26,7 +26,7 @@ cql-halfcheetahbulletenv-v0:
tau: 0.005
target_entropy: auto
no_done_at_end: false
n_step: 3
n_step: 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, let's do this the other way around and keep the n_step setting.

So the new config logic is: rollout_fragment_length is "auto" b/c it's a lot of times dependent on other settings, like train_batch_size, num_envs_per_worker, and num_rollout_workers (PPO) or just needs to be at least n_step to make sense.

The RolloutWorkers now infer this setting from these other settings, if "auto", otherwise, take the user provided values, but this could lead to errors, like in this case.

So let's just remove the rollout_fragment_length setting entirely here (or set to its default: "auto").

@@ -19,7 +19,7 @@ sac-halfcheetahbulletenv-v0:
tau: 0.005
target_entropy: auto
no_done_at_end: false
n_step: 3
n_step: 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, too, remove rollout_fragment_length (or set to "auto") further below entirely.

@avnishn
Copy link
Member Author

avnishn commented Nov 8, 2022

okie sg :)

@avnishn
Copy link
Member Author

avnishn commented Nov 8, 2022

So I'm going to try this, but I suspect that we might need to increase the timestep limits on these tests in response, in order to meet the minimum reward threshold.

Copy link
Contributor

@sven1977 sven1977 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, thanks for the investigation and fixes @avnishn !

@avnishn
Copy link
Member Author

avnishn commented Nov 10, 2022

lets merge pending the release tests all getting kicked off successfully:
https://buildkite.com/ray-project/release-tests-pr/builds/20894#01846343-98ec-4d32-90c2-784fbd358a09

@avnishn avnishn changed the title [RLlib] [Release tests] Rollout fragment length must be greater than or equal to n_step [RLlib] [Release tests] Make rollout fragment length auto for release tests Nov 14, 2022
@gjoliver gjoliver merged commit 54c532f into ray-project:master Nov 14, 2022
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants