[RLlib] Move learning_starts logic into execution plans #26032

ArturNiederfahrenhorst · 2022-06-23T10:27:14Z

Why are these changes needed?

learning_starts should be renamed to something more descriptive: num_steps_sampled_before_learning_starts.
The setting should also be moved out of the replay buffer config, in line with our philosophy: the Algorithm defines what happens and when, but NOT how it happens. num_steps_sampled_before_learning_starts answers the "when" and "what" questions and should therefore be handled and configured at the top Algorithm level (not inside replay buffers).
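
To make the proposed change concrete, here is a minimal, self-contained sketch of what moving the setting might look like. It does not import or reproduce RLlib code: only the two key names (learning_starts and num_steps_sampled_before_learning_starts) and the replay_buffer_config nesting come from this PR's description. The helper names migrate_config and should_learn, the capacity value, and the use of >= for the threshold check are assumptions made for illustration.

```python
from typing import Any, Dict

# Hypothetical old-style config: the threshold lives inside the buffer config.
OLD_STYLE_CONFIG: Dict[str, Any] = {
    "replay_buffer_config": {"capacity": 50000, "learning_starts": 1000},
}


def migrate_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with the threshold lifted to the top level under the new name.

    Illustrative helper only, not part of RLlib.
    """
    new_config = dict(config)
    buffer_config = dict(new_config.get("replay_buffer_config", {}))
    if "learning_starts" in buffer_config:
        new_config["num_steps_sampled_before_learning_starts"] = buffer_config.pop(
            "learning_starts"
        )
    new_config["replay_buffer_config"] = buffer_config
    return new_config


def should_learn(num_steps_sampled: int, config: Dict[str, Any]) -> bool:
    """The algorithm, not the buffer, decides *when* learning begins.

    Whether the real check is strict or inclusive is an assumption here.
    """
    return num_steps_sampled >= config["num_steps_sampled_before_learning_starts"]


new_config = migrate_config(OLD_STYLE_CONFIG)
for steps in (0, 999, 1000):
    print(steps, should_learn(steps, new_config))
```

With this split, the replay buffer only stores and samples data (the "how"), while the algorithm-level check decides whether a training step should run at all.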

Checks

- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

rllib/algorithms/ddpg/tests/test_ddpg.py

rllib/algorithms/dqn/dqn.py

rllib/algorithms/dreamer/dreamer.py

rllib/algorithms/qmix/qmix.py

rllib/algorithms/simple_q/simple_q.py

rllib/utils/replay_buffers/tests/test_multi_agent_prioritized_replay_buffer.py

rllib/utils/replay_buffers/utils.py

kouroshHakha

Oh man!! This must have been a tough PR. Thanks a lot for this. There are a couple of major suggestions / TODOs before it can be merged. We can sync up over Slack if you need more clarification. Also, one more good PR practice that should become a habit: please, please let the reviewer resolve their own comments (if you mark them as resolved, the reviewer may miss their previous comment, and it may in fact still be unresolved). It's also mandatory for the reviewer to mark the comments that have been addressed as resolved and to leave open only those that are still open issues / not fully addressed. This way the communication is less prone to errors.

Signed-off-by: Artur Niederfahrenhorst <[email protected]>

kouroshHakha · 2022-08-10T06:17:44Z

@ArturNiederfahrenhorst This PR still needs to address all the TODOs

kouroshHakha

LGTM once tests pass.

…`. (ray-project#26032) Signed-off-by: Stefan van der Kleij <[email protected]>

ArturNiederfahrenhorst added 3 commits June 23, 2022 12:13

renaming

996042e

renaming

1a3f972

lint

6ec6d7c

ArturNiederfahrenhorst requested review from sven1977, gjoliver, avnishn, smorad, maxpumperla, kouroshHakha and krfricke as code owners June 23, 2022 10:27

ArturNiederfahrenhorst mentioned this pull request Jun 23, 2022

[RLlib] Better description for learning_starts parameter #23969

Closed

ArturNiederfahrenhorst and others added 19 commits June 23, 2022 14:22

utils fix

fb6f2a0

learning-starts occurrences

fd022cf

num_ts_added_before_sampling_starts in recsim example

9d32f06

Adds extensive description to all occurrences

4c75280

typo

ef2396a

Merge branch 'master' into learningstarts

e926355

fix maddpg test config

90cec5c

merge master again

5633595

Merge branch 'master' of https://github.com/ray-project/ray into learningstarts

2224a0b

wip

0254a46

Merge branch 'master' of https://github.com/ray-project/ray into learningstarts

af3e899

LINT

a9aed3d

Merge branch 'master' of https://github.com/ray-project/ray into learningstarts

daf408c

wip

ad22d3a

wip

bfc5f08

merge master

7a41dcd

rename to min_size

777ffb2

fix recsim example

6b5d9cd

typo

9ff64ed