
[RLlib] APPO Training iteration fn. #24545

Merged

sven1977 merged 25 commits into ray-project:master from appo_training_itr on May 17, 2022

Conversation

sven1977 (Contributor) commented May 6, 2022

APPO Training iteration fn.

Why are these changes needed?

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(
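
For context, this PR moves APPO onto RLlib's imperative `training_iteration()` API (replacing the older declarative execution-plan style), so that one iteration of sampling, learning, target-network updating, and weight syncing is written as plain Python. The self-contained toy sketch below only illustrates that pattern; every name in it (`ToyWorker`, `ToyTrainer`, etc.) is an illustrative stand-in, not the PR's actual code or RLlib's internals.

```python
# Toy sketch of the training_iteration() pattern -- NOT the PR's actual code.
import random


class ToyWorker:
    """Stand-in for an RLlib rollout worker that produces sample batches."""

    def __init__(self):
        self.weights = 0.0

    def sample(self):
        # Fake "experience batch".
        return [random.random() for _ in range(4)]

    def set_weights(self, w):
        self.weights = w


class ToyTrainer:
    """One training iteration written imperatively: sample -> learn ->
    periodic target update -> broadcast weights."""

    def __init__(self, num_workers=7, target_update_freq=3):
        self.workers = [ToyWorker() for _ in range(num_workers)]
        self.weights = 0.0
        self.target_weights = 0.0
        self.target_update_freq = target_update_freq
        self.num_updates = 0

    def training_iteration(self):
        # 1) Collect experiences from all rollout workers.
        batch = [x for w in self.workers for x in w.sample()]
        # 2) "Learn" on the batch (a fake gradient step here).
        self.weights += 0.01 * (sum(batch) / len(batch))
        self.num_updates += 1
        # 3) APPO keeps a target network; refresh it periodically.
        if self.num_updates % self.target_update_freq == 0:
            self.target_weights = self.weights
        # 4) Broadcast the new weights back to the rollout workers.
        for w in self.workers:
            w.set_weights(self.weights)
        return {"num_updates": self.num_updates, "weights": self.weights}


if __name__ == "__main__":
    trainer = ToyTrainer()
    for _ in range(5):
        print(trainer.training_iteration())
```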

avnishn (Member) left a comment
I have 1 question but otherwise lgtm

Review thread on rllib/agents/ppo/appo.py (resolved)
avnishn (Member) commented May 10, 2022

This is all looking pretty good, although is it learning?

sven1977 (Contributor, Author) commented:

Hey @avnishn, please give this another go. I confirmed that Pong learning is still good, e.g. on only 7 workers:

```
(base) ray@ip-172-31-83-222:~/riot_games_atari_benchmarks/ray$ tail ~/output.txt 
Resources requested: 0/8 CPUs, 0/1 GPUs, 0.0/35.56 GiB heap, 0.0/17.78 GiB objects (0.0/1.0 accelerator_type:V100)
Result logdir: /home/ray/ray_results/pong-appo
Number of trials: 1/1 (1 TERMINATED)
+-------------------------------------+------------+---------------------+--------+------------------+---------+----------+----------------------+----------------------+--------------------+
| Trial name                          | status     | loc                 |   iter |   total time (s) |      ts |   reward |   episode_reward_max |   episode_reward_min |   episode_len_mean |
|-------------------------------------+------------+---------------------+--------+------------------+---------+----------+----------------------+----------------------+--------------------|
| APPO_PongNoFrameskip-v4_7fb9b_00000 | TERMINATED | 172.31.83.222:15223 |    154 |          1697.41 | 2931200 |    18.07 |                   20 |                    8 |            7679.47 |
+-------------------------------------+------------+---------------------+--------+------------------+---------+----------+----------------------+----------------------+--------------------+
```
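
For reference, a run like the one above could be launched along these lines. The hyperparameters here are illustrative only, not the tuned values (the tuned Pong benchmark config ships under rllib/tuned_examples):

```python
# Minimal sketch of launching APPO on Pong via Ray Tune; values illustrative.
import ray
from ray import tune

ray.init()

tune.run(
    "APPO",
    name="pong-appo",
    # Stop roughly at the reward level reached in the run above.
    stop={"episode_reward_mean": 18.0},
    config={
        "env": "PongNoFrameskip-v4",
        "num_workers": 7,   # matches the 7-worker run above
        "num_gpus": 1,
        "clip_rewards": True,
    },
)
```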

…a_zero_training_itr

# Conflicts:
#	rllib/agents/slateq/slateq.py
#	rllib/algorithms/ars/README.md
#	rllib/algorithms/ars/ars.py
#	rllib/algorithms/ars/ars_tf_policy.py
#	rllib/algorithms/ars/ars_torch_policy.py
#	rllib/algorithms/ars/tests/test_ars.py
#	rllib/algorithms/es/es.py
#	rllib/algorithms/es/es_tf_policy.py
#	rllib/algorithms/es/es_torch_policy.py
#	rllib/algorithms/es/optimizers.py
#	rllib/algorithms/es/tests/test_es.py
#	rllib/algorithms/es/utils.py
@sven1977 sven1977 merged commit 25001f6 into ray-project:master May 17, 2022
@sven1977 sven1977 deleted the appo_training_itr branch June 2, 2023 20:18