Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (#3967)" #4332

ericl · 2019-03-12T01:08:48Z

What do these changes do?

There seems to be a bug here, isolated to the changes in vtrace.py. Reverting that file and patching up multi_from_logits to call from_logits fixes the regression. Unfortunately, I don't have time to investigate this further, so we should revert this change and re-land it once the bug is found.

Interestingly, vtrace.py still passes its original unit tests.

The regression is fairly straightforward to reproduce. On any GPU machine, run:

atari-impala:
    env:
        grid_search:
            - BreakoutNoFrameskip-v4
    run: IMPALA
    config:
        sample_batch_size: 50
        train_batch_size: 500
        num_workers: 32
        num_envs_per_worker: 5
        clip_rewards: True
        lr_schedule: [
            [0, 0.0005],
            [20000000, 0.000000000001],
        ]

Within a few minutes it reaches 1M+ timesteps. Before the regression, you can expect a reward of 10+ at this point. After the regression, the reward tops out at 1-2.
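To get a feel for the learning rate at the 1M-timestep mark in the config above, here is a minimal sketch. It assumes the lr_schedule points are interpolated linearly and that the last value is held afterwards; the helper name lr_at is hypothetical and not part of RLlib.

```python
from bisect import bisect_right

# Schedule copied from the config above: (timestep, learning rate) pairs.
LR_SCHEDULE = [
    (0, 0.0005),
    (20_000_000, 0.000000000001),
]


def lr_at(timestep, schedule=LR_SCHEDULE):
    """Hypothetical helper: piecewise-linear interpolation over the schedule."""
    steps = [t for t, _ in schedule]
    # Clamp to the first/last value outside the scheduled range (assumption).
    if timestep <= steps[0]:
        return schedule[0][1]
    if timestep >= steps[-1]:
        return schedule[-1][1]
    # Find the segment [t0, t1] that contains the timestep.
    i = bisect_right(steps, timestep)
    (t0, v0), (t1, v1) = schedule[i - 1], schedule[i]
    frac = (timestep - t0) / (t1 - t0)
    return v0 + frac * (v1 - v0)


if __name__ == "__main__":
    for t in (0, 1_000_000, 10_000_000, 20_000_000):
        print(t, lr_at(t))
```

Under these assumptions, the learning rate at 1M timesteps is still about 95% of its initial value.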

Related issue number

Atari performance regression originally reported in #4329

…3967)" This reverts commit 962b17f.

AmplabJenkins · 2019-03-12T01:10:06Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/36/

ericl · 2019-03-12T01:34:29Z

First time build. Skipping changelog.
[Ray-Perf-Integration-PRB] $ /bin/bash /tmp/hudson4942004597160488522.sh
PR 4332 does not contain ray-core, early exiting
Test PASSed.

@robertnishihara is this misdetection due to a revert or is it some other bug?

pcmoritz · 2019-03-12T01:39:26Z

@ericl This is the new performance-testing Jenkins, which afaik only kicks in if the PR string contains ray-core (@devin-petersohn), i.e. if something performance-related has changed. ray-core should probably be changed to core.

ericl · 2019-03-12T01:39:59Z

Ah ok, didn't realize it had a different name.

hartikainen · 2019-03-12T01:58:33Z

python/ray/rllib/agents/impala/vtrace.py

-        log_rhos = get_log_rhos(target_action_log_probs,
-                                behaviour_action_log_probs)
-
+        log_rhos = target_action_log_probs - behaviour_action_log_probs


One potential source of the bug is if this guy has a wrong shape.

I checked the shapes here and it looked reasonable (at least they match).
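The shape concern above can be made concrete with a small NumPy sketch. For a MultiDiscrete action space, one plausible formulation sums the per-component log-probability differences so that log_rhos has shape [T, B]. The function name, the shapes, and the assumption of independent action components are illustrative only; this is not the code in vtrace.py. The last lines show why a subtraction that runs without error does not prove the shapes were right.

```python
import numpy as np


def log_rhos_multi_discrete(target_log_probs, behaviour_log_probs):
    """Hypothetical helper: log importance weights for a factored action.

    Each argument is a list with one [T, B] array per action component.
    Assuming independent components, the joint log-probability is the sum
    of the per-component log-probabilities, so the result is also [T, B].
    """
    target = np.stack(target_log_probs)        # [num_components, T, B]
    behaviour = np.stack(behaviour_log_probs)  # [num_components, T, B]
    if target.shape != behaviour.shape:
        raise ValueError(f"shape mismatch: {target.shape} vs {behaviour.shape}")
    return np.sum(target - behaviour, axis=0)  # [T, B]


T, B = 4, 3
rng = np.random.default_rng(0)
target = [np.log(rng.uniform(0.1, 1.0, size=(T, B))) for _ in range(2)]
behaviour = [np.log(rng.uniform(0.1, 1.0, size=(T, B))) for _ in range(2)]

log_rhos = log_rhos_multi_discrete(target, behaviour)
assert log_rhos.shape == (T, B)

# Pitfall: a [T, 1] array broadcasts against [T, B] without any error,
# so a plain subtraction can hide an operand with the wrong shape.
bad = target[0] - behaviour[0][:, :1]
assert bad.shape == (T, B)
```

An explicit shape check on both operands, as in the helper, catches the broadcasting case that a check on the result alone would miss.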

AmplabJenkins · 2019-03-12T04:11:25Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12765/

…project#3967)" (ray-project#4332)" This reverts commit 3c41cb9.

stefanpantic · 2019-03-12T14:37:42Z

@ericl Hi Eric, I think I fixed the issue. I've opened a pull request here: #4338

* Revert "Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (#3967)" (#4332)" This reverts commit 3c41cb9. * Fix a bug with log rhos for vtrace * Reformat * lint

Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (ray-project#…

e4cb71f

…3967)" This reverts commit 962b17f.

ericl assigned hartikainen Mar 12, 2019

ericl added the regression label Mar 12, 2019

hartikainen approved these changes Mar 12, 2019

hartikainen reviewed Mar 12, 2019

ericl merged commit 3c41cb9 into ray-project:master Mar 12, 2019

stefanpantic added a commit to wingman-ai/ray that referenced this pull request Mar 12, 2019

Revert "Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (ray-…

a63e581

…project#3967)" (ray-project#4332)" This reverts commit 3c41cb9.

stefanpantic mentioned this pull request Mar 12, 2019

Fix multi discrete #4338

Merged

ericl pushed a commit that referenced this pull request Mar 13, 2019

Fix multi discrete (#4338)

2202a81

* Revert "Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (#3967)" (#4332)" This reverts commit 3c41cb9. * Fix a bug with log rhos for vtrace * Reformat * lint
