[RLLib] Fix OneHotPreprocessor, use gym.spaces.utils.flatten #27540

olipinski · 2022-08-05T08:10:09Z

Why are these changes needed?

As per #27496, the current implementation of OneHotPreprocessor loses information when one-hot encoding. Using the gym implementation (gym.spaces.utils.flatten) should avoid this issue.
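To make the idea concrete, here is a minimal NumPy sketch of an offset-based one-hot encoding for a multidimensional MultiDiscrete-style space, the case the unit tests later in this thread cover. It is an illustration only, not the code from this PR and not gym's implementation; the function names `one_hot_multidiscrete` and `decode_multidiscrete` and the example `nvec` are assumptions made for the example.

```python
import numpy as np


def one_hot_multidiscrete(nvec, x, dtype=np.float32):
    """One-hot encode `x`, where component i takes values in [0, nvec[i]).

    Hypothetical helper for illustration; `nvec` and `x` may have any
    (matching) number of dimensions and are flattened before encoding.
    """
    nvec = np.asarray(nvec).flatten()
    x = np.asarray(x).flatten()
    if x.shape != nvec.shape:
        raise ValueError("observation and nvec must have the same shape")
    if np.any(x < 0) or np.any(x >= nvec):
        raise ValueError("observation is outside the space")
    # Start index of each component's segment in the flat output vector.
    offsets = np.concatenate(([0], np.cumsum(nvec)[:-1]))
    onehot = np.zeros(int(nvec.sum()), dtype=dtype)
    onehot[offsets + x] = 1
    return onehot


def decode_multidiscrete(nvec, onehot):
    """Invert the encoding, showing that no information was lost."""
    shape = np.asarray(nvec).shape
    sizes = np.asarray(nvec).flatten()
    segments = np.split(np.asarray(onehot), np.cumsum(sizes)[:-1])
    return np.array([int(np.argmax(s)) for s in segments]).reshape(shape)


nvec = [[2, 3], [4, 5]]   # a 2D space with 2 + 3 + 4 + 5 = 14 categories
obs = [[1, 2], [3, 0]]
encoded = one_hot_multidiscrete(nvec, obs)
print(np.flatnonzero(encoded))               # indices 1, 4, 8 and 9 are set
print(decode_multidiscrete(nvec, encoded))   # recovers the original observation
```

Because every component gets its own segment of the output vector, the encoding can be inverted, which is the property a lossy encoder would fail.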

Related issue number

Closes #27496.

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

gjoliver · 2022-08-05T16:45:37Z

nice! thanks a ton for the fix.
can you help add a quick unit test in rllib/model/tests/test_preprocess.py for this behavior?
the team is really trying to push for better quality and testing in the long term. a fix like this is perfect for adding a unit test so there won't be a regression.

olipinski · 2022-08-08T08:50:16Z

> can you help add a quick unit test in rllib/model/tests/test_preprocess.py for this behavior?

I've added a quick test just now, both for 2D and 3D multidiscretes!
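The test added in the PR is not shown in this thread. As a rough sketch of what a test for 2D and 3D multidiscretes might check, the example below verifies three properties of an encoded observation: its length, the number of set entries, and that each segment decodes back to the original value. The `encode` function is a hypothetical stand-in for the preprocessor's transform step, and the class and test names are assumptions.

```python
import unittest

import numpy as np


def encode(nvec, obs):
    """Hypothetical stand-in for the preprocessor's transform step."""
    sizes = np.asarray(nvec).flatten()
    values = np.asarray(obs).flatten()
    # One one-hot segment per component, concatenated in flattened order.
    segments = [np.eye(n, dtype=np.float32)[v] for n, v in zip(sizes, values)]
    return np.concatenate(segments)


class TestMultiDimOneHot(unittest.TestCase):
    def _check(self, nvec, obs):
        sizes = np.asarray(nvec).flatten()
        values = np.asarray(obs).flatten()
        encoded = encode(nvec, obs)
        # The flat vector has one slot per category of every component.
        self.assertEqual(encoded.shape, (int(sizes.sum()),))
        # Exactly one entry is set per component.
        self.assertEqual(int(encoded.sum()), sizes.size)
        # Each segment decodes back to the value it encodes.
        segments = np.split(encoded, np.cumsum(sizes)[:-1])
        for segment, value in zip(segments, values):
            self.assertEqual(int(np.argmax(segment)), int(value))

    def test_2d(self):
        self._check([[2, 3], [4, 5]], [[1, 2], [3, 0]])

    def test_3d(self):
        nvec = np.full((2, 2, 2), 3)
        obs = np.arange(8).reshape(2, 2, 2) % 3
        self._check(nvec, obs)
```

Saved as a module, this can be run with `python -m unittest`.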

olipinski · 2022-08-30T10:10:06Z

Do we need to do anything else here?

gjoliver

thanks a ton for the fix.
I will ping @sven1977 to get this merged.

olipinski · 2022-09-01T08:59:40Z

Thanks for the approval!

sven1977 · 2022-09-07T08:13:28Z

Thanks for the PR, @olipinski. Could you make sure the autoregressive_action_dist test case passes? There seems to be a connection to your changes:

https://buildkite.com/ray-project/ray-builders-pr/builds/43759#0182fa77-a584-4bef-a8ef-91f7e0ff4279

The script that fails is rllib/examples/autoregressive_action_dist.py run with the --framework torch flag.

olipinski · 2022-09-07T09:24:47Z

> https://buildkite.com/ray-project/ray-builders-pr/builds/43759#0182fa77-a584-4bef-a8ef-91f7e0ff4279
>
> The script that fails is rllib/examples/autoregressive_action_dist.py run with the --framework torch flag.

This issue appears to be caused by torch.nn.Linear supporting only float32 input; the old code cast every one-hot-encoded action to float32. I have added the cast to the preprocessor, though I'm not sure whether it belongs there or whether the dtype should instead be specified in the environment.
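A small NumPy sketch of the dtype issue described above, under the assumption that the encoder produces an integer-typed vector. NumPy is used here only to keep the example dependency-free: it silently promotes mixed dtypes, whereas PyTorch, as this comment describes, is stricter about the input dtype. The `to_model_input` helper and the `weights` array (standing in for a float32 linear layer) are hypothetical names.

```python
import numpy as np


def to_model_input(onehot):
    """Cast an encoded observation to float32 before it reaches the model."""
    return np.asarray(onehot, dtype=np.float32)


# An integer one-hot vector, as an integer-typed space might produce.
int_onehot = np.zeros(5, dtype=np.int64)
int_onehot[3] = 1

# Stand-in for the float32 weights of a linear layer (hypothetical).
weights = np.ones((2, 5), dtype=np.float32)

# Without the cast, NumPy promotes the result to float64.
print((weights @ int_onehot).dtype)

# With the cast, the computation stays in float32.
model_input = to_model_input(int_onehot)
print((weights @ model_input).dtype)
```

Doing the cast in one place, as the PR does in the preprocessor, means individual environments do not each have to declare a float32 dtype.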

gjoliver · 2022-09-08T08:40:13Z

I think this is fine. thanks 👍

olipinski · 2022-09-08T08:50:32Z

Perfect, should be all good then!

sven1977 · 2022-09-09T13:52:07Z

Thanks again, @olipinski, and @gjoliver for the review!

olipinski requested review from sven1977, gjoliver, avnishn, ArturNiederfahrenhorst, smorad, maxpumperla, kouroshHakha and krfricke as code owners August 5, 2022 08:10

olipinski changed the title ~~Fix OneHotPreprocessor, use gym.spaces.utils.flatten~~ [RLLib] Fix OneHotPreprocessor, use gym.spaces.utils.flatten Aug 5, 2022

gjoliver approved these changes Aug 30, 2022

gjoliver force-pushed the preprocess-multi-discrete branch from d5667ad to 65b77a7 Compare August 30, 2022 16:48

olipinski and others added 3 commits September 1, 2022 11:32

Fix OneHotPreprocessor, use gym.spaces.utils.flatten

6748009

Signed-off-by: Olaf Lipinski <[email protected]>

Add unit tests for multidimensional multidiscretes

77a6f32

Signed-off-by: Olaf Lipinski <[email protected]>

ci

93cb3d0

gjoliver force-pushed the preprocess-multi-discrete branch from 5b4babc to 93cb3d0 Compare September 1, 2022 18:32

Fix issue with autoregressive torch model

efb8f39

Signed-off-by: Olaf Lipinski <[email protected]>

sven1977 merged commit 3dadc74 into ray-project:master Sep 9, 2022

olipinski deleted the preprocess-multi-discrete branch September 12, 2022 09:41

ilee300a pushed a commit to ilee300a/ray that referenced this pull request Sep 12, 2022

[RLlib] Fix OneHotPreprocessor, use gym.spaces.utils.flatten. (ray-pr…

c091c14

…oject#27540) Signed-off-by: ilee300a <[email protected]>

justinvyu pushed a commit to justinvyu/ray that referenced this pull request Sep 14, 2022

[RLlib] Fix OneHotPreprocessor, use gym.spaces.utils.flatten. (ray-pr…

c3cb0cb

…oject#27540)
