[RLlib] Enable Bandits to work in batches mode(s) (vector envs + multiple workers + train_batch_sizes > 1). #22465

sven1977 · 2022-02-17T12:27:42Z

Enable Bandits to work in batches mode(s) (vector envs + multiple workers + train_batch_sizes > 1).

Removes some TODOs left by the original authors at Amazon.
For the simple test cases we have, batching does not improve results, because it breaks the online assumption. However, for larger action spaces and bandit models, distributing and batching/vectorization could be very useful.

Why are these changes needed?

Related issue number

Checks

I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

gjoliver

Very nice PR, thanks. Just one quick suggestion about a comment.

gjoliver · 2022-02-17T18:00:33Z

rllib/agents/bandit/bandit_torch_model.py

-        self.arm.partial_fit(x[:, action_id], y)
+        for i, arm in enumerate(arms):
+            action_id = arm.item()
+            self.arm.partial_fit(x[[i], action_id], y[[i]])


Maybe we can add a comment here: for ParametricLinearModel, x holds the features of all the potential arms, and arms tells us which one was actually pulled. The features pointed at by arms should then be used to train the actual model "arm" ...
I know it's twisted, maybe it's better left unexplained ...
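To make the point above concrete, here is a minimal pure-Python sketch of the per-row update loop from the diff. It is not the RLlib implementation: `RecordingArm` and `batched_partial_fit` are hypothetical names invented for illustration, and nested lists stand in for tensors. The expression `[x[i][action_id]]` plays the role of the tensor indexing `x[[i], action_id]`, which keeps a leading batch dimension of size 1.

```python
# Hypothetical stand-in for the single shared model ("arm") that a
# parametric bandit trains. It only records what it is fitted on.
class RecordingArm:
    def __init__(self):
        self.samples = []  # list of (feature_vector, reward) pairs

    def partial_fit(self, features, rewards):
        # features: list of feature vectors; rewards: list of floats.
        for f, r in zip(features, rewards):
            self.samples.append((list(f), r))


def batched_partial_fit(arm, x, arms, y):
    """Update `arm` one batch row at a time.

    x[i][a] holds the features of candidate arm `a` in batch row `i`,
    arms[i] is the index of the arm that was pulled in row `i`,
    y[i] is the reward observed for that pull.
    """
    for i, action_id in enumerate(arms):
        # Analogue of x[[i], action_id] and y[[i]]: a batch of size one
        # containing only the features of the arm that was pulled.
        arm.partial_fit([x[i][action_id]], [y[i]])


# Batch of 2 rows, 3 candidate arms per row, 2 features per arm.
x = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]],
]
arms = [2, 0]   # row 0 pulled arm 2, row 1 pulled arm 0
y = [1.0, 0.5]  # observed rewards

model = RecordingArm()
batched_partial_fit(model, x, arms, y)
print(model.samples)
```

Only the feature vectors of the pulled arms reach the model; the other candidates in each row are ignored, which is the behavior the suggested code comment would document.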

sven1977 added 3 commits February 17, 2022 12:42

wip.

2a471ae

wip.

d10e895

wip.

59a5748

sven1977 requested review from gjoliver and avnishn as code owners February 17, 2022 12:27

sven1977 assigned avnishn and gjoliver and unassigned avnishn Feb 17, 2022

gjoliver approved these changes Feb 17, 2022

sven1977 added 2 commits February 17, 2022 20:04

wip.

717e864

Merge branch 'master' of https://github.com/ray-project/ray into band…

ccfee56

…its_batched

sven1977 merged commit c58cd90 into ray-project:master Feb 17, 2022

fishbone added a commit that referenced this pull request Feb 18, 2022

Revert "[RLlib] Enable Bandits to work in batches mode(s) (vector env…

f277366

…s + multiple workers + train_batch_sizes > 1). (#22465)" This reverts commit c58cd90.

fishbone mentioned this pull request Feb 18, 2022

Revert "[RLlib] Enable Bandits to work in batches mode(s) (vector envs + multiple workers + train_batch_sizes > 1)." #22490

Closed

rkooo567 added a commit to rkooo567/ray that referenced this pull request Feb 18, 2022

Revert "[RLlib] Enable Bandits to work in batches mode(s) (vector env…

1615b49

…s + multiple workers + train_batch_sizes > 1). (ray-project#22465)" This reverts commit c58cd90.

rkooo567 mentioned this pull request Feb 18, 2022

Revert "[RLlib] Enable Bandits to work in batches mode(s) (vector env… #22497

Closed

6 tasks

simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Feb 27, 2022

[RLlib] Enable Bandits to work in batches mode(s) (vector envs + mult…

c204de2

…iple workers + train_batch_sizes > 1). (ray-project#22465)

sven1977 deleted the bandits_batched branch June 2, 2023 20:17
