[RLlib] Bandit documentation enhancements. #22427
Conversation
Thanks for the update. Just a minor suggestion.
  Contextual Bandits
  ~~~~~~~~~~~~~~~~~~

  The Multi-armed bandit (MAB) problem provides a simplified RL setting that
- involves learning to act under one situation only, i.e. the state is fixed.
+ involves learning to act under one situation only, i.e. the observation/state is fixed.
  Contextual bandit is extension of the MAB problem, where at each
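To illustrate the distinction the doc text draws (MAB: one fixed situation; contextual bandit: an observation/context varies per round), here is a minimal, self-contained sketch, not RLlib code: a contextual epsilon-greedy agent that keeps a separate running value estimate per (context, arm) pair. All names (`ContextualEpsilonGreedy`, the toy environment) are illustrative assumptions, not part of the PR.

```python
import random

# Illustrative sketch (not RLlib code): a contextual epsilon-greedy bandit.
# In a plain MAB the context is fixed; here each round presents a context,
# and the agent learns value estimates per (context, arm) pair.


class ContextualEpsilonGreedy:
    def __init__(self, n_arms, epsilon=0.1):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.counts = {}  # (context, arm) -> number of pulls
        self.values = {}  # (context, arm) -> running mean reward

    def select_arm(self, context):
        # Explore uniformly with probability epsilon, else exploit
        # the best-known arm for this context.
        if random.random() < self.epsilon:
            return random.randrange(self.n_arms)
        estimates = [self.values.get((context, a), 0.0) for a in range(self.n_arms)]
        return max(range(self.n_arms), key=lambda a: estimates[a])

    def update(self, context, arm, reward):
        # Incremental running-mean update of the (context, arm) estimate.
        key = (context, arm)
        n = self.counts.get(key, 0) + 1
        self.counts[key] = n
        old = self.values.get(key, 0.0)
        self.values[key] = old + (reward - old) / n


# Toy environment (assumed for the demo): in context 0 arm 1 pays off,
# in context 1 arm 0 does -- so the best arm depends on the context.
random.seed(0)
bandit = ContextualEpsilonGreedy(n_arms=2)
for _ in range(2000):
    ctx = random.randrange(2)
    arm = bandit.select_arm(ctx)
    reward = 1.0 if arm == (1 - ctx) else 0.0
    bandit.update(ctx, arm, reward)
```

After training, the learned value estimates differ per context, which is exactly what a non-contextual MAB (a single shared estimate per arm) cannot represent.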
From this paper (http://rob.schapire.net/papers/www10.pdf): MAB is a special case of contextual bandit where the context (user) and the arms are both fixed.
Maybe we can say "i.e., the context and arms are both fixed."
Perfect, thanks for the hint and the review, @gjoliver! Will add this before merging.
Bandit documentation enhancements.
Why are these changes needed?
Related issue number
Checks
I've run scripts/format.sh to lint the changes in this PR.