[RLlib] Prioritized version of an episode-based replay buffer. #42832
Conversation
…change 'self_indices' from 'list' to 'dict' to look up indices at the indices of the 'SegmentTree's. Signed-off-by: Simon Zehnder <[email protected]>
…Buffer' and added 'update_priorities()', 'set_state()' and 'get_state()'. Signed-off-by: Simon Zehnder <[email protected]>
…used. It needs a multi-agent version of the episode replay buffer, however. Signed-off-by: Simon Zehnder <[email protected]>
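The commits above touch the core bookkeeping of the prioritized buffer: a dict that resolves SegmentTree slots back to stored samples, and an update_priorities() method that rewrites priorities after learning. The snippet below is a minimal, self-contained sketch of that bookkeeping under the usual proportional-prioritization scheme (priority = (|TD error| + eps) ** alpha); all names and data structures here are illustrative stand-ins, not the actual RLlib implementation.

```python
# Illustrative sketch only -- plain dicts stand in for RLlib's SegmentTrees.
# tree_idx_to_sample mirrors the dict mentioned in the commit above: it maps a
# tree slot back to the (episode_id, timestep) it stores.
tree_idx_to_sample = {}   # hypothetical name
sum_tree = {}             # stand-in for a sum SegmentTree (slot -> priority)
min_tree = {}             # stand-in for a min SegmentTree (slot -> priority)

ALPHA = 0.6   # prioritization exponent (assumed value)
EPS = 1e-6    # small constant so no priority becomes exactly zero


def add_timestep(tree_idx: int, episode_id: str, timestep: int,
                 initial_priority: float = 1.0) -> None:
    """Registers a stored transition under a tree slot with an initial priority."""
    tree_idx_to_sample[tree_idx] = (episode_id, timestep)
    sum_tree[tree_idx] = initial_priority
    min_tree[tree_idx] = initial_priority


def update_priorities(tree_indices, td_errors) -> None:
    """Rewrites priorities of the last sampled slots from their TD errors."""
    for idx, td_error in zip(tree_indices, td_errors):
        priority = (abs(float(td_error)) + EPS) ** ALPHA
        sum_tree[idx] = priority
        min_tree[idx] = priority


# Usage: register two transitions, then update their priorities after a
# learning step produced TD errors for them.
add_timestep(0, "episode_a", 3)
add_timestep(1, "episode_a", 4)
update_priorities([0, 1], [0.25, 1.7])
print(tree_idx_to_sample[1], sum_tree[1])
```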
@@ -21,6 +23,38 @@
logger = logging.getLogger(__name__)


def update_priorities_in_episode_replay_buffer(
    replay_buffer: EpisodeReplayBuffer,
nit: PrioritizedEpisodeReplayBuffer ?
Basically yes, but if it is not a prioritized buffer, the function just skips and returns. This way we can use either normal buffers or prioritized ones. The other option would be to test for the kind of buffer in the algorithms themselves.
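For reference, here is a self-contained sketch of the dispatch pattern described in this thread, with stand-in classes so the snippet runs on its own (the real classes live under rllib/utils/replay_buffers/): the helper silently returns for non-prioritized buffers, so an algorithm can call it unconditionally.

```python
# Stand-in classes, just to make the sketch runnable; not the real RLlib classes.
class EpisodeReplayBuffer:
    pass


class PrioritizedEpisodeReplayBuffer(EpisodeReplayBuffer):
    def update_priorities(self, priorities):
        print(f"Updating {len(priorities)} priorities.")


def update_priorities_in_episode_replay_buffer(replay_buffer, priorities):
    # Non-prioritized buffers are skipped silently, so callers do not have to
    # check the buffer type themselves.
    if not isinstance(replay_buffer, PrioritizedEpisodeReplayBuffer):
        return
    replay_buffer.update_priorities(priorities)


update_priorities_in_episode_replay_buffer(EpisodeReplayBuffer(), [0.5])             # no-op
update_priorities_in_episode_replay_buffer(PrioritizedEpisodeReplayBuffer(), [0.5])  # updates
```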
Resolved review comments on rllib/utils/replay_buffers/prioritized_episode_replay_buffer.py.
Co-authored-by: Sven Mika <[email protected]> Signed-off-by: simonsays1980 <[email protected]>
LGTM.
Why are these changes needed?
Changing to the new EnvRunner API for sampling, and therein focusing on episodes instead of batches, means we need replay buffers that can handle episodes. The EpisodeReplayBuffer is a simple replay buffer that deals with sequences but offers no prioritized replay. This PR changes that by adding a prioritized replay buffer that stores episodes and replays the transitions within them in a prioritized way.

Related issue number
#42369
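As a self-contained illustration of the mechanism this PR adds on top of episode storage, the toy below shows proportional prioritized sampling with a plain list instead of RLlib's SegmentTree: transitions with a larger priority (e.g. a larger |TD error|) are drawn more often. It is a conceptual sketch, not the buffer's actual sampling code.

```python
import random

# Toy priorities, one per stored transition (made-up numbers).
priorities = [0.1, 2.0, 0.5, 1.4]


def sample_index(priorities):
    """Draws an index with probability proportional to its priority."""
    r = random.uniform(0.0, sum(priorities))
    cumulative = 0.0
    for i, p in enumerate(priorities):
        cumulative += p
        if r <= cumulative:
            return i
    return len(priorities) - 1


# Transition 1 (priority 2.0) should dominate the sample counts.
counts = [0] * len(priorities)
for _ in range(10_000):
    counts[sample_index(priorities)] += 1
print(counts)
```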
Checks
- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- If I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.