State available in SampleBatch and ReplayBuffer #43

Open
wants to merge 6 commits into
base: releases/0.8.6

@Edilmo Edilmo commented Oct 1, 2020

Why are these changes needed?

Currently, for recurrent/recursive models, the state is available only for policy evaluation during training. It is not stored in the SampleBatch, so it is not accessible at the execution-plan level, which in turn means it never reaches the replay buffer. As a result, apex-like algorithms cannot use memory models in RLlib right now.

This PR is the very first step towards supporting memory for this kind of algorithm.
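
To illustrate the idea, here is a minimal, library-free sketch of what "state in the batch and in the replay buffer" could look like. It does not use RLlib's actual SampleBatch or ReplayBuffer classes. `ToyReplayBuffer`, `toy_recurrent_step`, `collect`, and the column names `state_in_0` / `state_out_0` are all hypothetical and chosen only for illustration. The point is that the recurrent state is stored as ordinary columns next to observations and actions, so anything that samples from the buffer gets the state back.

```python
import random
from collections import deque

import numpy as np


class ToyReplayBuffer:
    """Hypothetical FIFO replay buffer that stores transitions as dicts."""

    def __init__(self, capacity, seed=0):
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition):
        self._storage.append(transition)

    def sample(self, batch_size):
        items = self._rng.sample(list(self._storage), batch_size)
        # Stack every column, including the state columns, into a batch.
        return {k: np.stack([it[k] for it in items]) for k in items[0]}

    def __len__(self):
        return len(self._storage)


def toy_recurrent_step(obs, state):
    """Stand-in for a recurrent policy: returns (action, next_state)."""
    next_state = np.tanh(0.5 * state + obs.mean())
    action = int(next_state.sum() > 0)
    return action, next_state


def collect(num_steps, state_size=4, seed=0):
    """Roll out the toy policy and store the state with each transition."""
    rng = np.random.default_rng(seed)
    state = np.zeros(state_size, dtype=np.float32)
    buffer = ToyReplayBuffer(capacity=100)
    for _ in range(num_steps):
        obs = rng.normal(size=3).astype(np.float32)
        action, next_state = toy_recurrent_step(obs, state)
        buffer.add({
            "obs": obs,
            "actions": np.int64(action),
            "state_in_0": state.copy(),       # state fed into the model
            "state_out_0": next_state.copy(),  # state produced by the model
        })
        state = next_state
    return buffer
```

With this layout, `collect(10).sample(4)["state_in_0"]` is a `(4, 4)` array that a replay-based learner could feed back into the model as its initial state.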

Related issue number

None

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/latest/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failure rates at https://ray-travis-tracker.herokuapp.com/.
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested (please justify below)
