[Bug]: Episode start flag is never set for off policy algorithms #2011

Open
5 tasks done
josndan opened this issue Sep 20, 2024 · 1 comment
Labels
question Further information is requested

Comments


josndan commented Sep 20, 2024

🐛 Bug

In the _sample_action method of the OffPolicyAlgorithm class, self.predict is called, but the episode_start flag is never passed to it. As a result, the flag is never set for any off-policy algorithm.

To Reproduce

No response

Relevant log output / Error message

No response

System Info

No response

Checklist

  • My issue does not relate to a custom gym environment. (Use the custom gym env template instead)
  • I have checked that there is no similar issue in the repo
  • I have read the documentation
  • I have provided a minimal and working example to reproduce the bug
  • I've used the markdown code blocks for both code and stack traces.
josndan added the bug (Something isn't working) label Sep 20, 2024
araffin added the question (Further information is requested) label and removed the bug (Something isn't working) label Sep 21, 2024
Member

araffin commented Sep 22, 2024

Hello,
that's correct, because currently only RecurrentPPO makes use of states (LSTM states) and episode starts (to reset those states).
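
To illustrate why the flag only matters for recurrent policies, here is a minimal, self-contained sketch. It does not use the library's code: `StatelessPolicy` and `RecurrentPolicy` are hypothetical stand-ins, and the running-sum "hidden state" is an assumption made purely for illustration. Only the overall shape of `predict` (taking an observation, a state, and an episode-start mask, and returning an action and a state) mirrors what is discussed above.

```python
from typing import List, Optional, Tuple

Vector = List[float]


class StatelessPolicy:
    """Hypothetical feed-forward policy: the action depends only on the observation."""

    def predict(
        self,
        obs: Vector,
        state: Optional[Vector] = None,
        episode_start: Optional[List[bool]] = None,
    ) -> Tuple[Vector, Optional[Vector]]:
        # state and episode_start are accepted but have no effect here,
        # so omitting episode_start changes nothing.
        action = [2.0 * o for o in obs]
        return action, state


class RecurrentPolicy:
    """Hypothetical recurrent policy: keeps a running sum as its hidden state."""

    def predict(
        self,
        obs: Vector,
        state: Optional[Vector] = None,
        episode_start: Optional[List[bool]] = None,
    ) -> Tuple[Vector, Vector]:
        if state is None:
            state = [0.0] * len(obs)
        if episode_start is None:
            episode_start = [False] * len(obs)
        # Reset the hidden state for every environment that starts a new episode.
        state = [0.0 if start else s for s, start in zip(state, episode_start)]
        # Update the hidden state with the new observation.
        state = [s + o for s, o in zip(state, obs)]
        return list(state), state


if __name__ == "__main__":
    stateless = StatelessPolicy()
    print(stateless.predict([1.0, 2.0])[0])
    print(stateless.predict([1.0, 2.0], episode_start=[True, False])[0])

    recurrent = RecurrentPolicy()
    _, hidden = recurrent.predict([1.0, 1.0])
    print(recurrent.predict([1.0, 1.0], state=hidden, episode_start=[True, False])[0])
```

In this sketch the stateless policy returns the same action with or without the mask, while the recurrent one clears its state only for the environment whose episode just started.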
