[rllib] Fix torch c51 dqn #16716
Hmm @Souphis, it looks like a test is failing. Is it possible to investigate that failure?
I don't think this PR is related to the failure, but I will investigate it.
@Souphis, thanks for this PR! Taking a look at the failing test. I think it's not related to this PR, so it should be good, but I'll confirm. ...
@sven1977 I ran the tests locally once again, and both the td3 and dqn tests passed. So I think this error is not related to this PR.
Merged with master, which has a fix for this test case. Waiting for tests to pass again. ...
@Souphis, agreed. Will merge as soon as everything passes. Should be today :)
Thanks for the fixes, @Souphis!
Why are these changes needed?
Currently, it is not possible to train a C51 torch agent. The output size of the value head in DqnTorchModel should equal num_atoms, but it is currently 1. In addition, the noisy option has no effect on the value head in the current implementation. This PR also includes minor fixes in QLoss; for example, the target probabilities should be detached before the loss is calculated.
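To illustrate the two main points above, here is a minimal PyTorch sketch. It is not the RLlib implementation: `DuelingC51Head` and `categorical_loss` are hypothetical names, the noisy-layer option is left out, and the Bellman projection of the target distribution onto the fixed support is replaced by a random stand-in distribution. It only shows a value head that emits one logit per atom and a loss whose target probabilities are detached.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DuelingC51Head(nn.Module):
    """Hypothetical distributional dueling head (not RLlib's model class)."""

    def __init__(self, in_dim, num_actions, num_atoms):
        super().__init__()
        self.num_actions = num_actions
        self.num_atoms = num_atoms
        # One logit per (action, atom) pair.
        self.advantage = nn.Linear(in_dim, num_actions * num_atoms)
        # The value head emits num_atoms logits, not a single scalar.
        self.value = nn.Linear(in_dim, num_atoms)

    def forward(self, features):
        adv = self.advantage(features).view(-1, self.num_actions, self.num_atoms)
        val = self.value(features).view(-1, 1, self.num_atoms)
        # Dueling combination per atom; val broadcasts over the action axis.
        return val + adv - adv.mean(dim=1, keepdim=True)


def categorical_loss(logits_taken, target_probs):
    """Cross-entropy between the target and predicted atom distributions."""
    # Detach so no gradient flows back through the target distribution.
    target_probs = target_probs.detach()
    log_probs = F.log_softmax(logits_taken, dim=-1)
    return -(target_probs * log_probs).sum(dim=-1).mean()


torch.manual_seed(0)
head = DuelingC51Head(in_dim=8, num_actions=3, num_atoms=51)
features = torch.randn(4, 8)
logits = head(features)  # shape: [batch, num_actions, num_atoms]

# Select the atom logits of the action taken in each sample.
actions = torch.tensor([0, 2, 1, 1])
logits_taken = logits[torch.arange(4), actions]  # shape: [batch, num_atoms]

# Stand-in for the projected target distribution (assumption: in real C51
# this comes from the target network plus the Bellman projection).
target_logits = torch.randn(4, 51, requires_grad=True)
target_probs = F.softmax(target_logits, dim=-1)

loss = categorical_loss(logits_taken, target_probs)
loss.backward()
```

After `backward()`, the parameters of `head` have gradients while `target_logits.grad` stays `None`, which is the effect the detach is meant to have.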
Related issue number
Checks
I've run scripts/format.sh to lint the changes in this PR.