Enable autograd graph to propagate after multi-device syncing (for loss functions in ddp) #2754
base: master
Conversation
That sounds good to me, but can we add a test for this enhancement?
Thanks for the prompt response @Borda. I'm thinking that I can make an additional unittest in …
Codecov Report: all modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@           Coverage Diff            @@
##           master   #2754     +/-   ##
=========================================
- Coverage      69%     69%       -0%
=========================================
  Files         329     316       -13
  Lines       18077   17914      -163
=========================================
- Hits        12496   12336      -160
+ Misses       5581    5578        -3
yeah, that sounds good to me :)
Force-pushed from 1ba6fb3 to 6598ab8
Force-pushed from 6c926d7 to 1d0dabe
Update: to accommodate both cases, where tensors from different ranks have the same or different shapes, the line that puts the original tensor (holding the AD graph) back into the gathered list was added in two places in the code. Because of the two cases, I wrote two unittests to account for each. Interestingly, both pass.
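The "put the original tensor back" line mentioned above can be illustrated with a hypothetical single-process sketch (not the actual torchmetrics code): `torch.distributed.all_gather` writes detached tensors into the output list, so this rank's own gathered copy carries no gradient history; restoring the original tensor at this rank's index reconnects the autograd graph.

```python
import torch

# Hypothetical stand-in values; in real DDP these come from the process group.
rank, world_size = 0, 2

x = torch.ones(3, requires_grad=True)
local = x * 2  # local result that still holds the autograd graph

# Stand-in for the detached outputs of all_gather on this rank.
gathered = [local.detach().clone() for _ in range(world_size)]

# The one-line fix: put the original tensor (with its AD graph) back.
gathered[rank] = local

loss = torch.stack(gathered).sum()
loss.backward()  # gradients flow only through the restored slot
```

Without the restoring line, `loss` would be built entirely from detached tensors and `backward()` would fail, since nothing in the graph requires grad.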
That is strange and warrants some more investigation...
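For the different-shape case discussed above, a common approach (sketched here as a hypothetical single-process simulation, not the library's actual code) is to pad each rank's tensor to the largest length before gathering and slice the results back to their true lengths afterwards.

```python
import torch
import torch.nn.functional as F

# Stand-ins for tensors held by two different ranks.
tensors = [torch.arange(2.0), torch.arange(5.0)]
max_len = max(t.numel() for t in tensors)

# Right-pad every tensor with zeros up to the largest length.
padded = [F.pad(t, (0, max_len - t.numel())) for t in tensors]

# After a real all_gather, every rank would hold `padded`;
# undo the padding to recover the original per-rank tensors.
gathered = [p[: t.numel()] for p, t in zip(padded, tensors)]
```

Because the padded copies are new tensors, this path also needs the original tensor placed back into the gathered list for the autograd graph to survive.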
What does this PR do?
Single-line enhancement proposed in #2745, that is, to enable the propagation of the autograd graph after the all_gather operation. This is useful for syncing loss functions in a ddp setting.

Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃
📚 Documentation preview 📚: https://torchmetrics--2754.org.readthedocs.build/en/2754/