This is a known PyTorch issue that many of our users have experienced. Here are the steps to resolve it.
How to identify?
The telltale sign of this issue is that the code hangs indefinitely in the baton.wait() call in torch/utils/cpp_extension.py during just-in-time compilation of CUDA code.
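One way to confirm where a run is stuck is to have Python print every thread's stack after a timeout. The sketch below uses only the standard-library faulthandler module. The helper names watch_for_hang and stop_watching and the 120-second default are illustrative assumptions, not part of PyTorch.

```python
import faulthandler
import sys


def watch_for_hang(timeout_s=120.0, file=None):
    """Dump all thread stacks every `timeout_s` seconds while the process runs.

    `file` must be a real file object (it needs a file descriptor) and must
    stay open until stop_watching() is called; it defaults to sys.stderr.
    """
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)


def stop_watching():
    """Cancel the pending dump once the slow section has finished."""
    faulthandler.cancel_dump_traceback_later()


# Hypothetical usage around the code that triggers the JIT build:
#   watch_for_hang(120.0)
#   ...import or call the module that compiles the CUDA extension...
#   stop_watching()
```

If the dumped stack keeps showing a frame in baton.wait() inside torch/utils/cpp_extension.py, the run is most likely waiting on a stale lock.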
Why it happens?
This seems to happen when the Python thread is killed for some reason during C++/CUDA compilation, so PyTorch never gets to remove the lock file for the C++ extension. The next run then waits on that stale lock indefinitely.
How to fix?
Follow the fix here: zhou13/neurvps#1 (comment)
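As a rough sketch of what clearing a stale lock can look like, the snippet below searches the extension cache for leftover lock files. It rests on several assumptions: that the lock is a file named lock inside each extension's build directory, and that the cache lives in the directory given by the TORCH_EXTENSIONS_DIR environment variable or, failing that, in ~/.cache/torch_extensions. Check both against your PyTorch version and the linked comment. The function names are hypothetical.

```python
import os
from pathlib import Path


def default_extensions_root():
    """Return the assumed cache root for JIT-compiled extensions."""
    override = os.environ.get("TORCH_EXTENSIONS_DIR")
    if override:
        return Path(override)
    # Assumed default location on Linux; other platforms may differ.
    return Path.home() / ".cache" / "torch_extensions"


def find_lock_files(root):
    """List every file named 'lock' (assumed lock name) below `root`."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("lock") if p.is_file())


def remove_lock_files(root, dry_run=True):
    """Return the lock files found; delete them only when dry_run is False."""
    locks = find_lock_files(root)
    if not dry_run:
        for path in locks:
            path.unlink()
    return locks


# Hypothetical usage: inspect first, then delete.
#   print(remove_lock_files(default_extensions_root()))
#   remove_lock_files(default_extensions_root(), dry_run=False)
```

Only delete lock files when no other process is compiling an extension, since a lock held by a live build is doing its job.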