clone() syscall infinitely restarts because of SIGPROF signals #97
Comments
Interesting. Can you reproduce the issue on a reduced test case (to make sure this is exactly the problem you referred to)?
Let's set up a minimal reproduction case first, then. I will post back when I have one.
Sorry, I tried to set up a minimal reproduction case using only the JVM, but the scenario I described could not be reproduced that way. I did reproduce it by simplifying my real workload, and it indeed hangs at clone. I cannot post my workload here, but the important debugging steps can be shared: After I found the thread hanging forever, I use
It looks like the
Thank you for a great analysis! Feel free to add a README paragraph or leave it to me if you prefer.
I will send a PR for the README then.
Let me close this one. Thanks again for pointing out this issue.
… to latest sbt version (#148) * Add optional sampling interval parameter for Async profiler to avoid issues like: async-profiler/async-profiler#97 * Switch to latest sbt version
I was skimming through vmprof-python's code today, and I found this solution that they've implemented for the same issue. I suppose it is something we could add in async-profiler as well (as an opt-in feature).
When profiling a Spark application with large memory and subprocess execution (launching a subprocess from the JVM or from a native library), the whole process hung at fork forever.
After some debugging, I believe it is similar to https://bugzilla.redhat.com/show_bug.cgi?id=645528.
And the workaround is simple: increase the sampling interval to 20 ms.
You can add this to the README, or I can send a PR for it.
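To illustrate why a longer interval helps, here is a toy model in Python. It is only a sketch and not a reproduction of the kernel's behavior: it assumes that clone() needs a fixed stretch of uninterrupted time, that a SIGPROF arrives every `interval_ms`, and that an interrupted call restarts from scratch. The function name and the 15 ms clone duration are hypothetical and not measured from the workload above.

```python
def clone_completes(clone_duration_ms: float, interval_ms: float,
                    max_attempts: int = 1000) -> bool:
    """Toy model: return True if clone() can finish between two signals.

    Assumptions (not taken from the kernel source): the call needs
    `clone_duration_ms` without interruption, and every SIGPROF that
    arrives first forces a full restart of the call.
    """
    now = 0.0
    next_signal = interval_ms
    for _ in range(max_attempts):
        finish = now + clone_duration_ms
        if finish <= next_signal:
            return True  # finished before the next signal arrived
        # Interrupted: the signal is handled and the call starts over.
        now = next_signal
        next_signal += interval_ms
    return False  # never found a long enough gap: the "hang"


# Hypothetical numbers: a clone() that needs 15 ms of uninterrupted time.
print(clone_completes(15, 10))  # interval shorter than the clone duration
print(clone_completes(15, 20))  # interval longer than the clone duration
```

In this model the call finishes only when the interval is at least as long as the time clone() needs, which matches the observation that raising the interval makes the hang go away.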