[SPARK-18751][Core]Fix deadlock when SparkContext.stop is called in Utils.tryOrStopSparkContext #16178
cc @rxin
_stop()
}
override def run(): Unit = {
  SparkContext.this.stop()
Will this ever throw an exception? Should we register an UncaughtExceptionHandler or try catch with logging?
> Will this ever throw an exception? Should we register an UncaughtExceptionHandler or try catch with logging?

This happens in the driver, so we cannot use SparkUncaughtExceptionHandler to catch the error. The error will be sent to the user's UncaughtExceptionHandler if one is specified, or just printed to stderr.
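To make the two options in this thread concrete, here is a minimal sketch of the "try catch with logging" variant. It is not the code in this PR: `newStopThread`, the `stop` and `log` parameters, and the `demo` helper are hypothetical names used only for illustration. It relies on standard JVM behavior: an exception that escapes `run()` is passed to the thread's `UncaughtExceptionHandler` before the thread terminates.

```scala
import java.util.concurrent.atomic.AtomicReference

object StopThreadErrorSketch {
  // Hypothetical helper: builds (but does not start) a daemon thread that
  // runs `stop`, logs any failure, and rethrows it so that the thread's
  // UncaughtExceptionHandler still sees the error.
  def newStopThread(stop: () => Unit, log: String => Unit): Thread = {
    val t = new Thread("stop-sketch") {
      override def run(): Unit = {
        try {
          stop()
        } catch {
          case e: Throwable =>
            log(s"stop failed: ${e.getMessage}")
            throw e
        }
      }
    }
    t.setDaemon(true)
    t
  }

  // Runs a failing `stop` and returns (logged message, message seen by the handler).
  def demo(): (String, String) = {
    val logged = new AtomicReference[String]("")
    val handled = new AtomicReference[String]("")
    val t = newStopThread(
      () => throw new IllegalStateException("boom"),
      msg => logged.set(msg))
    // Stand-in for a user-supplied handler on the driver.
    t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
      override def uncaughtException(th: Thread, e: Throwable): Unit =
        handled.set(e.getMessage)
    })
    t.start()
    t.join() // the handler runs on `t` before it terminates
    (logged.get, handled.get)
  }

  def main(args: Array[String]): Unit = println(demo())
}
```

Logging and then rethrowing keeps both behaviors discussed here: the failure is recorded, and a user-installed handler is still notified.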
/**
 * Shut down the SparkContext.
 */
def stop() {
This needs a proper signature (`...: Unit = {`)
Test build #69750 has finished for PR 16178 at commit
Test build #69751 has finished for PR 16178 at commit
Test build #69821 has finished for PR 16178 at commit
LGTM
Thanks! Merging to master and 2.1.
…Utils.tryOrStopSparkContext

## What changes were proposed in this pull request?

When `SparkContext.stop` is called in `Utils.tryOrStopSparkContext` (the following three places), it will cause deadlock because the `stop` method needs to wait for the thread running `stop` to exit.

- ContextCleaner.keepCleaning
- LiveListenerBus.listenerThread.run
- TaskSchedulerImpl.start

This PR adds `SparkContext.stopInNewThread` and uses it to eliminate the potential deadlock. I also removed my changes in #15775 since they are not necessary now.

## How was this patch tested?

Jenkins

Author: Shixiong Zhu <[email protected]>

Closes #16178 from zsxwing/fix-stop-deadlock.

(cherry picked from commit 26432df)
Signed-off-by: Shixiong Zhu <[email protected]>
…Utils.tryOrStopSparkContext

## What changes were proposed in this pull request?

When `SparkContext.stop` is called in `Utils.tryOrStopSparkContext` (the following three places), it will cause deadlock because the `stop` method needs to wait for the thread running `stop` to exit.

- ContextCleaner.keepCleaning
- LiveListenerBus.listenerThread.run
- TaskSchedulerImpl.start

This PR adds `SparkContext.stopInNewThread` and uses it to eliminate the potential deadlock. I also removed my changes in apache#15775 since they are not necessary now.

## How was this patch tested?

Jenkins

Author: Shixiong Zhu <[email protected]>

Closes apache#16178 from zsxwing/fix-stop-deadlock.
## What changes were proposed in this pull request?

When `SparkContext.stop` is called in `Utils.tryOrStopSparkContext` (the following three places), it will cause a deadlock because the `stop` method needs to wait for the thread running `stop` to exit.

- ContextCleaner.keepCleaning
- LiveListenerBus.listenerThread.run
- TaskSchedulerImpl.start

This PR adds `SparkContext.stopInNewThread` and uses it to eliminate the potential deadlock. I also removed my changes in #15775 since they are not necessary now.

## How was this patch tested?

Jenkins
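
For readers unfamiliar with this kind of deadlock, the following is a rough, self-contained sketch of the pattern described above. It is not Spark code: `Worker`, `run`, and this standalone `stopInNewThread` are hypothetical stand-ins, and the join uses a 200 ms timeout (an assumption made so the example terminates) where an untimed `join` on the current thread would never return.

```scala
import java.util.concurrent.CountDownLatch

object StopDeadlockSketch {
  // Hypothetical stand-in for a component whose stop() waits for its
  // worker thread to exit.
  final class Worker {
    @volatile var joinTimedOut = false
    @volatile private var thread: Thread = _
    private val stopFinished = new CountDownLatch(1)

    // Starts the worker thread; `onError` simulates the error path that
    // ends up stopping the whole component.
    def start(onError: Worker => Unit): Unit = {
      thread = new Thread(new Runnable {
        override def run(): Unit = onError(Worker.this)
      }, "worker")
      thread.setDaemon(true)
      thread.start()
    }

    // Waits (with a timeout, for this sketch only) for the worker to exit
    // and records whether it was still alive afterwards.
    def stop(): Unit = {
      thread.join(200)
      joinTimedOut = thread.isAlive
      stopFinished.countDown()
    }

    def awaitStopFinished(): Unit = stopFinished.await()
  }

  // Sketch of the fix: run stop() on a separate daemon thread so the
  // worker thread is free to exit.
  def stopInNewThread(w: Worker): Unit = {
    val t = new Thread(new Runnable {
      override def run(): Unit = w.stop()
    }, "stop-in-new-thread")
    t.setDaemon(true)
    t.start()
  }

  // Returns true if stop() was still waiting on a live worker at timeout.
  def run(onError: Worker => Unit): Boolean = {
    val w = new Worker
    w.start(onError)
    w.awaitStopFinished()
    w.joinTimedOut
  }

  def main(args: Array[String]): Unit = {
    println(s"stop() on the worker thread timed out: ${run(_.stop())}")
    println(s"stop() in a new thread timed out: ${run(stopInNewThread)}")
  }
}
```

In the first call the worker thread joins itself, so the wait can only end by timing out. In the second, the worker returns right after handing `stop()` to another thread, so the join should complete promptly.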