[SPARK-6014] [core] Revamp Spark shutdown hooks, fix shutdown races. #5560
Conversation
This change adds some new utility code to handle shutdown hooks in Spark. The main goal is to take advantage of Hadoop 2.x's API for shutdown hooks, which allows Spark to register a hook that will run before the one that cleans up HDFS clients, and thus avoids some races that would cause exceptions to show up and other issues such as failure to properly close event logs. Unfortunately, Hadoop 1.x does not have such APIs, so in that case correctness is still left to chance.
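As context for readers of this thread, here is a minimal sketch of the kind of registration the Hadoop 2.x API makes possible. The wrapper object name and the priority offset are illustrative, not taken from the PR; only `ShutdownHookManager`, its `addShutdownHook(Runnable, int)` method, and `FileSystem.SHUTDOWN_HOOK_PRIORITY` are Hadoop's actual public API:

```scala
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.util.ShutdownHookManager

// Hypothetical wrapper, for illustration only.
object EventLogCleanupHook {
  def install(): Unit = {
    // Registering above FileSystem.SHUTDOWN_HOOK_PRIORITY means this hook
    // runs before the one that closes cached HDFS FileSystem instances.
    // The "+ 30" offset is illustrative, not necessarily the PR's value.
    val priority = FileSystem.SHUTDOWN_HOOK_PRIORITY + 30
    ShutdownHookManager.get().addShutdownHook(new Runnable {
      override def run(): Unit = {
        // e.g. flush and close the event log while HDFS clients are still open
      }
    }, priority)
  }
}
```

On Hadoop 1.x there is no such ordered manager, so a hook can only be registered with the plain JVM mechanism and has no defined ordering relative to Hadoop's own cleanup, which is why correctness there is left to chance.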
Test build #30494 has finished for PR 5560 at commit
/**
 * Adds a shutdown hook with default priority.
 */
def addShutdownHook(hook: () => Unit): AnyRef = {
It looked weird to return AnyRef, but I suppose that while callers need a handle to the hook, they don't need to be promised anything about what it is.
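In that spirit, a minimal sketch (hypothetical names, not the PR's actual code) of why an opaque handle suffices: the caller only ever hands it back to a removal method, so nothing about its concrete type needs to be exposed.

```scala
import scala.collection.mutable

// Hypothetical hook table, for illustration only.
object HookTable {
  private case class Entry(priority: Int, hook: () => Unit)
  private val entries = mutable.LinkedHashSet.empty[Entry]

  def addShutdownHook(priority: Int)(hook: () => Unit): AnyRef = {
    val e = Entry(priority, hook)
    entries.synchronized { entries += e }
    e // opaque to the caller; nothing about its type is promised
  }

  def removeShutdownHook(ref: AnyRef): Boolean = ref match {
    case e: Entry => entries.synchronized { entries.remove(e) }
    case _        => false
  }

  // Invoked from a single JVM shutdown hook: run highest priority first.
  def runAll(): Unit =
    entries.synchronized { entries.toSeq }.sortBy(-_.priority).foreach { e =>
      try e.hook() catch { case t: Throwable => t.printStackTrace() }
    }
}
```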
Yep, I like this. It cleans up a kind of gnarly aspect and centralizes handling cleanly. It integrates with a similar mechanism in Hadoop if it exists, and that in turn fixes some actual small problems we encounter at shutdown. Needs a rebase though.
Conflicts: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
Test build #30598 has finished for PR 5560 at commit
This change adds some new utility code to handle shutdown hooks in Spark. The main goal is to take advantage of Hadoop 2.x's API for shutdown hooks, which allows Spark to register a hook that will run before the one that cleans up HDFS clients, and thus avoids some races that would cause exceptions to show up and other issues such as failure to properly close event logs. Unfortunately, Hadoop 1.x does not have such APIs, so in that case correctness is still left to chance.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes apache#5560 from vanzin/SPARK-6014 and squashes the following commits:

edfafb1 [Marcelo Vanzin] Better scaladoc.
fcaeedd [Marcelo Vanzin] Merge branch 'master' into SPARK-6014
e7039dc [Marcelo Vanzin] [SPARK-6014] [core] Revamp Spark shutdown hooks, fix shutdown races.