[SPARK-7504] [YARN] NullPointerException when initializing SparkContext in YARN-cluster mode #6083
Changes from all commits
9f287c5
39e4fa3
4924e01
ea2a5fe
7c89b6e
926bd96
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -448,6 +448,7 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging { | |
} | ||
} | ||
} | ||
|
||
} | ||
|
||
/** | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -371,6 +371,14 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli | |
throw new SparkException("An application name must be set in your configuration") | ||
} | ||
|
||
// System property spark.yarn.app.id must be set if user code ran by AM on a YARN cluster | ||
// yarn-standalone is deprecated, but still supported | ||
if ((master == "yarn-cluster" || master == "yarn-standalone") && | ||
!_conf.contains("spark.yarn.app.id")) { | ||
Interesting related question came up just a moment ago; is this supposed to be Given the discussion, this seems like a reasonable change.
AM writes
yea, the suggestion was though that this should also be
isn't checking for
This is an early and clean solution, rather than checking for the master. YARN cluster mode the SparkContext is launched as a separate user thread at container 0 and joined later. @srowen If you resolve it, I will remove
It's necessary because that's what differentiates "user instantiating SparkContext in yarn-cluster mode" and "ApplicationMaster running user code that instantiates SparkContext in yarn-cluster mode". If you remove that check, yarn-cluster mode will always fail this check. See this code in ApplicationMaster:
I see... that's too bad. Maybe we should add a short comment to explain this cause it's not clear why we need it.
What comment would you add?
It's OK, I'll add it myself when I merge this.
Oh never mind, I just noticed there's already a comment that explains this, though I was sure it wasn't there before... anyway, I merged this as is. |
||
throw new SparkException("Detected yarn-cluster mode, but isn't running on a cluster. " + | ||
"Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.") | ||
} | ||
|
||
if (_conf.getBoolean("spark.logConf", false)) { | ||
logInfo("Spark configuration:\n" + _conf.toDebugString) | ||
} | ||
|
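For readers following the thread, here is a minimal, dependency-free sketch of the check this hunk adds. It is an illustration only, not the merged code: `SketchException`, `YarnClusterCheck`, and `validate` are hypothetical names, and a plain `Map[String, String]` stands in for `SparkConf`. The idea is that a cluster-mode master is only accepted when `spark.yarn.app.id` is already present, which, per the discussion above, is expected to be the case only when the ApplicationMaster launched the user code.

```scala
// Hypothetical stand-in for SparkException so the sketch needs no Spark dependency.
class SketchException(message: String) extends Exception(message)

object YarnClusterCheck {
  // Masters that imply the driver runs inside the YARN ApplicationMaster.
  // yarn-standalone is deprecated, but still treated as cluster mode.
  private val clusterMasters = Set("yarn-cluster", "yarn-standalone")

  /** Throws if a cluster-mode master is requested without spark.yarn.app.id set. */
  def validate(master: String, conf: Map[String, String]): Unit = {
    if (clusterMasters.contains(master) && !conf.contains("spark.yarn.app.id")) {
      throw new SketchException(
        "Detected yarn-cluster mode, but isn't running on a cluster. " +
          "Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.")
    }
  }
}
```

Under these assumptions, `YarnClusterCheck.validate("yarn-cluster", Map.empty)` throws, while the same call with a map containing `spark.yarn.app.id` (any placeholder value) returns normally, as does any non-YARN-cluster master such as `local[2]`.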
I can probably remove this on merge, but there's a stray blank here.
From what I've seen, there is usually a blank line before the end of long methods and classes. Feel free to remove it if you like.