
[SPARK-7504] [YARN] NullPointerException when initializing SparkContext in YARN-cluster mode #6083

Closed · wants to merge 6 commits
1 change: 1 addition & 0 deletions core/src/main/scala/org/apache/spark/SparkConf.scala
@@ -448,6 +448,7 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging {
}
}
}

Member

I can probably remove this on merge, but there's a stray blank here.

Contributor Author

As far as I've seen, there is usually a blank line before the end of long methods and classes. Feel free to remove it if you prefer.

}

/**
8 changes: 8 additions & 0 deletions core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -371,6 +371,14 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationClient {
throw new SparkException("An application name must be set in your configuration")
}

    // The system property spark.yarn.app.id must be set if user code is run by the AM on a YARN cluster.
    // yarn-standalone is deprecated, but still supported.
    if ((master == "yarn-cluster" || master == "yarn-standalone") &&
        !_conf.contains("spark.yarn.app.id")) {

Member

An interesting related question came up just a moment ago: is this supposed to be spark.yarn.app.id? Someone just suggested that it should always be spark.app.id. Thought I'd ask if you know.

Given the discussion, this seems like a reasonable change.

Contributor Author

The AM writes spark.yarn.app.id to system properties as well; spark.app.id is something different.
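
For readers following the thread, here is a minimal sketch of the AM side, paraphrased rather than quoted from ApplicationMaster.scala (the helper name and the ApplicationAttemptId parameter are assumptions for illustration, not the actual source):

    import org.apache.hadoop.yarn.api.records.ApplicationAttemptId

    // Hypothetical helper mirroring what the ApplicationMaster does before
    // running user code: it publishes the requested mode and the YARN
    // application id as system properties, so a SparkConf constructed inside
    // the AM picks both up.
    def publishYarnClusterProperties(appAttemptId: ApplicationAttemptId): Unit = {
      System.setProperty("spark.master", "yarn-cluster")
      System.setProperty("spark.yarn.app.id", appAttemptId.getApplicationId.toString)
    }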

Member

Yeah, though the suggestion was that this should also be spark.app.id, since these two usages of spark.yarn.app.id are the only ones in the code base: #6120. It might be worth resolving that first, since it will matter to this change.

Contributor

Isn't checking for the master enough? It seems that checking for the app ID is not necessary here.

Contributor Author

This is an early and clean check, rather than only checking the master. In YARN cluster mode, the SparkContext is launched in a separate user thread in container 0 and joined later.

@srowen If you resolve it, I will remove yarn from the configuration parameter name, but you can also do it afterwards, just to keep the pull request moving.
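
To make the "separate user thread" point concrete, here is a minimal sketch paraphrasing what the ApplicationMaster does (method and thread names are assumptions, not the exact source):

    // Sketch: in yarn-cluster mode the AM loads the user's main class via
    // reflection and runs it in its own thread, joining it later; the
    // SparkContext is constructed inside that thread, in container 0.
    def startUserApplication(userClassName: String, args: Array[String]): Thread = {
      val mainMethod = Class.forName(userClassName).getMethod("main", classOf[Array[String]])
      val userThread = new Thread {
        override def run(): Unit = mainMethod.invoke(null, args)
      }
      userThread.setName("Driver")
      userThread.start()
      userThread
    }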

Contributor

It's necessary, because that's what differentiates "user instantiating SparkContext in yarn-cluster mode" from "ApplicationMaster running user code that instantiates SparkContext in yarn-cluster mode". If you remove the app ID check, yarn-cluster mode will always fail here. See this code in ApplicationMaster:

    // Set the master property to match the requested mode.
    System.setProperty("spark.master", "yarn-cluster")
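
Putting the two signals together, the guard only trips when the master says yarn-cluster but the AM-provided app id is absent. A hypothetical standalone illustration of that logic (not code from this patch):

    import org.apache.spark.SparkConf

    // Hypothetical helper illustrating the guard's logic: inside the AM,
    // spark.yarn.app.id is present as a system property, so only a
    // SparkContext constructed directly by the user trips this check.
    def isMisusedYarnClusterMode(conf: SparkConf): Boolean = {
      val master = conf.get("spark.master", "")
      val isYarnCluster = master == "yarn-cluster" || master == "yarn-standalone"
      isYarnCluster && !conf.contains("spark.yarn.app.id")
    }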

Contributor

I see... that's too bad. Maybe we should add a short comment to explain this, because it's not clear why we need it.

Contributor Author

What comment would you add?

Contributor

It's OK, I'll add it myself when I merge this.

Contributor

Oh never mind, I just noticed there's already a comment that explains this, though I was sure it wasn't there before... anyway, I merged this as is.

throw new SparkException("Detected yarn-cluster mode, but isn't running on a cluster. " +
"Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.")
}

    if (_conf.getBoolean("spark.logConf", false)) {
      logInfo("Spark configuration:\n" + _conf.toDebugString)
    }
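
Taken together, the change turns a confusing NullPointerException into an early, actionable error. A sketch of the misuse this now catches (a hypothetical user program, assuming Spark on the classpath; the exception message comes from the diff above):

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical misuse: constructing a SparkContext directly with a
    // yarn-cluster master instead of going through spark-submit. No AM has set
    // spark.yarn.app.id, so SparkContext construction now fails fast with the
    // SparkException above rather than a NullPointerException later on.
    object DirectYarnClusterMisuse {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("example").setMaster("yarn-cluster")
        new SparkContext(conf) // throws SparkException: "Detected yarn-cluster mode..."
      }
    }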