[LIVY-702]: Submit Spark apps to Kubernetes #249
base: master
Changes from all commits
758acb0
0f256ee
20638be
a89064c
d45cdbf
8f8be88
9a458b7
6a37bd3
7954481
52b45e1
06f14db
e6bc1a2
5c06992
769ec23
3f76512
b87c0ce
421a8f8
@@ -169,6 +169,13 @@ private void initializeServer() throws Exception {
   // on the cluster, it would be tricky to solve that problem in a generic way.
   livyConf.set(RPC_SERVER_ADDRESS, null);
 
+  // If we are running on Kubernetes, set RPC_SERVER_ADDRESS from "spark.driver.host" option,
+  // which is set in class org.apache.spark.deploy.k8s.features.DriverServiceFeatureStep:
+  // line 61: val driverHostname = s"$resolvedServiceName.${kubernetesConf.namespace()}.svc"
+  if (livyConf.isRunningOnKubernetes()) {
+    livyConf.set(RPC_SERVER_ADDRESS, conf.get("spark.driver.host"));
+  }
+
For some reason this version does not work for me like it did with the same piece from #167. I got the same exception as here: #167 (comment). I'm not sure why, but it seems like it's because the

But I'm not using code from your branch, rather backporting your patch to our Livy build, which is based on an older Livy version, so maybe that's the cause. On the other hand, during a quick lookup I have not found any code that bypasses

Might be related to the backporting indeed. Will be happy to help once you get more debug info.

I've tried to use the latest Livy with your patches and the issue has not appeared. So it seems like I got that problem because I've backported something wrong.
   if (livyConf.getBoolean(TEST_STUCK_START_DRIVER)) {
     // Test flag is turned on so we will just infinite loop here. It should cause
     // timeout and we should still see yarn application being cleaned up.
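As a side note, the hostname shape quoted in the added comment (from Spark's DriverServiceFeatureStep) can be reproduced with a small sketch. The object and sample values below are hypothetical and only illustrate the `<service>.<namespace>.svc` pattern; they are not part of Livy or Spark.

// Illustrative sketch only: mirrors the hostname shape quoted above from
// Spark's DriverServiceFeatureStep; object name and sample values are hypothetical.
object DriverHostnameSketch {
  def driverHostname(resolvedServiceName: String, namespace: String): String =
    s"$resolvedServiceName.$namespace.svc"

  def main(args: Array[String]): Unit = {
    // A driver service name and namespace chosen purely as an example.
    println(driverHostname("livy-session-0-driver-svc", "default"))
    // prints: livy-session-0-driver-svc.default.svc
  }
}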
@@ -458,7 +458,9 @@ class InteractiveSession(
     val driverProcess = client.flatMap { c => Option(c.getDriverProcess) }
       .map(new LineBufferedProcess(_, livyConf.getInt(LivyConf.SPARK_LOGS_SIZE)))
 
-    if (livyConf.isRunningOnYarn() || driverProcess.isDefined) {
+    if (livyConf.isRunningOnYarn() || driverProcess.isDefined
+      // Create SparkKubernetesApp anyway to recover app monitoring on Livy server restart
+      || livyConf.isRunningOnKubernetes()) {
       Some(SparkApp.create(appTag, appId, driverProcess, livyConf, Some(this)))
If this is just the same line as 404, why is it in its own

Ahh, nice catch, agree. EDIT: resolved
     } else {
       None
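To make the intent of the new condition concrete, here is a minimal sketch of the recovery scenario it enables. The object, case class, and parameter names are hypothetical, not Livy's API; the point is only why a handle is created even when no local driver process exists (e.g. after a Livy server restart).

// Hypothetical sketch, not Livy's API: illustrates why the condition above also
// creates an app handle on Kubernetes when driverProcess is None (server restart).
object RecoverySketch {
  final case class AppHandle(appTag: String, viaDriverProcess: Boolean)

  def createAppHandle(runningOnYarn: Boolean,
                      runningOnKubernetes: Boolean,
                      driverProcess: Option[Process],
                      appTag: String): Option[AppHandle] = {
    if (runningOnYarn || driverProcess.isDefined || runningOnKubernetes) {
      // On Kubernetes the app can be located in the cluster by its tag
      // (e.g. a label on the driver pod), so no local process handle is needed.
      Some(AppHandle(appTag, viaDriverProcess = driverProcess.isDefined))
    } else {
      None
    }
  }
}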
@@ -540,6 +542,8 @@ class InteractiveSession(
       transition(SessionState.ShuttingDown)
       sessionStore.remove(RECOVERY_SESSION_TYPE, id)
       client.foreach { _.stop(true) }
+      // We need to call #kill here explicitly to delete Interactive pods from the cluster
+      if (livyConf.isRunningOnKubernetes()) app.foreach(_.kill())
     } catch {
       case _: Exception =>
         app.foreach {
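As a rough illustration of what "delete Interactive pods from the cluster" can involve, the sketch below removes driver pods by label with the fabric8 Kubernetes client. This is an assumption for illustration only: the PR's actual kill path may use a different client, label, or deletion strategy, and the "spark-app-tag" label name is assumed.

import io.fabric8.kubernetes.client.KubernetesClientBuilder

// Hypothetical sketch, not the PR's implementation: delete the driver pod(s)
// carrying an assumed "spark-app-tag" label, using the fabric8 client.
object KillSketch {
  def deleteDriverPods(namespace: String, appTag: String): Unit = {
    val client = new KubernetesClientBuilder().build()
    try {
      client.pods()
        .inNamespace(namespace)
        .withLabel("spark-app-tag", appTag) // assumed label name
        .delete()
    } finally {
      client.close()
    }
  }
}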
This function may always return false, since "livy.spark.master" will not be picked up by RSCConf.

Yeah, seems like that's what I've faced in #249 (comment). However, it seems like it's not affecting functionality, as this function is used while setting RPC_SERVER_ADDRESS here: https://github.com/apache/incubator-livy/pull/249/files/b87c0cebb65ce7f34e6b4b6b738095be6254cf69#diff-43114318c4b009c2404f7eb326a84c184fb1501a3237c49a771df851d0f6f328R172-R178
And the value of RPC_SERVER_ADDRESS is not used anyway since Livy 0.7 because of the things I've explained in #388.
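For readers following this thread, a minimal sketch of the kind of check being discussed (the object and method names and the Option-based lookup are assumptions, not Livy's exact code): Spark addresses Kubernetes clusters with master URLs of the form "k8s://...", so if the master value is absent from the configuration the check reads, the prefix test never matches and the method always returns false.

// Hypothetical sketch, not Livy's exact code: shows why the check returns false
// whenever the Spark master value is absent from the configuration it reads.
object MasterCheckSketch {
  def isRunningOnKubernetes(master: Option[String]): Boolean =
    master.exists(_.startsWith("k8s://"))

  def main(args: Array[String]): Unit = {
    println(isRunningOnKubernetes(Some("k8s://https://kubernetes.default.svc"))) // true
    println(isRunningOnKubernetes(None)) // false: master value not visible to the conf being read
  }
}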