[SPARK-21694][MESOS] Support Mesos CNI network labels #18910
Conversation
Test build #80498 has finished for PR 18910 at commit
```
@@ -21,6 +21,7 @@
import org.apache.mesos.Protos.{ContainerInfo, Image, NetworkInfo, Parameter, Vo…
import org.apache.mesos.Protos.ContainerInfo.{DockerInfo, MesosInfo}

import org.apache.spark.{SparkConf, SparkException}
import org.apache.spark.deploy.mesos.config._
```
Might be better to avoid wildcard import.
+1, this makes it more consistent with MesosClusterScheduler
```
@@ -162,7 +163,11 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
}

conf.getOption("spark.mesos.network.name").map { name =>
  val info = NetworkInfo.newBuilder().setName(name).build()
  val networkLabels = MesosProtoUtils.mesosLabels(conf.get(NETWORK_LABELS).getOrElse(""))
```
Not part of this PR, but shouldn't `spark.mesos.network.name` have a config name like the labels have (`NETWORK_LABELS`)? At some point I think all Mesos-specific config options should be placed in the config file so they can be easily managed and referenced.
+1 for making the `NETWORK_LABELS` alias.
@susanxhuynh Have you tested how Spark jobs behave with a real CNI network? Is this PR enough? Just curious...
Thanks @susanxhuynh. Just a question about your doc string and testing.
```scala
ConfigBuilder("spark.mesos.network.labels")
  .doc("Network labels to pass to CNI plugins. This is a comma-separated list " +
    "of key-value pairs, where each key-value pair has the format key:value. " +
    "Example: key1=val1,key2=val2")
```
I thought we use `:` to separate the key-value pairs? So it would say: `Example: key1:val1,key2:val2`
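For reference, a minimal sketch of parsing the comma-separated `key:value` format the reviewer describes. This is a hypothetical helper, not the real `MesosProtoUtils.mesosLabels` implementation, which builds Mesos protobuf `Labels` rather than plain tuples:

```scala
// Hypothetical sketch: parse "key1:val1,key2:val2" into key-value pairs.
// Malformed entries (no single ':' separator) are rejected with an error,
// which is one reasonable choice; the real helper may behave differently.
def parseLabels(spec: String): Seq[(String, String)] =
  if (spec.isEmpty) Seq.empty
  else spec.split(",").toSeq.map { pair =>
    pair.split(":") match {
      case Array(k, v) => (k, v)
      case _ => throw new IllegalArgumentException(
        s"Malformed label '$pair'; expected key:value")
    }
  }
```

With this sketch, `parseLabels("key1:val1,key2:val2")` yields the pairs `("key1", "val1")` and `("key2", "val2")`, matching the assertions in the unit test below.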
```scala
assert(networkInfos.get(0).getLabels.getLabels(0).getKey == "key1")
assert(networkInfos.get(0).getLabels.getLabels(0).getValue == "val1")
assert(networkInfos.get(0).getLabels.getLabels(1).getKey == "key2")
assert(networkInfos.get(0).getLabels.getLabels(1).getValue == "val2")
```
Is there a way to test that the driver's worker tasks also get the labels? It's important that all of the `ContainerInfo`s associated with a job (not just the driver scheduler) get the labels for CNI to work.
@ArtRand I believe this is testing the worker tasks: (L.572) the `setBackend()` method creates a MesosCoarseGrainedSchedulerBackend, L.577-579 create the requirements for an executor, and (L.580) `backend.resourceOffers()` looks for an offer that matches the requirements and launches an executor task.
There is a separate unit test above, in MesosClusterSchedulerSuite.scala, which checks for the labels in the driver itself.
…'NETWORK_NAME' to config object.
Test build #80555 has finished for PR 18910 at commit
@skonto @ArtRand Thanks for the feedback. I have fixed the documentation and added NETWORK_NAME to the config object. Please let me know what you think. @skonto I have not tested this particular change on a real CNI network. I think the only difference is that a Spark job runs on a different network, but there's no change in the Spark functionality.
Test build #80556 has finished for PR 18910 at commit
LGTM.
@vanzin Would you mind helping review this PR? Thanks.
@srowen Sean, would you like to review this PR? Thanks.
```
@@ -70,4 +70,19 @@ package object config {
      "during a temporary disconnection, before tearing down all the executors.")
    .doubleConf
    .createWithDefault(0.0)

  private [spark] val NETWORK_NAME =
```
Tiny nit, could you remove the space after `private`? That would make it consistent with the rest of the code.
I trust your judgment as to whether this is the right functional change for Mesos and the details and style look OK to me otherwise.
Test build #81051 has finished for PR 18910 at commit
@srowen Thanks for reviewing. I've removed those spaces; please take a look.
Merged to master |
Author: Susan X. Huynh <xhuynh@mesosphere.com>
Closes apache#18910 from susanxhuynh/sh-mesos-cni-labels.
JIRA ticket: https://issues.apache.org/jira/browse/SPARK-21694

What changes were proposed in this pull request?

Spark already supports launching containers attached to a given CNI network by specifying it via the config `spark.mesos.network.name`. This PR adds support for passing network labels to CNI plugins via a new config option, `spark.mesos.network.labels`. These network labels are key-value pairs that are set in the `NetworkInfo` of both the driver and executor tasks. More details in the related Mesos documentation: http://mesos.apache.org/documentation/latest/cni/#mesos-meta-data-to-cni-plugins

How was this patch tested?

Unit tests, for both driver and executor tasks. Manual integration test: submit a job with the `spark.mesos.network.labels` option, hit the mesos/state.json endpoint, and check that the labels are set in the driver and executor tasks.

@ArtRand @skonto