[SPARK-21694][MESOS] Support Mesos CNI network labels #18910

Status: Closed. Wants to merge 4 commits (showing changes from 3 commits).
14 changes: 14 additions & 0 deletions docs/running-on-mesos.md
@@ -537,6 +537,20 @@ See the [configuration page](configuration.html) for information on Spark config
for more details.
</td>
</tr>
<tr>
<td><code>spark.mesos.network.labels</code></td>
<td><code>(none)</code></td>
<td>
Pass network labels to CNI plugins. This is a comma-separated list
of key-value pairs, where each key-value pair has the format key:value.
Example:

<pre>key1:val1,key2:val2</pre>
See
<a href="http://mesos.apache.org/documentation/latest/cni/#mesos-meta-data-to-cni-plugins">the Mesos CNI docs</a>
for more details.
</td>
</tr>
<tr>
<td><code>spark.mesos.fetcherCache.enable</code></td>
<td><code>false</code></td>
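For reference, a minimal submit-time sketch of how the documented properties fit together (the master URL, network name, and application jar below are hypothetical placeholders, not values from this PR):

```shell
# Attach executors (and, in cluster mode, the driver) to a named CNI
# network, and pass two labels through to the CNI plugins.
# All addresses, names, and paths here are illustrative placeholders.
spark-submit \
  --master mesos://mesos-master.example.com:5050 \
  --deploy-mode cluster \
  --conf spark.mesos.network.name=my-cni-network \
  --conf spark.mesos.network.labels=key1:val1,key2:val2 \
  /path/to/app.jar
```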
@@ -70,4 +70,19 @@ package object config {
"during a temporary disconnection, before tearing down all the executors.")
.doubleConf
.createWithDefault(0.0)

private [spark] val NETWORK_NAME =
Review comment (Member): Tiny nit: could you remove the space after private? That would make it consistent with the rest of the code. I trust your judgment as to whether this is the right functional change for Mesos, and the details and style look OK to me otherwise.

ConfigBuilder("spark.mesos.network.name")
.doc("Attach containers to the given named network. If this job is launched " +
"in cluster mode, also launch the driver in the given named network.")
.stringConf
.createOptional

private [spark] val NETWORK_LABELS =
ConfigBuilder("spark.mesos.network.labels")
.doc("Network labels to pass to CNI plugins. This is a comma-separated list " +
"of key-value pairs, where each key-value pair has the format key:value. " +
"Example: key1:val1,key2:val2")
.stringConf
.createOptional
}
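For illustration, the comma-separated key:value format accepted by spark.mesos.network.labels could be parsed along these lines (a standalone sketch, not the actual MesosProtoUtils.mesosLabels implementation; the hypothetical LabelParser helper assumes keys and values contain no escaped `:` or `,` characters):

```scala
// Sketch: parse "key1:val1,key2:val2" into key/value pairs,
// rejecting malformed entries. Illustrative helper only.
object LabelParser {
  def parse(labels: String): Seq[(String, String)] =
    labels.split(",").toSeq.filter(_.nonEmpty).map { kv =>
      kv.split(":") match {
        case Array(k, v) => (k, v)
        case _ =>
          throw new IllegalArgumentException(
            s"Malformed label '$kv'; expected key:value")
      }
    }
}
```

The real code path in the PR goes further: it builds a protobuf Labels message that is attached to the NetworkInfo, so this sketch only shows the shape of the string handling.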
@@ -668,7 +668,7 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
}

private def executorHostname(offer: Offer): String = {
if (sc.conf.getOption("spark.mesos.network.name").isDefined) {
if (sc.conf.get(NETWORK_NAME).isDefined) {
// The agent's IP is not visible in a CNI container, so we bind to 0.0.0.0
"0.0.0.0"
} else {
@@ -21,6 +21,7 @@ import org.apache.mesos.Protos.{ContainerInfo, Image, NetworkInfo, Parameter, Vo
import org.apache.mesos.Protos.ContainerInfo.{DockerInfo, MesosInfo}

import org.apache.spark.{SparkConf, SparkException}
import org.apache.spark.deploy.mesos.config.{NETWORK_LABELS, NETWORK_NAME}
import org.apache.spark.internal.Logging

/**
@@ -161,8 +162,12 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
volumes.foreach(_.foreach(containerInfo.addVolumes(_)))
}

conf.getOption("spark.mesos.network.name").map { name =>
val info = NetworkInfo.newBuilder().setName(name).build()
conf.get(NETWORK_NAME).map { name =>
val networkLabels = MesosProtoUtils.mesosLabels(conf.get(NETWORK_LABELS).getOrElse(""))
Review comment (Contributor, @skonto, Aug 10, 2017): Not part of this PR, but shouldn't spark.mesos.network.name have a config constant like the labels have (NETWORK_LABELS)? At some point I think all Mesos-specific config options should be placed in the config file to be easily managed and referenced.

Review comment: +1 for making the NETWORK_LABELS alias.

val info = NetworkInfo.newBuilder()
.setName(name)
.setLabels(networkLabels)
.build()
containerInfo.addNetworkInfos(info)
}

@@ -222,7 +222,7 @@ class MesosClusterSchedulerSuite extends SparkFunSuite with LocalSparkContext wi
assert(env.getOrElse("TEST_ENV", null) == "TEST_VAL")
}

test("supports spark.mesos.network.name") {
test("supports spark.mesos.network.name and spark.mesos.network.labels") {
setScheduler()

val mem = 1000
@@ -233,7 +233,8 @@ class MesosClusterSchedulerSuite extends SparkFunSuite with LocalSparkContext wi
command,
Map("spark.mesos.executor.home" -> "test",
"spark.app.name" -> "test",
"spark.mesos.network.name" -> "test-network-name"),
"spark.mesos.network.name" -> "test-network-name",
"spark.mesos.network.labels" -> "key1:val1,key2:val2"),
"s1",
new Date()))

@@ -246,6 +247,10 @@ class MesosClusterSchedulerSuite extends SparkFunSuite with LocalSparkContext wi
val networkInfos = launchedTasks.head.getContainer.getNetworkInfosList
assert(networkInfos.size == 1)
assert(networkInfos.get(0).getName == "test-network-name")
assert(networkInfos.get(0).getLabels.getLabels(0).getKey == "key1")
assert(networkInfos.get(0).getLabels.getLabels(0).getValue == "val1")
assert(networkInfos.get(0).getLabels.getLabels(1).getKey == "key2")
assert(networkInfos.get(0).getLabels.getLabels(1).getValue == "val2")
}

test("supports spark.mesos.driver.labels") {
@@ -568,9 +568,10 @@ class MesosCoarseGrainedSchedulerBackendSuite extends SparkFunSuite
assert(launchedTasks.head.getLabels.equals(taskLabels))
}

test("mesos supports spark.mesos.network.name") {
test("mesos supports spark.mesos.network.name and spark.mesos.network.labels") {
setBackend(Map(
"spark.mesos.network.name" -> "test-network-name"
"spark.mesos.network.name" -> "test-network-name",
"spark.mesos.network.labels" -> "key1:val1,key2:val2"
))

val (mem, cpu) = (backend.executorMemory(sc), 4)
@@ -582,6 +583,10 @@ class MesosCoarseGrainedSchedulerBackendSuite extends SparkFunSuite
val networkInfos = launchedTasks.head.getContainer.getNetworkInfosList
assert(networkInfos.size == 1)
assert(networkInfos.get(0).getName == "test-network-name")
assert(networkInfos.get(0).getLabels.getLabels(0).getKey == "key1")
assert(networkInfos.get(0).getLabels.getLabels(0).getValue == "val1")
assert(networkInfos.get(0).getLabels.getLabels(1).getKey == "key2")
assert(networkInfos.get(0).getLabels.getLabels(1).getValue == "val2")
Review comment: Is there a way to test that the driver's worker tasks also get the labels? It's important that all of the ContainerInfos associated with a job (not just the driver scheduler) get the labels for CNI to work.

Review comment (Contributor Author): @ArtRand I believe this is testing the worker tasks: the setBackend() method (L.572) creates a MesosCoarseGrainedSchedulerBackend, L.577-579 create the requirements for an Executor, and backend.resourceOffers() (L.580) looks for an offer that matches the requirements and launches an Executor task. There is a separate unit test above, in MesosClusterSchedulerSuite.scala, which checks for the labels in the driver itself.

}

test("supports spark.scheduler.minRegisteredResourcesRatio") {