This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Bumping up kubernetes-client version to fix GKE and local proxy #105

Merged: 3 commits, Feb 10, 2017

30 changes: 29 additions & 1 deletion docs/running-on-kubernetes.md
@@ -51,7 +51,7 @@ connect without SSL on a different port, the master would be set to `k8s://http:

Note that applications can currently only be executed in cluster mode, where the driver and its executors are running on
the cluster.

### Adding Other JARs

Spark allows users to provide dependencies that are bundled into the driver's Docker image, or that are on the local
@@ -150,6 +150,34 @@ or `container:`. A scheme of `file:` corresponds to the keyStore being located on
the driver container as a [secret volume](https://kubernetes.io/docs/user-guide/secrets/). When the URI has the scheme
`container:`, the file is assumed to already be on the container's disk at the appropriate path.
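
For example (illustrative URIs only; the configuration property that carries them is omitted here):

    file:/path/to/keyStore.jks         # read from the submitting machine's disk, mounted into the driver as a secret volume
    container:/opt/spark/keyStore.jks  # assumed to already be on the driver container's disk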

### Kubernetes Clusters and the Authenticated Proxy Endpoint

`spark-submit` also supports submission through the
[local kubectl proxy](https://kubernetes.io/docs/user-guide/connecting-to-applications-proxy/). One can use the
authenticating proxy to communicate with the API server without passing credentials to `spark-submit`.

The local proxy can be started by running:

    kubectl proxy
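
By default `kubectl proxy` listens on 127.0.0.1:8001; an explicit port can be chosen with the `--port` flag:

    kubectl proxy --port=8001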

If the local proxy is listening on port 8001, the submission would look like the following:

    bin/spark-submit \
      --deploy-mode cluster \
      --class org.apache.spark.examples.SparkPi \
      --master k8s://http://127.0.0.1:8001 \
      --kubernetes-namespace default \
      --conf spark.executor.instances=5 \
      --conf spark.app.name=spark-pi \
      --conf spark.kubernetes.driver.docker.image=registry-host:5000/spark-driver:latest \
      --conf spark.kubernetes.executor.docker.image=registry-host:5000/spark-executor:latest \
      examples/jars/spark-examples_2.11-2.2.0.jar

Communication between Spark and the Kubernetes cluster is performed using the fabric8 kubernetes-client library.
The `kubectl proxy` mechanism above is useful when the cluster uses an authentication provider that the fabric8
kubernetes-client library does not support; the library currently supports authentication only via X509 client
certificates and OAuth tokens.
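
For reference, a minimal sketch of how those two credential types map onto the fabric8 client's `ConfigBuilder` (the builder methods are part of the fabric8 API; the endpoint, file paths, and environment variable below are placeholders):

    import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient}

    // X509 client certificates: a CA bundle plus a client cert/key pair.
    val certConfig = new ConfigBuilder()
      .withMasterUrl("https://127.0.0.1:8443")  // placeholder API server endpoint
      .withCaCertFile("/path/to/ca.crt")
      .withClientCertFile("/path/to/client.crt")
      .withClientKeyFile("/path/to/client.key")
      .build()

    // OAuth token: a bearer token is attached to every request instead.
    val tokenConfig = new ConfigBuilder()
      .withMasterUrl("https://127.0.0.1:8443")
      .withOauthToken(sys.env("K8S_TOKEN"))     // placeholder token source
      .build()

    val client = new DefaultKubernetesClient(certConfig)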

### Spark Properties

Below are some other common properties that are specific to Kubernetes. Most of the other configurations are the same
2 changes: 1 addition & 1 deletion resource-managers/kubernetes/core/pom.xml
@@ -29,7 +29,7 @@
<name>Spark Project Kubernetes</name>
<properties>
<sbt.project.name>kubernetes</sbt.project.name>
-<kubernetes.client.version>1.4.34</kubernetes.client.version>
+<kubernetes.client.version>2.0.3</kubernetes.client.version>
Reviewer: this crosses a 2.0 boundary -- any backcompat breaks to be worried about?

Reviewer: Where are release notes for this library? I'm having trouble finding them.

Member Author: None as far as I've checked. It looks like they just chose a sudden version jump a few days ago: https://github.com/fabric8io/kubernetes-client/releases?after=kubernetes-client-2.0.0.fuse-000002

Reviewer: Looks like some packages moved around related to Jobs, but we don't use those so it didn't affect us.
fabric8io/kubernetes-client#646

</properties>

<dependencies>
@@ -44,7 +44,7 @@ private[spark] class KubernetesClusterSchedulerBackend(
private val EXECUTOR_MODIFICATION_LOCK = new Object
private val runningExecutorPods = new scala.collection.mutable.HashMap[String, Pod]

-  private val kubernetesMaster = Client.resolveK8sMaster(sc.master)
+  private val kubernetesMaster = "https://kubernetes"
private val executorDockerImage = conf.get(EXECUTOR_DOCKER_IMAGE)
private val kubernetesNamespace = conf.get(KUBERNETES_NAMESPACE)
private val executorPort = conf.getInt("spark.executor.port", DEFAULT_STATIC_PORT)