This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Support providing the driver URI from an external controller #140

Closed
mccheah opened this issue Feb 23, 2017 · 13 comments

@mccheah

mccheah commented Feb 23, 2017

Inspired by #70, a use case @ash211 and I are working with requires the driver to be accessed in a manner other than NodePort. We proposed using an Ingress resource at first, but this primitive is not quite the right tool for the situation.

Instead, from that ticket, we came up with the following solution:

  • Expose a Spark boolean configuration called spark.kubernetes.driver.useExternalUriProvider
  • If the above configuration is set to false, submit the driver using the NodePort service as currently implemented
  • Otherwise, still create the service and pod but place an annotation on the service - say spark/provideExternalIp. The service would be created with type ClusterIP.
  • Some external component that watches the API server can watch for services created with the annotation spark/provideExternalUri. When such a service is created, this component is expected to patch the service with another annotation, spark/resolvedExternalUri, whose value is the external URI that the client should use to route the application submission request to the driver pod.
  • The submission client waits for that annotation to be filled in on the service, and uses its value as the external URI to route to the driver.

This can all be done without any dependency on the third party resource. The user does have to provide their own implementation of the external component, but in this case the API for the driver to receive an externally-routable URI is well-defined.
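
As a rough illustration of the handshake described above, here is a minimal Java sketch. It does not use a real Kubernetes client: the service's annotations are modelled as a plain `Map`, and the class name `ExternalUriHandshake`, its method names, and the polling approach are all assumptions made for this example. The annotation names are the ones written in the proposal and may change.

```java
import java.time.Duration;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

/** Sketch only: annotations are a plain Map, not a real Kubernetes Service object. */
final class ExternalUriHandshake {
    // Annotation names as written in the proposal; they are not final.
    static final String PROVIDE = "spark/provideExternalUri";
    static final String RESOLVED = "spark/resolvedExternalUri";

    private ExternalUriHandshake() {}

    /**
     * External component side (hypothetical helper): if the service asks for an
     * external URI, patch in the resolved one. Returns true if a patch was made.
     */
    static boolean resolveIfRequested(Map<String, String> annotations, String externalUri) {
        if (!annotations.containsKey(PROVIDE)) {
            return false; // not a service this component is responsible for
        }
        annotations.put(RESOLVED, externalUri);
        return true;
    }

    /**
     * Submission client side (hypothetical helper): poll the annotations until the
     * resolved URI appears, or give up once the timeout has elapsed.
     */
    static Optional<String> awaitResolvedUri(
            Supplier<Map<String, String>> fetchAnnotations,
            Duration timeout,
            Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            String uri = fetchAnnotations.get().get(RESOLVED);
            if (uri != null && !uri.isEmpty()) {
                return Optional.of(uri);
            }
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty(); // timed out without a resolved URI
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return Optional.empty();
            }
        }
    }
}
```

In a real client the supplier would re-read the service from the API server (or a watch would replace the polling loop); that part is deliberately left out here.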

@ash211

ash211 commented Feb 23, 2017

@mccheah instead of an external IP do you mean an external URI? So change the labels to spark/provideExternalUri and spark/resolvedExternalUri.

Additionally I suspect that there's a k8s scheme to labels that we should be following. Something like alpha.spark.kubernetes.io/provideExternalUri. @foxish are you aware of docs related to label schemes?

@mccheah
Author

mccheah commented Feb 23, 2017

Another tricky thing is that there are two URIs that need to be resolved: the Spark UI and the driver submission port. Perhaps the one annotation should be the base path of both of these components, and then we fill in the rest of the path consistently, maybe with /spark-ui and /spark-driver-submit?
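
To make the base-path idea concrete, a small Java sketch follows. The class `DriverUris` and its method are hypothetical names for this example, and it assumes the base URI carries no query string or fragment; only the two path suffixes come from the comment above.

```java
import java.net.URI;

/** Sketch only: derives both endpoints from one base URI taken from the annotation. */
final class DriverUris {
    // Suffixes floated in the comment above; not a settled convention.
    static final String UI_SUFFIX = "spark-ui";
    static final String SUBMIT_SUFFIX = "spark-driver-submit";

    private DriverUris() {}

    /** Appends a path segment to the base, tolerating a trailing slash on the base. */
    static URI under(URI base, String suffix) {
        String text = base.toString();
        String withSlash = text.endsWith("/") ? text : text + "/";
        return URI.create(withSlash + suffix);
    }
}
```

With a base of `https://gateway.example.invalid/apps/app-1` (a made-up address), `DriverUris.under(base, DriverUris.UI_SUFFIX)` yields `https://gateway.example.invalid/apps/app-1/spark-ui`, and the submission endpoint is built the same way.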

@foxish
Member

foxish commented Feb 23, 2017

There is a discussion about convention here: kubernetes/kubernetes#30822, but I don't know of any docs.
As per the common pattern I see, we can go with spark-job.alpha.apache.org/field-name

@foxish
Member

foxish commented Feb 23, 2017

@mccheah Do the two URLs have to be related necessarily? I can imagine people wanting to keep them separate, perhaps one serving over the apiserver proxy and the other using some custom ingress based solution.

@mccheah
Author

mccheah commented Feb 23, 2017

They do not necessarily have to be related - but the user would have to provide two annotations, and the driver submission specific URI seems to be an implementation detail I suppose.

@mccheah
Author

mccheah commented Feb 23, 2017

Although I suppose for the driver submission, it only strictly needs the driver submission base path. So we can make the contract only require this annotation but recommend in the docs that the user also configure a route to the Spark UI.

@foxish
Member

foxish commented Feb 23, 2017

Yeah. The submission URI is an implementation detail which a user shouldn't need access to. The UI port is part of the status, so, it should ideally live in the SparkJob TPR which lets us expose it to the user.

@mccheah
Author

mccheah commented Feb 24, 2017

@foxish there's another option here where we can provide a Java interface that can be service-loaded to provide the Service's external URI. We could make the default and only implementation we ship with as being the NodePort approach as we do currently. A custom implementation could for example use Ingress resources as was done in #70. Thoughts?
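
A minimal sketch of what such a service-loaded interface might look like, using the standard `java.util.ServiceLoader`. Everything named here (`DriverUriProviders`, `DriverServiceUriProvider`, `DriverService`, `NodePortUriProvider`) is hypothetical and not taken from the project; the node host and port are passed in directly rather than read from Kubernetes.

```java
import java.net.URI;
import java.util.ServiceLoader;

/** Sketch only: a pluggable way to compute the driver service's external URI. */
final class DriverUriProviders {
    private DriverUriProviders() {}

    /** Minimal stand-in for the facts a provider needs about the driver service. */
    public static final class DriverService {
        final String name;
        final String nodeHost;
        final int nodePort;

        public DriverService(String name, String nodeHost, int nodePort) {
            this.name = name;
            this.nodeHost = nodeHost;
            this.nodePort = nodePort;
        }
    }

    /** Hypothetical plugin interface that custom implementations would provide. */
    public interface DriverServiceUriProvider {
        URI externalUri(DriverService service);
    }

    /** Default behaviour in this sketch: reach the driver through a node's NodePort. */
    public static final class NodePortUriProvider implements DriverServiceUriProvider {
        @Override
        public URI externalUri(DriverService service) {
            return URI.create("http://" + service.nodeHost + ":" + service.nodePort);
        }
    }

    /** Returns the first provider found by ServiceLoader, else the NodePort default. */
    static DriverServiceUriProvider load() {
        for (DriverServiceUriProvider provider : ServiceLoader.load(DriverServiceUriProvider.class)) {
            return provider;
        }
        return new NodePortUriProvider();
    }
}
```

A custom implementation (for example one that creates an Ingress rule) would be registered through a provider-configuration file under `META-INF/services` named after the interface's binary name. With nothing registered, `load()` falls back to the NodePort provider.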

@mccheah
Author

mccheah commented Feb 24, 2017

If we think that the original model outlined above is a useful general implementation as well then we can also ship with that.

@foxish
Member

foxish commented Feb 24, 2017

> @foxish there's another option here where we can provide a Java interface that can be service-loaded to provide the Service's external URI. We could make the default and only implementation we ship with as being the NodePort approach as we do currently. A custom implementation could for example use Ingress resources as was done in #70. Thoughts?

That seems fine to me. Is there some advantage of using the service-loader approach over the annotation?

@ash211

ash211 commented Feb 24, 2017

It can be done entirely in-process, so it doesn't require the setup/maintenance/resources of a controller that responds to the annotation in an environment-specific way.

@foxish
Member

foxish commented Feb 24, 2017

I see. Your particular implementation would create an ingress rule, wait for its readiness, and then return that, I imagine. That "plugin" could even be published separately? That sounds fine to me and it seems like it would simplify your use case. Do you envision the service-loader mechanism coexisting with the annotation? The existence of the annotation makes sense to me because it is an extension point that doesn't require knowledge of Scala or the k8s client abstractions to use.

@ash211

ash211 commented Feb 24, 2017

The service-loader and the annotation-based controller approaches can exist independently -- we're more interested in the first but can create both for greater flexibility. The second is already on its way in #147


3 participants