Create a doc to describe the deployment process. #1159

jlewi · 2018-07-10T17:50:40Z

This is not a proposal but a description of the current state.

/assign @ankushagarwal
/assign @kunmingg

ankushagarwal · 2018-07-10T18:24:34Z

docs_dev/kubeflow_deployment.md

+
+1. We provide simple, platform deployment scripts like this [one for GKE](https://github.com/kubeflow/kubeflow/blob/master/docs/gke/configs/deploy.sh)
+
+1. A corresponding platform specific getting started page ([see here]](https://github.com/kubeflow/website/tree/master/content/docs/started) provides platform specific instructions


Typo near see here

ankushagarwal · 2018-07-11T04:37:28Z

docs_dev/kubeflow_deployment.md

+
+1. We provide simple, platform deployment scripts like this [one for GKE](https://github.com/kubeflow/kubeflow/blob/master/scripts/gke/deploy.sh)
+
+1. A corresponding platform specific getting started page ([see here]](https://github.com/kubeflow/website/tree/master/content/docs/started) provides platform specific instructions


Replace [see here]] with [see here]

pdmack · 2018-07-12T03:00:56Z

/retest

kkasravi · 2018-07-13T14:12:25Z

docs_dev/kubeflow_deployment.md

+
+The scaffolding/prototype for this (bootstrapper) is still in place and we haven't
+rejected this idea completely. So contributions pursuing this idea further would
+be welcome.


@jlewi I liked the design of the bootstrapper running in the 'kubeflow' namespace with elevated privileges. It did seem to have a limitation: it creates/deploys only one Kubeflow instance/namespace, with no specific RBAC mapping to users in that namespace. I wonder if this design might be continued with a local command communicating with the bootstrapper for create, delete, list, add <user/group>, and other subcommands. The bootstrapper would leverage the application as a bundling construct for different types of applications that have different users, cloud providers, and components.

jlewi · 2018-07-13

See #1151. The bootstrapper is turning into a wrapper around ksonnet to support click to deploy, which moves it in the direction of being an app manager. (I hope that providing a REST server in front of ks will eventually become an official part of ksonnet.)

This is needed to support click to deploy, which uses a client-side JavaScript web app, so we can't run ks in that case.

We can also avoid providing elevated permissions via a service account and instead use a bearer token containing user credentials in the request. Ideally that credential could be scoped (e.g. on GCP we don't want to pass a credential that has access to other resources).
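As a rough illustration of the bearer-token idea, the sketch below builds (but does not send) a request that carries the user's credential in an `Authorization` header. The `/apps` path, the JSON body, and the function name are assumptions made for this example; they are not the bootstrapper's actual API.

```python
import json
import urllib.request


def build_deploy_request(base_url: str, user_token: str, app_name: str) -> urllib.request.Request:
    """Build a POST request that authenticates with the caller's own token.

    The endpoint path and payload shape are hypothetical; only the use of an
    "Authorization: Bearer <token>" header reflects the idea discussed above.
    """
    body = json.dumps({"name": app_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/apps",  # hypothetical endpoint
        data=body,
        method="POST",
        headers={
            # The server would act with this user's (ideally scoped) credential
            # instead of its own elevated service account.
            "Authorization": f"Bearer {user_token}",
            "Content-Type": "application/json",
        },
    )
```

The request object is only constructed here, so the server never needs a privileged service account of its own in this model; what the token is allowed to do is decided by whoever issued it.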

I think we should avoid duplicating the functions provided by existing tools e.g. ks and kubectl.

I think the next piece of functionality to go into it will be monitoring; see #1106.

kkasravi · 2018-07-13T14:25:21Z

docs_dev/kubeflow_deployment.md

+
+Here are some **guidelines (not requirements)** for creating the above scripts and instructions
+
+* Platform scripts should assume users are starting from scratch


IMO, platform scripts that create the cluster and its node pools (e.g. Deployment Manager) should be separate from scripts that create/deploy a Kubeflow instance, which in turn should be separate from scripts that create the user/group/PV/PVCs.

jlewi · 2018-07-13

See the comment below about the scripts being linear. My thinking is that providing a single script optimizes for getting started with "one command". If the scripts are linear, it should be pretty straightforward.

If we can have a modular approach, where the work is split into separate scripts that can then be called from a single uber script, that might be nice too. That would probably help with code reuse as well. But I also don't want to make our scripts super complicated.
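A minimal sketch of that modular layout, under the assumption that each stage is its own callable and a single entry point runs them in a fixed, linear order. The step names and the `run_all` helper are hypothetical and only stand in for the separate scripts discussed here.

```python
from typing import Callable, List, Optional, Tuple


# Each function stands in for one separate script (hypothetical names).
def create_cluster(log: List[str]) -> None:
    log.append("create cluster and node pools")


def deploy_kubeflow(log: List[str]) -> None:
    log.append("deploy kubeflow instance")


def setup_users_and_storage(log: List[str]) -> None:
    log.append("create users, groups, PVs and PVCs")


# The "uber script" is just the ordered list of steps.
STEPS: List[Tuple[str, Callable[[List[str]], None]]] = [
    ("cluster", create_cluster),
    ("kubeflow", deploy_kubeflow),
    ("users", setup_users_and_storage),
]


def run_all(only: Optional[List[str]] = None) -> List[str]:
    """Run every step in order, or only the named steps if `only` is given."""
    log: List[str] = []
    for name, step in STEPS:
        if only is None or name in only:
            step(log)
    return log
```

Calling `run_all()` keeps the "one command" experience, while `run_all(only=["kubeflow"])` lets someone who already has a cluster reuse just the deployment step.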

kkasravi · 2018-07-13T14:32:26Z

docs_dev/kubeflow_deployment.md

+opening more. 
+
+The current thinking is to follow the guidance of sig-apps and use an [application resource](https://github.com/kubernetes-sigs/application)
+to represent Kubeflow and attach events, status, and other metrics to that application as appropriate.


In addition to the application CRD noted above, could the SecurityProfile also be leveraged?

jlewi · 2018-07-16

Assuming you mean for security purposes (as opposed to monitoring), then I think so. It looks like SecurityProfile is still very early, though, and not even supported by sig-apps yet.

kkasravi · 2018-07-16

Yes. It is early, but it seems like it could be combined with the application resource, where a UI could allow different components and security profiles to be bundled and deployed.

Is the bootstrapper assuming that only one Kubeflow instance would be deployed, or many? For example:

kubeflow-admin

The privileged namespace above could deploy Kubeflow 'instances' like the ones below, with a different directory (which could include gitops) for each deployment:

kubeflow-teamX

kubeflow-datascientistY

kubeflow-pytorch-gpu-1

jlewi · 2018-07-18

The bootstrapper is becoming a server around ksonnet; see #1151. So in this case the server can be used to manage different ksonnet applications, not just to deploy Kubeflow. Its primary function will be as a templating microservice, e.g.

parameters, ksonnet app -> YAML manifests.
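To make the "parameters, app -> YAML manifests" shape concrete, here is a small sketch of such a templating function. It uses Python's `string.Template` as a stand-in for ksonnet, and the template contents, `APP_TEMPLATES`, and `render_manifests` are all hypothetical; the real service wraps `ks` rather than doing string substitution.

```python
from string import Template
from typing import Dict, List

# Hypothetical stand-in for a ksonnet app: a set of named manifest templates.
APP_TEMPLATES: Dict[str, Template] = {
    "namespace": Template(
        "apiVersion: v1\n"
        "kind: Namespace\n"
        "metadata:\n"
        "  name: $namespace\n"
    ),
}


def render_manifests(params: Dict[str, str], app: Dict[str, Template]) -> List[str]:
    """Pure function: (parameters, app) -> list of rendered YAML documents."""
    # Sort by template name so the output order is deterministic.
    return [app[name].substitute(params) for name in sorted(app)]
```

Because the function only maps inputs to text, the same server could render one app per instance, for example `render_manifests({"namespace": "kubeflow-teamX"}, APP_TEMPLATES)`, without holding any cluster credentials itself.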

jlewi · 2018-07-18T03:01:30Z

@kkasravi Would you mind LGTM'ing, since you're the only one who provided significant feedback? We can continue to discuss/iterate, but it would be good to get it checked in.

/assign @kkasravi

jlewi · 2018-07-21T01:28:05Z

/assign @kunmingg Can you LGTM this? Would be good to get it committed.

kkasravi · 2018-07-21T01:56:42Z

/lgtm

jlewi · 2018-07-22T20:59:06Z

/approve

k8s-ci-robot · 2018-07-22T20:59:08Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jlewi]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

pdmack · 2018-07-22T23:21:49Z

/retest

pdmack · 2018-07-23T12:35:19Z

/retest

jlewi · 2018-07-23T20:48:17Z

Rebased to pick up improvements to the tests.

kkasravi · 2018-07-25T06:03:04Z

/lgtm

jlewi · 2018-07-25T12:44:17Z

/retest

lluunn · 2018-07-27T18:05:57Z

/lgtm

k8s-ci-robot assigned ankushagarwal and kunmingg Jul 10, 2018

k8s-ci-robot requested review from wbuchwalter and willingc July 10, 2018 17:50

k8s-ci-robot added the size/L label Jul 10, 2018

ankushagarwal reviewed Jul 10, 2018

ankushagarwal reviewed Jul 11, 2018

kkasravi reviewed Jul 13, 2018

k8s-ci-robot assigned kkasravi Jul 18, 2018

k8s-ci-robot added the lgtm label Jul 21, 2018

k8s-ci-robot added the approved label Jul 22, 2018

jlewi force-pushed the deploy_doc branch from d329521 to 922209d Compare July 23, 2018 20:47

k8s-ci-robot removed the lgtm label Jul 23, 2018

k8s-ci-robot added the lgtm label Jul 25, 2018

jlewi mentioned this pull request Jul 25, 2018

[Test Flake] simple tf job failing; Job not found waiting for job #1266

Closed

Create a doc to describe the deployment process.

6d987e8

* This is not a proposal but a description of the current state.

jlewi force-pushed the deploy_doc branch from 922209d to 6d987e8 Compare July 27, 2018 12:53

k8s-ci-robot removed the lgtm label Jul 27, 2018

k8s-ci-robot assigned lluunn Jul 27, 2018

k8s-ci-robot added the lgtm label Jul 27, 2018

k8s-ci-robot merged commit 645db22 into kubeflow:master Jul 27, 2018

saffaalvi pushed a commit to StatCan/kubeflow that referenced this pull request Feb 11, 2021

Create a doc to describe the deployment process. (kubeflow#1159)

d9288c7

* This is not a proposal but a description of the current state.
