Update docs for connecting to Kubeflow Pipelines from the same cluster in multi-user mode (#2905)

* Update docs for connecting to Kubeflow Pipelines from the same cluster

* Update content/en/docs/components/pipelines/sdk/connect-api.md

Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>

* Improve in-cluster access to KFP API documentation

* Update documentation

Co-authored-by: Bart <bartlomiej.grasza@gmail.com>
Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
3 people authored Oct 11, 2021
1 parent fad6acc commit 4c74471
Showing 2 changed files with 136 additions and 15 deletions.
18 changes: 4 additions & 14 deletions content/en/docs/components/pipelines/multi-user.md
@@ -167,19 +167,9 @@ without access control:
* Artifacts, Executions, and other metadata entities in [Machine Learning Metadata (MLMD)](https://www.tensorflow.org/tfx/guide/mlmd).
* [Minio artifact storage](https://min.io/) which contains pipeline runs' input/output artifacts.

## In-cluster API request authentication

Refer to [Connect to Kubeflow Pipelines from the same cluster](/docs/components/pipelines/sdk/connect-api/#connect-to-kubeflow-pipelines-from-the-same-cluster) for details.

Alternatively, in-cluster workloads like Jupyter notebooks or cron tasks can also access the Kubeflow Pipelines API through the public endpoint. This option is platform specific and explained in
[Connect to Kubeflow Pipelines from outside your cluster](/docs/components/pipelines/sdk/connect-api/#connect-to-kubeflow-pipelines-from-outside-your-cluster).
133 changes: 132 additions & 1 deletion content/en/docs/components/pipelines/sdk/connect-api.md
@@ -53,7 +53,7 @@ because it requires authentication. Refer to distribution specific documentation

### Connect to Kubeflow Pipelines from the same cluster

#### Non-multi-user mode

As mentioned above, the Kubeflow Pipelines API Kubernetes service is `ml-pipeline-ui`.

@@ -85,6 +85,137 @@

```python
client = kfp.Client(host=f'http://ml-pipeline-ui.{namespace}:80')
print(client.list_experiments())
```

#### Multi-User mode

Note: the technical details of multi-user mode are covered in the [How Multi-User mode in-cluster authentication works](#how-multi-user-mode-in-cluster-authentication-works) section below.

Choose your use-case from one of the options below:

* **Access Kubeflow Pipelines from Jupyter notebook**

To **access Kubeflow Pipelines from a Jupyter notebook**, an additional per-namespace (profile) manifest is required:

```yaml
apiVersion: kubeflow.org/v1alpha1
kind: PodDefault
metadata:
  name: access-ml-pipeline
  namespace: "<YOUR_USER_PROFILE_NAMESPACE>"
spec:
  desc: Allow access to Kubeflow Pipelines
  selector:
    matchLabels:
      access-ml-pipeline: "true"
  volumes:
    - name: volume-kf-pipeline-token
      projected:
        sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 7200
              audience: pipelines.kubeflow.org
  volumeMounts:
    - mountPath: /var/run/secrets/kubeflow/pipelines
      name: volume-kf-pipeline-token
      readOnly: true
  env:
    - name: KF_PIPELINES_SA_TOKEN_PATH
      value: /var/run/secrets/kubeflow/pipelines/token
```
After the manifest is applied, newly created Jupyter notebooks contain an additional option in the **configurations** section.
Read more about **configurations** in the [Jupyter notebook server](/docs/components/notebooks/setup/#create-a-jupyter-notebook-server-and-add-a-notebook) guide.
Note that `kfp.Client` expects the token either in the `KF_PIPELINES_SA_TOKEN_PATH` environment variable or
mounted at `/var/run/secrets/kubeflow/pipelines/token`. Do not change these values in the manifest.
Similarly, the `audience` field should not be modified. No additional setup is required to refresh tokens.

Remember that this setup has to be repeated for each namespace (profile) that should have access to the Kubeflow Pipelines API from within Jupyter notebooks.
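As a quick way to confirm the token is visible from inside a notebook, the lookup the SDK performs can be sketched in a few lines of standard-library Python. This is a minimal sketch, not the KFP SDK's actual code; the helper names are illustrative:

```python
import os

# Default mount path used by the PodDefault manifest above.
DEFAULT_TOKEN_PATH = "/var/run/secrets/kubeflow/pipelines/token"

def resolve_token_path():
    # The KF_PIPELINES_SA_TOKEN_PATH environment variable takes precedence;
    # otherwise fall back to the well-known mount path.
    return os.environ.get("KF_PIPELINES_SA_TOKEN_PATH", DEFAULT_TOKEN_PATH)

def read_token():
    # Re-read on every call: projected tokens are rotated by the kubelet
    # well before the 7200s expiration set in the manifest.
    with open(resolve_token_path()) as f:
        return f.read().strip()
```

Inside a notebook created with the configuration enabled, `kfp.Client()` performs an equivalent lookup automatically, so no explicit token handling is needed.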

* **Access Kubeflow Pipelines from within any Pod**

In this case, the configuration is similar to the Jupyter notebook case described above.
The Pod manifest has to be extended with a projected `serviceAccountToken` volume, mounted either at the path set in the
`KF_PIPELINES_SA_TOKEN_PATH` environment variable or at the default `/var/run/secrets/kubeflow/pipelines/token`.

The manifest below shows an example Pod with the token mounted at `/var/run/secrets/kubeflow/pipelines/token`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: access-kfp-example
  namespace: my-namespace
spec:
  containers:
    - image: my-image:latest
      name: access-kfp-example
      volumeMounts:
        - mountPath: /var/run/secrets/kubeflow/pipelines
          name: volume-kf-pipeline-token
          readOnly: true
  volumes:
    - name: volume-kf-pipeline-token
      projected:
        sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 7200
              audience: pipelines.kubeflow.org
```
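When Pod specs are generated programmatically rather than written by hand, the projected volume and its mount can be kept in one place. A minimal standard-library sketch of the same spec as Python dicts (the function and constant names are illustrative, not a KFP API):

```python
def kfp_token_volume(expiration_seconds=7200):
    # Projected ServiceAccountToken volume matching the YAML manifest above.
    # The audience must match what the Kubeflow Pipelines API server expects.
    return {
        "name": "volume-kf-pipeline-token",
        "projected": {
            "sources": [
                {
                    "serviceAccountToken": {
                        "path": "token",
                        "expirationSeconds": expiration_seconds,
                        "audience": "pipelines.kubeflow.org",
                    }
                }
            ]
        },
    }

# Corresponding volumeMount entry for any container that needs the token.
KFP_TOKEN_VOLUME_MOUNT = {
    "mountPath": "/var/run/secrets/kubeflow/pipelines",
    "name": "volume-kf-pipeline-token",
    "readOnly": True,
}
```

These dicts can be merged into a Pod spec before serializing it to YAML or submitting it with a Kubernetes client.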

##### Managing cross-namespaces access to Kubeflow Pipelines API

As already mentioned, access to the Kubeflow Pipelines API requires a per-namespace setup.
Alternatively, you can configure the access in a single namespace and allow other
namespaces to access the Kubeflow Pipelines API through it.

Note: the examples below assume that `namespace-1` is the namespace (profile) that will be granted access to the Kubeflow Pipelines API
through the `namespace-2` namespace. `namespace-2` should already be configured to access the Kubeflow Pipelines API.

Cross-namespace access can be achieved in two ways:

* **With additional RBAC settings.**

With this option, only `namespace-2` needs to have the `PodDefault` manifest configured.

Access is granted by giving `namespace-1`'s `ServiceAccount/default-editor` the `ClusterRole/kubeflow-edit` in `namespace-2`:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubeflow-edit-namespace-1
  namespace: namespace-2
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubeflow-edit
subjects:
  - kind: ServiceAccount
    name: default-editor
    namespace: namespace-1
```
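If many namespaces need this grant, the RoleBinding can be templated instead of copied by hand. A standard-library sketch that emits the manifest above for any source/target pair (the function name is illustrative):

```python
def kubeflow_edit_binding(src_namespace, dst_namespace):
    # Grants src_namespace's default-editor ServiceAccount the
    # kubeflow-edit ClusterRole inside dst_namespace.
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {
            "name": f"kubeflow-edit-{src_namespace}",
            "namespace": dst_namespace,
        },
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": "kubeflow-edit",
        },
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": "default-editor",
                "namespace": src_namespace,
            }
        ],
    }
```

Serializing the returned dict with a YAML library and piping it to `kubectl apply -f -` reproduces the manifest above.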
* **By sharing access to the other profile.**

In this scenario, access is granted by `namespace-2` adding `namespace-1` as a
[contributor](https://www.kubeflow.org/docs/components/multi-tenancy/getting-started/#managing-contributors-through-the-kubeflow-ui).
Specifically, the owner of `namespace-2` opens the "Manage Contributors" page in the Kubeflow UI and adds the email address
associated with `namespace-1` to the "Contributors to your namespace" field.
##### How Multi-User mode in-cluster authentication works

Authentication uses ServiceAccountToken
[projection](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection).
Simply put, the token is first injected into a Pod (for example, a Jupyter notebook Pod).
The Kubeflow Pipelines SDK then uses this token to authenticate against the Kubeflow Pipelines API.
It is important to understand that the `serviceAccountToken` method respects Kubeflow Pipelines RBAC
and does not allow access beyond what the ServiceAccount running the notebook Pod has.
More details about `PodDefault` can be found [here](https://github.com/kubeflow/kubeflow/blob/master/components/admission-webhook/README.md).
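Conceptually, the client side of this flow amounts to reading the projected token and presenting it as a bearer credential, which the API server then validates (audience included) against the Kubernetes TokenReview API. A hedged standard-library sketch of that client-side step, not the SDK's actual implementation:

```python
import os

def build_auth_header(token_path=None):
    # Read the projected ServiceAccount token and wrap it in the
    # Authorization header format the API server expects.
    path = token_path or os.environ.get(
        "KF_PIPELINES_SA_TOKEN_PATH",
        "/var/run/secrets/kubeflow/pipelines/token",
    )
    with open(path) as f:
        return {"Authorization": f"Bearer {f.read().strip()}"}
```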
## Configure SDK client by environment variables

It's usually beneficial to configure the Kubeflow Pipelines SDK client using Kubeflow Pipelines environment variables,
