diff --git a/content/en/docs/components/pipelines/multi-user.md b/content/en/docs/components/pipelines/multi-user.md
index f9c2c009e5..74326b64c1 100644
--- a/content/en/docs/components/pipelines/multi-user.md
+++ b/content/en/docs/components/pipelines/multi-user.md
@@ -167,19 +167,9 @@ without access control:
 * Artifacts, Executions, and other metadata entities in [Machine Learning Metadata (MLMD)](https://www.tensorflow.org/tfx/guide/mlmd).
 * [Minio artifact storage](https://min.io/) which contains pipeline runs' input/output artifacts.
 
-### In-cluster API request authentication
+## In-cluster API request authentication
 
-Clients can only access the Kubeflow Pipelines API from the public endpoint
-that enforces authentication.
+Refer to [Connect to Kubeflow Pipelines from the same cluster](/docs/components/pipelines/sdk/connect-api/#connect-to-kubeflow-pipelines-from-the-same-cluster) for details.
 
-In-cluster direct access to the API endpoint is denied by Istio authorization
-policies, because there's no secure way to authenticate in-cluster requests to
-the Kubeflow Pipelines API server yet.
-
-If you need to access the API endpoint from in-cluster workload like Jupyter
-notebooks or cron tasks, current suggested workaround is to connect through
-public endpoint and follow platform specific documentation to authenticate
-programmatically using user credentials. For Google Cloud, you can refer to
-[Connecting to Kubeflow Pipelines in a full Kubeflow deployment on Google Cloud](/docs/gke/pipelines/authentication-sdk/#connecting-to-kubeflow-pipelines-in-a-full-kubeflow-deployment).
-
-There is work-in-progress to support this use-case, refer to [github issue #5138](https://github.com/kubeflow/pipelines/issues/5138).
+Alternatively, in-cluster workloads like Jupyter notebooks or cron tasks can also access the Kubeflow Pipelines API through the public endpoint.
+This option is platform specific and explained in
+[Connect to Kubeflow Pipelines from outside your cluster](/docs/components/pipelines/sdk/connect-api/#connect-to-kubeflow-pipelines-from-outside-your-cluster).
diff --git a/content/en/docs/components/pipelines/sdk/connect-api.md b/content/en/docs/components/pipelines/sdk/connect-api.md
index 8c7758e329..43bd3a3418 100644
--- a/content/en/docs/components/pipelines/sdk/connect-api.md
+++ b/content/en/docs/components/pipelines/sdk/connect-api.md
@@ -53,7 +53,7 @@ because it requires authentication. Refer to distribution specific documentation
 
 ### Connect to Kubeflow Pipelines from the same cluster
 
-Note, this is not supported right now for multi-user Kubeflow Pipelines, refer to [Multi-User Isolation for Pipelines -- Current Limitations](/docs/components/pipelines/multi-user/#current-limitations).
+#### Non-multi-user mode
 
 As mentioned above, the Kubeflow Pipelines API Kubernetes service is `ml-pipeline-ui`.
@@ -85,6 +85,137 @@ client = kfp.Client(host=f'http://ml-pipeline-ui.{namespace}:80')
 print(client.list_experiments())
 ```
 
+#### Multi-User mode
+
+Note: the technical details of multi-user mode are covered in the [How Multi-User mode in-cluster authentication works](#how-multi-user-mode-in-cluster-authentication-works) section below.
+
+Choose your use-case from one of the options below:
+
+* **Access Kubeflow Pipelines from a Jupyter notebook**
+
+  To **access Kubeflow Pipelines from a Jupyter notebook**, an additional per-namespace (profile) manifest is required:
+
+  ```yaml
+  apiVersion: kubeflow.org/v1alpha1
+  kind: PodDefault
+  metadata:
+    name: access-ml-pipeline
+    namespace: "<YOUR_USER_PROFILE_NAMESPACE>"
+  spec:
+    desc: Allow access to Kubeflow Pipelines
+    selector:
+      matchLabels:
+        access-ml-pipeline: "true"
+    volumes:
+      - name: volume-kf-pipeline-token
+        projected:
+          sources:
+            - serviceAccountToken:
+                path: token
+                expirationSeconds: 7200
+                audience: pipelines.kubeflow.org
+    volumeMounts:
+      - mountPath: /var/run/secrets/kubeflow/pipelines
+        name: volume-kf-pipeline-token
+        readOnly: true
+    env:
+      - name: KF_PIPELINES_SA_TOKEN_PATH
+        value: /var/run/secrets/kubeflow/pipelines/token
+  ```
+
+  After the manifest is applied, newly created Jupyter notebooks contain an additional option in the **configurations** section.
+  Read more about **configurations** in [Jupyter notebook server](/docs/components/notebooks/setup/#create-a-jupyter-notebook-server-and-add-a-notebook).
+
+  Note: `kfp.Client` expects the token either in the `KF_PIPELINES_SA_TOKEN_PATH` environment variable or
+  mounted at `/var/run/secrets/kubeflow/pipelines/token`. Do not change these values in the manifest.
+  Similarly, the `audience` field should not be modified. No additional setup is required to refresh tokens.
+
+  Remember that this setup has to be repeated for each namespace (profile) that should have access to the Kubeflow Pipelines API from within a Jupyter notebook.
+
+* **Access Kubeflow Pipelines from within any Pod**
+
+  In this case, the configuration is similar to the Jupyter notebook case described above.
+  The Pod manifest has to be extended with a projected volume, and the token has to be exposed
+  through either the `KF_PIPELINES_SA_TOKEN_PATH` environment variable or the default
+  `/var/run/secrets/kubeflow/pipelines/token` mount path.
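+
+  Once the token is projected into a Pod this way, creating a client from inside it is
+  straightforward. The sketch below assumes a recent kfp SDK version that automatically
+  picks up the `KF_PIPELINES_SA_TOKEN_PATH` environment variable; `my-namespace` is a
+  placeholder for your own profile namespace:
+
+  ```python
+  import kfp
+
+  # Inside the cluster, kfp.Client() detects the projected ServiceAccount
+  # token via KF_PIPELINES_SA_TOKEN_PATH and authenticates with it.
+  client = kfp.Client()
+
+  # Multi-user mode scopes resources to profiles, so pass your namespace.
+  print(client.list_experiments(namespace="my-namespace"))
+  ```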
+
+  The manifest below shows an example Pod with the token mounted at `/var/run/secrets/kubeflow/pipelines/token`:
+
+  ```yaml
+  apiVersion: v1
+  kind: Pod
+  metadata:
+    name: access-kfp-example
+    namespace: my-namespace
+  spec:
+    containers:
+      - image: my-image:latest
+        name: access-kfp-example
+        volumeMounts:
+          - mountPath: /var/run/secrets/kubeflow/pipelines
+            name: volume-kf-pipeline-token
+            readOnly: true
+    volumes:
+      - name: volume-kf-pipeline-token
+        projected:
+          sources:
+            - serviceAccountToken:
+                path: token
+                expirationSeconds: 7200
+                audience: pipelines.kubeflow.org
+  ```
+
+##### Managing cross-namespace access to Kubeflow Pipelines API
+
+As already mentioned, access to the Kubeflow Pipelines API requires per-namespace setup.
+Alternatively, you can configure the access in a single namespace and allow other
+namespaces to access the Kubeflow Pipelines API through it.
+
+Note: the examples below assume that `namespace-1` is a namespace (profile) that will be granted access to the Kubeflow Pipelines API
+through the `namespace-2` namespace. `namespace-2` should already be configured to access the Kubeflow Pipelines API.
+
+Cross-namespace access can be achieved in two ways:
+
+* **With additional RBAC settings.**
+
+  This option requires only `namespace-2` to have the `PodDefault` manifest configured.
+
+  Access is granted by giving `namespace-1:ServiceAccount/default-editor` the `ClusterRole/kubeflow-edit` in `namespace-2`:
+
+  ```yaml
+  apiVersion: rbac.authorization.k8s.io/v1
+  kind: RoleBinding
+  metadata:
+    name: kubeflow-edit-namespace-1
+    namespace: namespace-2
+  roleRef:
+    apiGroup: rbac.authorization.k8s.io
+    kind: ClusterRole
+    name: kubeflow-edit
+  subjects:
+    - kind: ServiceAccount
+      name: default-editor
+      namespace: namespace-1
+  ```
+
+* **By sharing access to the other profile.**
+
+  In this scenario, access is granted by `namespace-2` adding `namespace-1` as a
+  [contributor](https://www.kubeflow.org/docs/components/multi-tenancy/getting-started/#managing-contributors-through-the-kubeflow-ui).
+  Specifically, the owner of `namespace-2` opens the "Manage Contributors" page in the Kubeflow UI and, in the
+  "Contributors to your namespace" field, adds the email address associated with `namespace-1`.
+
+##### How Multi-User mode in-cluster authentication works
+
+Authentication uses ServiceAccountToken
+[projection](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection).
+Simply put, the token is first injected into a Pod (for example, a Jupyter notebook Pod).
+The Kubeflow Pipelines SDK then uses this token to authenticate against the Kubeflow Pipelines API.
+
+It is important to understand that the `serviceAccountToken` method respects Kubeflow Pipelines RBAC
+and does not allow access beyond what the ServiceAccount running the notebook Pod has.
+
+More details about `PodDefault` can be found [here](https://github.com/kubeflow/kubeflow/blob/master/components/admission-webhook/README.md).
+
 ## Configure SDK client by environment variables
 
 It's usually beneficial to configure the Kubeflow Pipelines SDK client using Kubeflow Pipelines environment variables,