
Expose Pipelines as CRD and enable to easy migration from Argo workflow #1132

Closed
inc0 opened this issue Apr 10, 2019 · 15 comments

Comments

@inc0

inc0 commented Apr 10, 2019

Currently, to use Pipelines you need to use the Python SDK, and even that only generates an Argo workflow underneath. I think this is very limiting because:

  1. not everyone uses Python
  2. it requires learning a whole new API and DSL
  3. Argo already has a lot of examples; it's a shame we can't tap into this knowledge source
  4. Argo can do much more than data pipelines: you can learn one syntax and use it for data, CI, CD, etc.

I propose creating a new CRD that would effectively be an Argo workflow with additional options.
For example:

apiVersion: kubeflow.org
kind: Pipeline
metadata:
  generateName:  mlapp-
  labels:
    workflow: mlapp
spec:
# Add some useful pipeline specific data
  model_name: foobar
  model_version: 1
# This is just argo workflow spec
  entrypoint: mlapp
  templates:
  - name: mlapp
    dag:
      tasks:
      - name: preprocess
        template: preprocess

      - name: model1
        dependencies: [preprocess]
        template: train
        arguments:
          artifacts:
          - name: dataset
            from: "{{tasks.preprocess.outputs.artifacts.dataset}}"

  - name: preprocess
    container:
      image: myimage:latest
      name: preprocess
      command: ["python", "/src/preprocess.py"]
      env:
        - name: SOMEENV
          value: foobar
    outputs:
      artifacts:
      - name: dataset
        path: /data

  - name: train
    inputs:
      artifacts:
      - name: dataset
        path: /data
    outputs:
      artifacts:
      - name: model
        path: /output
    container:
      image: myimage:latest
      name: trainer
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU
      command: ["python", "/src/train.py"]

This would make the transition to Pipelines much easier, as Operators are already a well-known pattern and they handle a lot of things for us, including RBAC, multi-tenancy, API auth, etc.
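As a hedged sketch of what such an operator's reconcile step might do, the function below splits the proposed Pipeline object into a plain Argo Workflow plus the pipeline-specific metadata. The field names model_name and model_version come from the example above; the function itself is hypothetical and not an existing KFP API:

```python
# Hypothetical sketch: convert the proposed Pipeline CRD object into a plain
# Argo Workflow by splitting off the pipeline-specific fields. Field names
# mirror the YAML example above; nothing here is an actual KFP API.
ML_FIELDS = {"model_name", "model_version"}

def pipeline_to_workflow(pipeline: dict) -> tuple[dict, dict]:
    spec = dict(pipeline["spec"])
    # Pull out the ML-specific fields; everything left is a valid Argo spec.
    ml_metadata = {k: spec.pop(k) for k in list(spec) if k in ML_FIELDS}
    workflow = {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": dict(pipeline["metadata"]),
        "spec": spec,
    }
    return workflow, ml_metadata
```

The operator would then submit the resulting Workflow to Argo and record the ML metadata separately.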

@Ark-kun Ark-kun self-assigned this Apr 10, 2019
@Ark-kun
Contributor

Ark-kun commented Apr 10, 2019

I'm not sure this is needed.
Currently KF Pipelines uses the Argo Workflow CRD without changes. Pipelines do not extend it; there are no extra pipeline-specific fields.

If we decide to replace Argo, then we'll create a new CRD.

not everyone uses python
requires to learn whole new API and DSL

I do not think KF Pipelines requires you to do that.
Pipelines Python SDK just allows some people to write

preprocess = load_component(...)
train = load_component(...)

@pipeline
def mlapp():
    train(preprocess(train_set).output)

instead of writing the YAML manually.

@inc0
Author

inc0 commented Apr 11, 2019

So, if I submitted an Argo workflow, would it be picked up by Pipelines immediately? How, for example, would it save metrics?

@vicaire
Contributor

vicaire commented Apr 11, 2019

Hi inc0@, having a CRD for pipelines is being considered. We are planning to implement this in multiple steps:

  • First, we will create a pipeline spec that combines an Argo workflow with additional data needed for ML pipelines.
  • Initially, this spec will be processed by the pipeline API server and turned into an Argo workflow.
  • Later on, we could turn this pipeline spec into a standalone CRD.
  • The long-term expectation is that the pipeline CRD will let us combine multiple orchestration CRDs useful for ML (Argo workflow, HP tuning, etc.) and let users specify additional, optional ML metadata.

@Ark-kun
Contributor

Ark-kun commented Apr 12, 2019

How, for example, will it save metrics?

To provide metrics, the workflow task must have an output artifact called 'mlpipeline-metrics'.
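For illustration, a pipeline step could emit that artifact by writing a JSON file in the KFP metrics format; the metric name and value below are made up, and the output path must match the path declared for the 'mlpipeline-metrics' artifact in the workflow spec:

```python
import json

# Write metrics in the JSON format the KFP UI understands. The step's workflow
# spec must declare this file as an output artifact named 'mlpipeline-metrics'.
# Metric name, value, and path are illustrative.
metrics = {
    "metrics": [
        {"name": "accuracy-score", "numberValue": 0.92, "format": "PERCENTAGE"},
    ]
}
with open("/tmp/mlpipeline-metrics.json", "w") as f:
    json.dump(metrics, f)
```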

So, if I'd submit argo workflow, it will be picked up by pipelines immediatly?

You have to submit the workflow through the Pipelines API.
You can use either the Python client (kfp.Client(...).run_pipeline(...)) or the CLI:
https://github.com/kubeflow/pipelines/tree/master/backend/src/cmd/ml

Note that this is not considered a supported mode of operation. It may break in the future.

@Ark-kun Ark-kun closed this as completed Apr 12, 2019
@vicaire
Contributor

vicaire commented Apr 12, 2019

@Ark-kun, having a CRD for pipeline is something that we are considering. Let's please keep this open.

@vicaire vicaire reopened this Apr 12, 2019
@vicaire vicaire assigned IronPan and unassigned vicaire Jul 16, 2019
@yanniszark
Contributor

Adding to this, a Pipelines CRD would also provide a path to multi-user pipelines, since Kubernetes CRDs get built-in authentication and authorization via the API server, like any other Kubernetes object.
As such, there may be some overlap with #1223.

@stale

stale bot commented Jun 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Jun 25, 2020
@Bobgy
Contributor

Bobgy commented Jun 26, 2020

/lifecycle frozen

I think this is something we'd want to consider for the long term.

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen and removed lifecycle/stale The issue / pull request is stale, any activities remove this label. labels Jun 26, 2020
@alexlatchford
Contributor

Chiming in here, more background in this Slack thread.

Our use case at Zillow is to be able to deploy monitoring alongside scheduled pipelines. We use Datadog internally and have created a K8s operator for creating Datadog Monitors (essentially alerts triggered by metrics crossing thresholds); it simply reconciles the state of the resources with the Datadog API.

We would like to be able to use a standard kubectl apply (or better, kubectl apply -k with kustomize) to deploy a ScheduledWorkflow CRD (see these samples) alongside these custom DatadogMonitor CRD resources. This is an extensible pattern: in the future we are planning to build a Datadog Dashboards operator so we could dynamically create dashboards on a per-ScheduledWorkflow basis (useful for defining and monitoring SLOs, for instance).

This would also allow us to unify our CI/CD pipeline with KFServing. We have essentially the same pattern there: we generate a set of resource manifests using kustomize, in that case an InferenceService plus a set of DatadogMonitors. Since our underlying core K8s team already has CI/CD pipelines for running kubectl apply -k internally, this would let us drop the custom CI/CD pipelines we currently maintain atop the kfp CLI/SDK tooling (the public interfaces KFP exposes today) and align wholly with the rest of our company, reducing maintenance overhead!
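A hedged sketch of the deployment layout being described (file names are illustrative, and DatadogMonitor refers to Zillow's internal CRD, not a public API):

```yaml
# kustomization.yaml -- deploy a scheduled pipeline together with its monitors
# (illustrative sketch; the referenced resource files are made up)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - scheduled-workflow.yaml   # the ScheduledWorkflow CRD resource
  - datadog-monitors.yaml     # the DatadogMonitor CRD resources
```

Everything would then go out with a single kubectl apply -k .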

@Bobgy
Contributor

Bobgy commented Feb 3, 2021

@alexlatchford for clarification, does the use case only apply to ScheduledWorkflow?

It sounds to me like one-time pipeline runs do not need a CRD interface.

@alexlatchford
Contributor

I think we'd ideally prefer to use the same CI/CD pipeline regardless, so I imagine we'd use ScheduledWorkflow in this mode just to unify the deployment process.

@rubenaranamorera

Is this something that is still being considered? It would be nice to have pipeline CRDs so we could integrate pipelines with GitOps without losing all the UI capabilities.

@kujon

kujon commented Oct 6, 2021

  • First, we will create a pipeline spec that combines an Argo workflow with additional data needed for ML pipelines.
  • Initially, this spec will be processed by the pipeline API server and turned into an Argo workflow.
  • Later on, we could turn this pipeline spec into a standalone CRD.
  • The long-term expectation is that the pipeline CRD will let us combine multiple orchestration CRDs useful for ML (Argo workflow, HP tuning, etc.) and let users specify additional, optional ML metadata.

@vicaire as I understand it, steps 1 and 2 have been completed; are there still plans to introduce a standalone CRD? Having to rely on the Python SDK and submit files to the Kubeflow API instead of the Kubernetes API makes Kubeflow a really hard sell. In our case, dedicated CI/CD workflows need to be developed, and we can't rely on any of the tooling (e.g. helm-secrets) that works with virtually anything else deployed onto Kubernetes.

@chensun
Member

chensun commented Jan 18, 2023

Currently, there's no plan to make pipelines a CRD. In fact, we are moving to make pipelines platform-agnostic.

@chensun chensun closed this as completed Jan 18, 2023
@laurence-hudson-mindfoundry

I have the same use case as kujon, rubenaranamorera, and alexlatchford. We deploy things using a Flux-based GitOps workflow. The lack of an option to declaratively define Kubeflow pipelines as Kubernetes resource objects that can be kubectl apply'ed is a pain, and seems like a departure from K8s norms. It also seems inconsistent with other KF components like KServe, where you have InferenceService resource objects, etc.
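For contrast, the declarative KServe style referred to here looks roughly like the following (a sketch; the resource name and storage URI are placeholders):

```yaml
# An InferenceService is a plain Kubernetes object that can be applied
# declaratively via kubectl or a GitOps tool. Values are placeholders.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-model                          # placeholder
spec:
  predictor:
    sklearn:
      storageUri: gs://my-bucket/model    # placeholder
```

There is no equivalent object for a KFP pipeline, which is what this thread is asking for.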

magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this issue Oct 22, 2023
HumairAK pushed a commit to red-hat-data-services/data-science-pipelines that referenced this issue Mar 11, 2024