Enable local test runner for Kubeflow Pipelines #1138

Closed
neuromage opened this issue Apr 11, 2019 · 18 comments · Fixed by #4983

Comments

@neuromage
Contributor

Today, there is no way to run a KFP pipeline on a local machine (short of running a mini Kubernetes cluster locally). This makes testing pipelines difficult for users, as they always have to upload to a cluster to run even small tests of a pipeline's correctness.

To solve this problem, it would be nice if KFP offered a way to run pipelines locally. Such a mode does not need full fidelity with what's possible in a cluster run with Argo, at least to begin with. A simple approach that uses docker run to execute each step sequentially, with a mounted local volume for passing parameters, would go a long way towards satisfying most local-run use cases.
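
A minimal sketch of the "docker run each step sequentially, sharing a mounted volume" idea, for illustration only. The `Step` class, `docker_command`, `run_sequentially`, the image name, and the `/data` mount point are all hypothetical names and are not part of KFP. The `runner` argument is injectable so the ordering logic can be checked without Docker installed.

```python
import subprocess
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class Step:
    """One pipeline step: a container image plus the command to run in it."""
    name: str
    image: str
    command: List[str] = field(default_factory=list)


def docker_command(step: Step, host_dir: str, mount_point: str = "/data") -> List[str]:
    """Build the `docker run` argv for a step.

    host_dir should be an absolute path so Docker treats it as a bind mount;
    every step sees the same directory, which is how values are passed along.
    """
    return [
        "docker", "run", "--rm",
        "--name", step.name,
        "-v", f"{host_dir}:{mount_point}",
        step.image,
        *step.command,
    ]


def run_sequentially(
    steps: Sequence[Step],
    host_dir: str,
    runner: Callable[[List[str]], object] = lambda argv: subprocess.run(argv, check=True),
) -> None:
    """Run each step in order; the default runner raises on a non-zero exit."""
    for step in steps:
        runner(docker_command(step, host_dir))


# Hypothetical two-step pipeline: the first step writes a file, the second reads it.
steps = [
    Step("preprocess", "python:3.9", ["sh", "-c", "echo 42 > /data/count.txt"]),
    Step("train", "python:3.9", ["sh", "-c", "cat /data/count.txt"]),
]

# Print the commands instead of executing them, so this runs without Docker.
for step in steps:
    print(" ".join(docker_command(step, "/tmp/kfp-local")))
```

Calling `run_sequentially(steps, "/tmp/kfp-local")` with the default runner would actually invoke Docker. This sketch ignores dependencies between steps and simply runs them in list order.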

As a suggestion, we can modify the pipeline decorator to enable this behaviour when the user adds a keyword like test = True, i.e.

@dsl.pipeline(
   name = "..",
   test = True,
   ...
)
@issue-label-bot

Issue-Label Bot is automatically applying the label feature_request to this issue, with a confidence of 0.98. Please mark this comment with 👍 or 👎 to give our bot feedback!

@vicaire
Contributor

vicaire commented Apr 11, 2019

Some ideal requirements to consider:

  • The local executor should support all the same features as execution within a GKE cluster.
  • The local executor should not duplicate the code of the controllers used to execute the pipeline.
  • The local executor should execute the yaml, not Python, so that both yaml and Python can be tested.

@Ark-kun
Contributor

Ark-kun commented Apr 12, 2019

> The local executor should support all the same features as execution within a GKE cluster.

I guess only a proper local Kubernetes cluster can satisfy this requirement. The test environment would need Minikube installed.

@Ark-kun
Contributor

Ark-kun commented Apr 12, 2019

> As a suggestion, we can modify the pipeline decorator to enable this behaviour when the user adds a keyword like test = True, i.e.

I think it's better to have a function that does this instead of overloading the decorator:

kfp.run_pipeline_locally(my_pipeline, arguments={...})

If we want to differentiate, we can have

  • run_pipeline_locally
  • run_pipeline_using_docker
  • run_pipeline_on_kubernetes
  • run_pipeline_on_kfp

@vicaire
Contributor

vicaire commented Apr 12, 2019

BTW, what about providing users with a VM image that has all the tools installed (minikube, etc.) to make local execution easy?

@ucdmkt
Contributor

ucdmkt commented Apr 12, 2019

Thank you so much for tracking this issue.

Another thing to consider is that we may want a way to inject a stub for a component inside the pipeline, if that component makes calls to external services such as Dataflow.
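
One way to picture stub injection, as a rough sketch rather than anything KFP provides: the runner looks up each step by name and swaps in a stub when one is supplied. `run_with_stubs`, `launch_dataflow_job`, `count_rows`, and the `gs://bucket/in` path are hypothetical names invented for this example.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Component = Callable[..., Any]
# Each step is (name, implementation, keyword arguments).
StepSpec = Tuple[str, Component, Dict[str, Any]]


def run_with_stubs(
    steps: List[StepSpec],
    stubs: Optional[Dict[str, Component]] = None,
) -> Dict[str, Any]:
    """Run steps in order, replacing any step whose name appears in `stubs`."""
    stubs = stubs or {}
    results: Dict[str, Any] = {}
    for name, component, kwargs in steps:
        implementation = stubs.get(name, component)
        results[name] = implementation(**kwargs)
    return results


def launch_dataflow_job(input_path: str) -> str:
    """Stand-in for a component that would call an external service."""
    raise RuntimeError("would call the real external service")


def count_rows(rows: List[int]) -> int:
    """A purely local component that needs no stub."""
    return len(rows)


results = run_with_stubs(
    steps=[
        ("dataflow", launch_dataflow_job, {"input_path": "gs://bucket/in"}),
        ("count", count_rows, {"rows": [1, 2, 3]}),
    ],
    # The stub keeps the same signature as the component it replaces.
    stubs={"dataflow": lambda input_path: "stub-output-for:" + input_path},
)
print(results)
```

Keying stubs by step name keeps the pipeline definition unchanged between a local test run and a real run; only the `stubs` argument differs.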

@kevinbache
Contributor

related: #1104

@Ark-kun Ark-kun self-assigned this Oct 19, 2019
@neuromage
Contributor Author

Closing as infeasible/obsolete for now.

@neuromage
Contributor Author

@Ark-kun not sure if you intentionally meant to re-open this issue? Are you working on a local runner for KFP?

@Ark-kun
Contributor

Ark-kun commented May 28, 2020

> @Ark-kun not sure if you intentionally meant to re-open this issue? Are you working on a local runner for KFP?

I think that your idea is still pretty valid and useful. It would be nice to have that feature for component testing and experimentation. I've recently started creating components in my free time and testing them was not that easy without having a Kubernetes cluster.
Our SDK already runs some tests in a local environment, but it would be less hacky to just use docker.
So I just wanted to keep this feature request in mind.
The priority of this is not P0 or P1 of course.

What do you think?

@rmgogogo
Contributor

FYI, I'm thinking of bringing in a lightweight local runner as part of a new design. It may rely on the TFX SDK's Docker launcher. It's still in an early investigation phase, together with the IR work.

@numerology

FYI, for testing, TFX has tensorflow/tfx#1986, which is still a WIP.

That said, it's pretty different from local dev.

@rmgogogo
Contributor

(In the IR-based implementation, we may handle it with a Docker runner.)

@stale

stale bot commented Sep 20, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Sep 20, 2020
@Bobgy
Contributor

Bobgy commented Sep 23, 2020

/lifecycle frozen
we'd still want this open

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen and removed lifecycle/stale labels Sep 23, 2020
@NikeNano
Member

@Bobgy, are we interested in adding instructions on how to deploy locally to Minikube from the local code for development? It would not be a big change, but it would require the user to build all the images and add an overlay that uses those images instead of the officially released ones.

@Bobgy
Contributor

Bobgy commented Nov 15, 2020

@NikeNano I think that's a different topic. This issue is about creating a local runner that helps users experiment with KFP, rather than develop KFP itself.

@lynnmatrix
Member

lynnmatrix commented Jan 13, 2021

@Bobgy @Ark-kun @numerology I have created a PR (#4983) that tries to provide a local runner, which will run a Kubeflow pipeline in Docker or locally.
Following @Ark-kun's suggestion of a dedicated function, kfp.run_pipeline_func_locally(my_pipeline, arguments={...}) was picked.

google-oss-robot pushed a commit that referenced this issue Feb 24, 2021
…ixes #1138 (#4983)

* add local runner which will run ops in docker or locally

* use str.format rather than f-string

* add some brief doc string in local client

* comment the unittest about running op in docker, which is not supported in CI env for now

* Add some brief docstring about DAG used in local client

* make graph/reverse_graph of DAG as property to keep them in sync

* make some methods of LocalClient static

* remove circular reference in local client

* Encapsulate artifact storage root in the constructor of LocalClient

* Add Alpha notice for kfp.run_pipeline_func_locally

* Support list of local images in kfp.run_pipeline_func_locally

* make staticmethod to module level private method

* Trivial modification according to code review, some renaming or docstring

* local runner support components without '--' as argument prefix

* make output file of op in loop unique

* Local runner decides whether to run a component in docker or in a local process based on ExecutionMode
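
The commit notes above mention a DAG whose graph and reverse_graph are exposed as properties so they stay in sync. The following is a minimal sketch of that general idea, not the code from #4983: both views are derived on demand from a single edge set, and a topological order (Kahn's algorithm) gives a valid sequential execution order. The class layout and the method names `add_edge` and `topological_order` are assumptions made for this example.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


class DAG:
    """Directed acyclic graph of pipeline steps, stored as a single edge set."""

    def __init__(self, nodes: List[str]) -> None:
        self._nodes = list(nodes)
        self._edges: Set[Tuple[str, str]] = set()

    def add_edge(self, upstream: str, downstream: str) -> None:
        """Record that `downstream` depends on `upstream`."""
        if upstream not in self._nodes or downstream not in self._nodes:
            raise KeyError("both nodes must be declared before adding an edge")
        self._edges.add((upstream, downstream))

    @property
    def graph(self) -> Dict[str, List[str]]:
        """Map each node to its downstream nodes, rebuilt from the edge set."""
        result: Dict[str, List[str]] = {node: [] for node in self._nodes}
        for upstream, downstream in sorted(self._edges):
            result[upstream].append(downstream)
        return result

    @property
    def reverse_graph(self) -> Dict[str, List[str]]:
        """Map each node to its upstream nodes, rebuilt from the same edge set."""
        result: Dict[str, List[str]] = {node: [] for node in self._nodes}
        for upstream, downstream in sorted(self._edges):
            result[downstream].append(upstream)
        return result

    def topological_order(self) -> List[str]:
        """Return nodes so that every node comes after all of its upstreams."""
        graph = self.graph
        indegree = {node: len(ups) for node, ups in self.reverse_graph.items()}
        ready = deque(node for node in self._nodes if indegree[node] == 0)
        order: List[str] = []
        while ready:
            node = ready.popleft()
            order.append(node)
            for downstream in graph[node]:
                indegree[downstream] -= 1
                if indegree[downstream] == 0:
                    ready.append(downstream)
        if len(order) != len(self._nodes):
            raise ValueError("cycle detected; not a DAG")
        return order


dag = DAG(["train", "preprocess", "evaluate"])
dag.add_edge("preprocess", "train")
dag.add_edge("train", "evaluate")
print(dag.topological_order())
```

Because neither view is stored separately, there is no way for `graph` and `reverse_graph` to drift apart after an edge is added; the cost is rebuilding the dictionaries on each access, which is negligible for pipelines with a handful of steps.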