Support for Kubernetes (with Argo) #50

Closed
nlaille opened this issue Dec 12, 2019 · 17 comments · Fixed by #644
Labels
enhancement New feature or request

Comments

@nlaille

nlaille commented Dec 12, 2019

Another implementation of #16

The idea is to provide Metaflow with a native Kubernetes implementation using Argo (https://github.com/argoproj/argo) for the workflow part.

@savingoyal savingoyal added the enhancement New feature or request label Dec 12, 2019
@valayDave
Collaborator

@nlaille: I was recently exploring building plugin support for Argo, but the DAG templating seemed a little too complicated to fit into Metaflow in a short amount of time.

So meanwhile I have built a plugin to support Kubernetes with isolated, self-packaged deployments of the runtime itself to the cluster. It requires a service-based metadata provider. It deploys the entire workflow's runtime as a job that orchestrates the different containerized jobs. So deploying your entire flow with its container orchestration on Kubernetes becomes one command: python myflow.py --with kube:cpu=2,memory=2000,image=python:3.7 kube-deploy run --custom_param_1 hello --custom_param_2 100

It works using the @kube decorator and supports the same features as the batch plugin. You can even use --with kube:cpu=2,memory=2000,image=tensorflow/tensorflow:latest-py3.
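For readers unfamiliar with the `--with name:key=value,...` form used above, here is a minimal sketch of how such a string breaks down into a decorator name and its attributes. It is only an illustration: `parse_with_spec` is a hypothetical helper, not Metaflow's or the plugin's actual parser, and it keeps every value as a string.

```python
def parse_with_spec(spec):
    """Split a 'name:key=value,...' string into (name, attributes).

    Hypothetical helper for illustration only; not Metaflow's parser.
    """
    # Split on the first colon only, because values such as
    # image=python:3.7 contain colons of their own.
    name, _, attr_text = spec.partition(":")
    attrs = {}
    if attr_text:
        for pair in attr_text.split(","):
            # Split on the first '=' so the value is kept intact.
            key, sep, value = pair.partition("=")
            if not sep:
                raise ValueError(f"expected key=value, got {pair!r}")
            attrs[key.strip()] = value.strip()
    return name, attrs


if __name__ == "__main__":
    print(parse_with_spec("kube:cpu=2,memory=2000,image=python:3.7"))
```

Splitting on the first colon and the first equals sign is what lets an image reference such as tensorflow/tensorflow:latest-py3 pass through unchanged.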

https://github.com/valayDave/metaflow/tree/kube_cpu_stable

Documentation is in the links within that repo. The plugin is under constant development, so there will be more features soon and hopefully even support for Argo. But it is currently tied to S3 as its datastore and doesn't support GPUs.

Hope this helps!

@repocho
Contributor

repocho commented Sep 10, 2020

Is there a plan to include it in Metaflow official repository?

@savingoyal savingoyal linked a pull request Feb 9, 2021 that will close this issue
6 tasks
@alexec

alexec commented Aug 3, 2021

Argo team would be happy to help make this a reality.

@savingoyal
Collaborator

@alexec That would be wonderful! Let me email you to set up some time.

@alexec

alexec commented Aug 3, 2021

You can book 30m: https://calendly.com/argoproj/30m

@savingoyal savingoyal linked a pull request Aug 3, 2021 that will close this issue
5 tasks
@repocho
Contributor

repocho commented Aug 4, 2021

Ohhh my god. Thanks for this collaboration, @alexec and @savingoyal!!!

@terrytangyuan

Is there any update on this? BTW, the URL for Argo Workflows has changed and the old URL does not redirect correctly. Here's the new location: https://github.com/argoproj/argo-workflows

@alexec

alexec commented Oct 6, 2021

This is an extremely popular issue. It'd be great to see it implemented.

@savingoyal
Collaborator

If you would like to try out and give feedback on our Kubernetes integration, please reach out at http://slack.outerbounds.co

@terrytangyuan

@savingoyal It looks like the current support for K8s is added via the K8s Python API, and I noticed the following in the docs:

Scheduling Metaflow flows on Argo
This functionality is in-flight!

How's the integration with Argo Workflows coming along?

@terrytangyuan

FYI we now have a low-level Python SDK officially maintained by the Argo Workflows team. It should be much easier to use it to integrate Metaflow with Argo Workflows. https://github.com/argoproj/argo-workflows/tree/master/sdks/python

@savingoyal
Collaborator

#992 is now available for review

@savingoyal
Collaborator

#992 has been merged. The next Metaflow release will address this issue.

@alexec

alexec commented Apr 22, 2022

Yay!

@terrytangyuan

Amazing work! Looking forward to the new release!

@repocho
Contributor

repocho commented Apr 22, 2022

Thank you for making this possible!! 🥇

@savingoyal
Collaborator

Support for Kubernetes and Argo Workflows is now available in Metaflow 2.6.0 - https://blog.argoproj.io/human-centric-data-science-on-kubernetes-with-metaflow-7f60aad34cba

iamsgarg-ob added a commit to iamsgarg-ob/metaflow that referenced this issue Jul 29, 2024
- add a terminationMessagePolicy to FallbackToLogsOnError
savingoyal pushed a commit that referenced this issue Jul 29, 2024
- add a terminationMessagePolicy to FallbackToLogsOnError
6 participants