Development friendly Kubeflow experience #5013
Comments
@JoshZastrow Just to be clear, are you talking about Kubeflow as a whole or Pipelines specifically? With regard to Pipelines, it is possible to create Python function-based components rather than building an image for each one (you still need a base image that contains the necessary dependencies, PyTorch for example). https://www.kubeflow.org/docs/pipelines/sdk/python-function-components/
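For readers following along, here is a minimal sketch of the function-based approach described in the comment above. It assumes the KFP SDK v1, where `kfp.components.create_component_from_func` wraps a plain Python function. The names `min_max_scale` and `build_component`, and the `python:3.9` base image, are hypothetical and chosen only for illustration.

```python
def min_max_scale(value: float, lower: float, upper: float) -> float:
    """Scale `value` into [0, 1]; a hypothetical feature-scaling step."""
    # Imports live inside the function body: only the function's source is
    # shipped to the container, so module-level imports would not be available.
    import math

    if math.isclose(upper, lower):
        raise ValueError("upper and lower must differ")
    return (value - lower) / (upper - lower)


def build_component(base_image: str = "python:3.9"):
    """Wrap the function as a component (assumes the KFP SDK v1 is installed)."""
    # Imported lazily so the plain function above stays testable without kfp.
    from kfp.components import create_component_from_func

    # The base image is an assumption; it must already contain every package
    # that the function imports.
    return create_component_from_func(min_max_scale, base_image=base_image)


if __name__ == "__main__":
    # The function can be exercised locally before it is ever containerized.
    print(min_max_scale(5.0, 0.0, 10.0))
```

Because the function body is plain Python, it can be unit-tested locally; only the wrapping step needs the SDK.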
https://github.com/kubeflow-kale/kale might be useful.
Hi @davidspek, ah yes, I should have been more specific: I am talking about Kubeflow Pipelines. It seems that even with Python function-based components, anything that gets imported needs to exist on the image. For fast development, perhaps the way to go is to make every single function in the application a component. This is just a little hard to adopt for an existing Python project that already has its own packages, modules, functions and classes. Example:
The pipeline would be built from components, but there's application code in
@munagekar ah yeah, I like Kale! This could be a very cool tool (and I'm a big notebook user myself), but the devs on my team actually prefer to develop the pipeline in a
/area pipelines
This is exactly what we implemented in my organization. We use git tags instead of latest.
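As an illustration of the git-tag approach mentioned in the comment above, here is a rough sketch of deriving an image reference from the current tag. This is an assumption about how such a setup might look, not the commenter's actual implementation; `current_git_tag`, `image_reference`, and the registry path are hypothetical.

```python
import subprocess


def current_git_tag(repo_dir: str = ".") -> str:
    """Return the output of `git describe --tags --always` for `repo_dir`."""
    result = subprocess.run(
        ["git", "describe", "--tags", "--always"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def image_reference(repository: str, tag: str) -> str:
    """Build `repository:tag`, refusing the mutable `latest` tag."""
    if not tag or tag == "latest":
        raise ValueError("use an immutable tag, such as a git tag, not 'latest'")
    return f"{repository}:{tag}"


if __name__ == "__main__":
    # Hypothetical registry path; requires running inside a git checkout.
    print(image_reference("registry.example.com/team/train", current_git_tag()))
```

Pinning each component to a tag makes a pipeline run reproducible, at the cost of a rebuild for every code change.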
Some documentation you can refer to: https://cloud.google.com/solutions/machine-learning/architecture-for-mlops-using-tfx-kubeflow-pipelines-and-cloud-build There's a CI/CD pipeline needed to deploy the CT (continuous training) pipeline that runs in KFP.
Looks like this open PR would help with this: #4983
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
/kind feature
Why you need this feature:
Say I have a local package containing my application logic (e.g. cleaning, feature generation, ML model training, etc.). This local package contains the modules and functions used in my component.
I want to make changes to the application logic (e.g. change a feature scaling method), then run my pipeline and 1) make sure the pipeline works or 2) see an improvement in my offline metrics.
My component image needs to contain all of its dependencies, which seems to mean that if I want to run my Kubeflow pipeline with new code, I need to rebuild and submit an image each time. This is a pretty slow process, and it discourages us from making smaller components (it is easier to develop pipelines in Python and run them as one bigger component via a CLI command).
One solution I'm imagining is a local Kubeflow instance whose component images point to locally built Docker images with the local application code mounted, which would give a much faster iteration cycle.
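To make the mounted-code idea concrete, here is a rough sketch that only builds the `docker run` command for such a setup. It covers the local bind mount and editable install, not a Kubeflow feature; `dev_run_command`, the image name, the module name, and the `/workspace` mount point are all hypothetical.

```python
import shlex
from pathlib import Path


def dev_run_command(image, package_dir, entrypoint_module, mount_point="/workspace"):
    """Return a `docker run` argv that bind-mounts local code into the container."""
    host_path = Path(package_dir).resolve()
    # Install the mounted package in editable mode, then run the entrypoint,
    # so code edits on the host are picked up without rebuilding the image.
    inner = (
        f"pip install -e {shlex.quote(mount_point)} "
        f"&& python -m {shlex.quote(entrypoint_module)}"
    )
    return [
        "docker", "run", "--rm",
        "-v", f"{host_path}:{mount_point}",  # bind mount: host path -> container path
        image,
        "sh", "-c", inner,
    ]


if __name__ == "__main__":
    # Hypothetical image and module names, for illustration only.
    print(shlex.join(dev_run_command("my-component:dev", ".", "my_package.train")))
```

The image would still need the third-party dependencies baked in; only the application package itself is mounted and reinstalled on each run.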
Is there a better way to develop faster with Kubeflow? It says it's experimentation friendly, but I haven't felt that from working with Kubeflow so far (it is nice that it has experiment management/tracking in the UI though!). I don't feel like I can swap my current experimentation workflow out for Kubeflow.
Maybe a user guide on developing locally could be a good solution? Something equivalent to
pip install -e .
for Kubeflow components would be great!