
Enable flexible and custom visualizations for artifacts produced by Kubeflow Pipelines #1472

Closed
neuromage opened this issue Jun 7, 2019 · 18 comments

@neuromage
Contributor

We'd like to enable users to easily add visualizations of artifacts in the UI. Today, the KFP UI only supports a few basic visualizations, and custom visualizations (such as those produced by the TFMA/TFDV libraries) require a lot of custom work by the user to produce an HTML file that can be displayed in the KFP UI. This issue tracks the work needed to greatly simplify what the user has to do to achieve the same effect.
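For context, here is a minimal sketch of the workaround this issue aims to replace: today the user's component has to render the HTML itself (the render_report_html helper and bucket path below are made up for illustration) and register it as a web-app output viewer by writing /mlpipeline-ui-metadata.json.

```python
import json

# Hypothetical helper standing in for the custom glue code each user writes
# today (e.g. rendering a TFDV/TFMA report to HTML).
def render_report_html(artifact_uri: str) -> str:
    return "<html><body>Report for {}</body></html>".format(artifact_uri)

html = render_report_html("gs://my-bucket/artifacts/eval")  # placeholder artifact location

metadata = {
    "outputs": [{
        "type": "web-app",    # one of the output viewer types the KFP UI understands
        "storage": "inline",  # embed the HTML directly rather than pointing at a file
        "source": html,
    }]
}

# The KFP UI reads this file from the step's output to decide what to render.
with open("/mlpipeline-ui-metadata.json", "w") as f:
    json.dump(metadata, f)
```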

@neuromage
Contributor Author

/assign @ajchili

@k8s-ci-robot
Contributor

@neuromage: GitHub didn't allow me to assign the following users: ajchili.

Note that only kubeflow members and repo collaborators can be assigned and that issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @ajchili

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rummens

rummens commented Jun 8, 2019

You are talking about custom visualizations. Does this mean the TFX components already have support for the basic visualization options of Kubeflow?

For now I would be happy to have, for example, tables of the input data or a TensorBoard for the training. This should only require writing that data to the metrics file for Kubeflow.

I am happy to assist on this but don't want to reinvent the wheel. So if any work has already been done on this, maybe we can align first.
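For reference, the basic table and TensorBoard viewers mentioned above are driven by /mlpipeline-ui-metadata.json (rather than the scalar metrics file). A minimal sketch of what a component can already write today; the paths and header names are placeholders:

```python
import json

metadata = {
    "outputs": [
        {
            "type": "table",
            "format": "csv",
            "header": ["feature", "value"],                     # placeholder column names
            "source": "gs://my-bucket/preview/input_head.csv",  # placeholder CSV written by the step
        },
        {
            "type": "tensorboard",
            "source": "gs://my-bucket/training/logs",           # placeholder TensorBoard log dir
        },
    ]
}

with open("/mlpipeline-ui-metadata.json", "w") as f:
    json.dump(metadata, f)
```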

ajchili added a commit to ajchili/pipelines that referenced this issue Jun 17, 2019
Created local demonstration of kubeflow#1472 using a new viewer component.
ajchili added a commit to ajchili/pipelines that referenced this issue Jun 18, 2019
Created local demonstration of kubeflow#1472 using a new viewer component.
ajchili added a commit to ajchili/pipelines that referenced this issue Jun 18, 2019
Created local demonstration of kubeflow#1472 using a new viewer component.
@rummens

rummens commented Jul 3, 2019

Any updates on this? @ajchili does your local demonstrator work?

@ajchili
Member

ajchili commented Jul 8, 2019

@rummens I am still working on a demo for this functionality. Unfortunately, I do not have one working yet; I am hoping to have a working demo by the end of this week.

@rummens

rummens commented Jul 9, 2019

No problem and thanks for the update.

Which components will your development support? All TFX components, or just specific ones? The most interesting ones are Transform, Trainer, and Model Validation.

@ajchili
Member

ajchili commented Jul 9, 2019

I am starting with TFDV and an ROC curve for the demo. Afterwards, I am hoping to have support for any Python visualization library.
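As an illustration of the kind of arbitrary Python visualization code this is meant to support (not the actual implementation), an ROC curve takes only a few lines with scikit-learn and matplotlib:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve

# Toy labels and scores standing in for whatever an evaluation step produced.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, _ = roc_curve(y_true, y_score)
plt.plot(fpr, tpr, label="AUC = %.2f" % auc(fpr, tpr))
plt.plot([0, 1], [0, 1], linestyle="--")  # chance line
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```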

@rummens

rummens commented Jul 9, 2019

Sounds awesome, can’t wait to test ;-)

Is TensorBoard for the Trainer supported yet? KFP already has it integrated; I am just not sure whether the user has to change the Trainer component in order to use it.

@neuromage
Contributor Author

@jingzhang36 @rileyjbauer can we enable the TensorBoard link outside of ui_metadata.json? I'm thinking it would show up when the artifact being passed around is TFX's ModelExportPath, which should contain the URI of the logs for TensorBoard.

@jingzhang36
Contributor

@neuromage Riley would know best on this, but I feel like we can enable the TensorBoard link outside of /mlpipeline-ui-metadata.json if we want to. BTW, could we have the URI of the logs (contained in ModelExportPath) copied into our /mlpipeline-ui-metadata.json?
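A rough sketch of that suggestion, assuming the step has the ModelExportPath URI in hand (the path and the logs subdirectory below are assumptions): the component could mirror the URI into the existing metadata file so the current TensorBoard viewer picks it up.

```python
import json
import os

# Assumed TFX ModelExportPath URI passed into the step as an artifact.
model_export_path = "gs://my-bucket/tfx/Trainer/model_export"

metadata = {
    "outputs": [{
        "type": "tensorboard",
        # Assumption: the TensorBoard logs live under a "logs" subdirectory of
        # the export path; the real layout depends on the Trainer configuration.
        "source": os.path.join(model_export_path, "logs"),
    }]
}

with open("/mlpipeline-ui-metadata.json", "w") as f:
    json.dump(metadata, f)
```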

@ajchili
Member

ajchili commented Jul 13, 2019

@rummens I have a basic e2e demo running within my cluster. It is not user-friendly at the moment and requires that you edit the ml-pipeline and ml-pipeline-ui deployments and create a new service and deployment. If you would like to test it early, here are the steps to get started. If you run into any issues with these steps, please let me know!

  1. Fork/clone ajchili/pipelines
  2. Checkout e2e-demo with git checkout e2e-demo
  3. Build the API server Docker image
    • cd backend
    • docker build -t gcr.io/PROJECT/api:vVERSION -f backend/Dockerfile . --build-arg use_remote_build=true --build-arg google_application_credentials="${GCP_CREDENTIALS}"
      • Replace PROJECT with your project id and VERSION with the version id
      • Keep in mind that you must provide GCP credentials beforehand, e.g. with GCP_CREDENTIALS="$(gsutil cat gs://PROJECT/credentials.json)"
        • This will depend on where your service account credentials are stored and may require some additional steps
  4. Push the API server Docker image to gcr.io with docker push gcr.io/PROJECT/api:vVERSION
  5. Update your ml-pipeline deployment with kubectl edit deploy -n kubeflow ml-pipeline and replace the image with gcr.io/PROJECT/api:vVERSION
  6. Build the frontend Docker image
    • cd ../frontend
    • docker build -f ./Dockerfile .. -t gcr.io/PROJECT/frontend:vVERSION
      • Replace PROJECT with your project id and VERSION with the version id
  7. Push the frontend Docker image to gcr.io with docker push gcr.io/PROJECT/frontend:vVERSION
  8. Update your ml-pipeline-ui deployment with kubectl edit deploy -n kubeflow ml-pipeline-ui and replace the image with gcr.io/PROJECT/frontend:vVERSION
  9. Deploy the Python visualization deployment
    • cd ../backend/src/apiserver/visualizations
    • kubectl apply -f ./deployment.yaml
  10. Deploy the Python visualization service
    • kubectl apply -f ./service.yaml
  11. Open pipelines in your browser and navigate to a run that has a ROC Curve component in it
  12. Click the ROC Curve component to view the new method of visualization

Again, it is not in a finished state and should be used with that understanding.

@rummens

rummens commented Jul 17, 2019

Thanks very much for the update; I am trying to find some time to test it. So far I can report that the images build fine, but that is all I could get done so far. Sorry for the delay, but I promise to try it out soon.

@ajchili
Member

ajchili commented Jul 17, 2019

@rummens there is no rush! Thanks for taking the time to go through this and test it out. As an FYI, I have added basic support for TFDV visualizations. You will need to be at commit 808afa451362363041fac652093cabed08262040. Once that is done, follow these steps to view the visualization (a rough sketch of what a TFDV visualization does with the input CSV follows the steps). As a side note, there is an unresolved issue with timeouts which may cause the TFDV visualization to fail; if that happens, please try to rerun the visualization.

Steps:

  1. Get project to commit 808afa451362363041fac652093cabed08262040.
  2. Update frontend image and python service image.
    1. This will follow the same steps from my comment above.
  3. Open a pipeline run that has a component with output.
    1. This is important as generating visualizations from the UI is currently only possible with a component that has output.
  4. Click on the component that has output.
  5. Click the "Generate Visualization" button.
  6. In the first prompt, provide the input data path for TFDV.
    1. This will be a CSV file and it MUST be located within GCS.
    2. Go here for more information about TFDV visualizations.
  7. In the second prompt, provide the following string --type tfdv.
  8. Wait for the visualization to generate.
    1. As mentioned above, it is possible that a timeout can occur due to the length of time required to generate a TFDV visualization.
    2. To determine if a timeout has occurred, keep your browser's console open and check whether a console.error message has been logged indicating a 504 error.
    3. If this has happened, the visualization timed out. I currently do not have a solution for this and am investigating what can be done to mitigate it. For now, you must attempt to regenerate the visualization.
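For anyone curious, a rough sketch of what the --type tfdv visualization does with the CSV from step 6, using the TFDV library directly (the GCS path is a placeholder, and the actual server-side code may differ):

```python
import tensorflow_data_validation as tfdv

# Placeholder for the GCS CSV path supplied in the first prompt (step 6).
DATA_LOCATION = "gs://my-bucket/data/train.csv"

# Compute dataset statistics from the CSV, then render the facets-based
# statistics view; the rendering call targets a notebook-style environment.
stats = tfdv.generate_statistics_from_csv(data_location=DATA_LOCATION)
tfdv.visualize_statistics(stats)
```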

@ajchili
Member

ajchili commented Jul 18, 2019

A follow-up to the comment above: it appears that the use of Ambassador could be leading to unexpected timeouts when making API requests from the frontend. One way to circumvent this issue is to follow these steps to set up a new cluster with a lightweight version of Pipelines.

@rummens

rummens commented Jul 24, 2019

Are you running against 0.5 or 0.6? In 0.6, Ambassador seems to have been replaced by Istio.

@ajchili
Member

ajchili commented Jul 24, 2019

@rummens previously I was running 0.5, but I recently switched to 0.6 due to the move away from Ambassador. I can also confirm that switching to 0.6 resolved the timeout issue. Upgrading to 0.6 is not required, but it removes the timeout limitation.

@ajchili
Member

ajchili commented Aug 30, 2019

/close
Python-based visualizations have been merged into Kubeflow Pipelines and will be part of the v0.1.28 release. Information about Python-based visualizations can be found here.

@k8s-ci-robot
Contributor

@ajchili: Closing this issue.

In response to this:

/close
Python-based visualizations have been merged into Kubeflow Pipelines and will be part of the v0.1.28 release. Information about Python-based visualizations can be found here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this issue Oct 22, 2023
* Swap boto3 in for minio in storage initializer
