-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Sample] CI Sample: mnist #3013
Conversation
Hi @dldaisy. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
@dldaisy: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@@ -0,0 +1,25 @@ | |||
# Mnist CI Pipeline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first occurrence of a terminology better be fully spelled out. Continuous Integration (CI)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
# Mnist CI Pipeline | ||
|
||
## What you can learn in this sample | ||
* CI process of a simple but general ML pipeline. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also mention that this CI is using cloud build service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
Thanks for the sample! A couple of minor comments. Otherwise, /lgtm |
/ok-to-test |
|
||
This sample uses cloud build to implement the continuous integration process of a basic machine learning pipeline that trains and visualizes model in tensorboard. Once all set up, you can push your code to github repo, then the build process in cloud build will be triggered automatically, then a run will be created in kubeflow pipeline. You can view your pipeline and the run in kubeflow pipelines. | ||
|
||
We use *Kubeflow Pipeline(KFP) SDK** to interact with kubeflow pipeline to create a new version and a run in this sample. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use *Kubeflow Pipeline(KFP) SDK** to interact with kubeflow pipeline to create a new version and a run in this sample. | |
We use *Kubeflow Pipeline(KFP) SDK* to interact with kubeflow pipeline to create a new version and a run in this sample. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Will modify.
|
||
|
||
## What needs to be done before run | ||
* Create a secret following the troubleshooting parts in [https://github.com/kubeflow/pipelines/tree/master/manifests/kustomize]() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe put the link in the brackets to make it work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dldaisy ! Left some comment
@@ -0,0 +1,31 @@ | |||
# Mnist Continuous Integration(CI) Pipeline | |||
|
|||
## ## Overview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
## ## Overview | |
## Overview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
train_step = train_op(storage_bucket=storage_bucket).apply(use_gcp_secret('user-gcp-sa')) | ||
|
||
visualize_op = components.load_component_from_file('./tensorboard/component.yaml') | ||
visualize_step = visualize_op(logdir='%s' % train_step.outputs['logdir']).apply(use_gcp_secret('user-gcp-sa')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: With hosted pipeline beta and workload identity, user-gcp-sa
will be deprecated. Do you mind adding a comment (and perhaps also a TODO) to track this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course. I'll leave a comment in readme for this.
|
||
parser = argparse.ArgumentParser() | ||
parser.add_argument('--logdir', type=str) | ||
parser.add_argument('--output_path', type=str, default='/') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick Q: what if a user specify --output_path
. Then in line 17 it seems to me that mlpipeline-ui-metadata.json will no longer at /mlpipeline-ui-metadata.json
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct. I'll make it configurable.
/approve |
'source': args.logdir, | ||
}] | ||
} | ||
with open(args.output_path+'mlpipeline-ui-metadata.json', 'w') as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make this path configurable (add a dedicated command-line argument).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks!
'source': args.logdir, | ||
}] | ||
} | ||
with open(args.output_path+'mlpipeline-ui-metadata.json', 'w') as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might need to create the directory fist.
from pathlib import Path
Path(args.ui_metadata_path).parent.mkdir(parents=True, exist_ok=True)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thank you!
command: ['python', '/tensorboard.py'] | ||
args: ['--logdir', {inputValue: logdir}] | ||
fileOutputs: | ||
MLPipeline UI metadata: /mlpipeline-ui-metadata.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to have all paths configurable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
command: ['python', '/mnist.py'] | ||
args: ['--storage_bucket', {inputValue: storage_bucket}] | ||
fileOutputs: | ||
logdir: /logdir.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to have all paths configurable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jingzhang36, numerology The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test kubeflow-pipeline-e2e-test |
* mnist sample init * remove comma in component.yaml * image cannot use placeholder * fileOutputs instead of file_outputs * ui metadata * input reference * rm tensorboard_image input * train op with yaml * train component.yaml * fix typo * sed image name * sed quote * latest instead of commit sha * typo and logic sequence of push and execute * latest to commit_sha * rm debug print * more instructions * refined readme.md * fix bugs in readme.md * readme user-gcp-sa * configurable output path
#2 piece for CI samples:
<To make the repo more easily reviewable, pr 2784(#2784) will be broken into smaller prs>
This sample is the mnist sample for versioned pipeline CI. Data scientists can make the least effort to migrate their original code to implement KFP CI by wrapping the whole training code in one step and other adds-on like visualizations in other steps.
This change is