Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Sample] CI Sample: mnist #3013

Merged
merged 21 commits into from
Feb 24, 2020
Merged

Conversation

dldaisy
Copy link
Contributor

@dldaisy dldaisy commented Feb 7, 2020

#2 piece for CI samples:
<To make the repo more easily reviewable, pr 2784(#2784) will be broken into smaller prs>
This sample is the mnist sample for versioned pipeline CI. Data scientists can make the least effort to migrate their original code to implement KFP CI by wrapping the whole training code in one step and other adds-on like visualizations in other steps.

  • Use cloudbuild for CI process
  • Use KFP SDK to connect to KFP API server

This change is Reviewable

@k8s-ci-robot
Copy link
Contributor

Hi @dldaisy. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dldaisy
Copy link
Contributor Author

dldaisy commented Feb 7, 2020

@numerology @jingzhang36

@dldaisy
Copy link
Contributor Author

dldaisy commented Feb 8, 2020

/retest

@k8s-ci-robot
Copy link
Contributor

@dldaisy: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -0,0 +1,25 @@
# Mnist CI Pipeline
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first occurrence of a terminology better be fully spelled out. Continuous Integration (CI)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

# Mnist CI Pipeline

## What you can learn in this sample
* CI process of a simple but general ML pipeline.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also mention that this CI is using cloud build service.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

@jingzhang36
Copy link
Contributor

Thanks for the sample! A couple of minor comments. Otherwise, /lgtm

@numerology
Copy link

/ok-to-test


This sample uses cloud build to implement the continuous integration process of a basic machine learning pipeline that trains and visualizes model in tensorboard. Once all set up, you can push your code to github repo, then the build process in cloud build will be triggered automatically, then a run will be created in kubeflow pipeline. You can view your pipeline and the run in kubeflow pipelines.

We use *Kubeflow Pipeline(KFP) SDK** to interact with kubeflow pipeline to create a new version and a run in this sample.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
We use *Kubeflow Pipeline(KFP) SDK** to interact with kubeflow pipeline to create a new version and a run in this sample.
We use *Kubeflow Pipeline(KFP) SDK* to interact with kubeflow pipeline to create a new version and a run in this sample.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will modify.



## What needs to be done before run
* Create a secret following the troubleshooting parts in [https://github.com/kubeflow/pipelines/tree/master/manifests/kustomize]()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put the link in the brackets to make it work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Thanks.

Copy link

@numerology numerology left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dldaisy ! Left some comment

@@ -0,0 +1,31 @@
# Mnist Continuous Integration(CI) Pipeline

## ## Overview

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
## ## Overview
## Overview

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

train_step = train_op(storage_bucket=storage_bucket).apply(use_gcp_secret('user-gcp-sa'))

visualize_op = components.load_component_from_file('./tensorboard/component.yaml')
visualize_step = visualize_op(logdir='%s' % train_step.outputs['logdir']).apply(use_gcp_secret('user-gcp-sa'))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: With hosted pipeline beta and workload identity, user-gcp-sa will be deprecated. Do you mind adding a comment (and perhaps also a TODO) to track this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course. I'll leave a comment in readme for this.


parser = argparse.ArgumentParser()
parser.add_argument('--logdir', type=str)
parser.add_argument('--output_path', type=str, default='/')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick Q: what if a user specify --output_path. Then in line 17 it seems to me that mlpipeline-ui-metadata.json will no longer at /mlpipeline-ui-metadata.json

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. I'll make it configurable.

@numerology
Copy link

/approve
LGTM modulo some nit issues.
@jingzhang36 can LGTM in case I'm asleep. Thanks! @dldaisy

'source': args.logdir,
}]
}
with open(args.output_path+'mlpipeline-ui-metadata.json', 'w') as f:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this path configurable (add a dedicated command-line argument).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks!

'source': args.logdir,
}]
}
with open(args.output_path+'mlpipeline-ui-metadata.json', 'w') as f:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might need to create the directory fist.

from pathlib import Path
Path(args.ui_metadata_path).parent.mkdir(parents=True, exist_ok=True)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thank you!

command: ['python', '/tensorboard.py']
args: ['--logdir', {inputValue: logdir}]
fileOutputs:
MLPipeline UI metadata: /mlpipeline-ui-metadata.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to have all paths configurable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

command: ['python', '/mnist.py']
args: ['--storage_bucket', {inputValue: storage_bucket}]
fileOutputs:
logdir: /logdir.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to have all paths configurable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@jingzhang36
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jingzhang36, numerology

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jingzhang36
Copy link
Contributor

/test kubeflow-pipeline-e2e-test

@k8s-ci-robot k8s-ci-robot merged commit 7905856 into kubeflow:master Feb 24, 2020
Jeffwan pushed a commit to Jeffwan/pipelines that referenced this pull request Dec 9, 2020
* mnist sample init

* remove comma in component.yaml

* image cannot use placeholder

* fileOutputs instead of file_outputs

* ui metadata

* input reference

* rm tensorboard_image input

* train op with yaml

* train component.yaml

* fix typo

* sed image name

* sed quote

* latest instead of commit sha

* typo and logic sequence of push and execute

* latest to commit_sha

* rm debug print

* more instructions

* refined readme.md

* fix bugs in readme.md

* readme user-gcp-sa

* configurable output path
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants