[Sample] Demonstrate Continuous Integration #2784

Closed
wants to merge 44 commits

Conversation


@dldaisy commented Dec 30, 2019

Changes:

  • Several samples on CI with versioned pipelines, including:
    • helloworld-ci-sample, using cloudbuild for CI and curl for creating the pipeline version and run
    • jenkins-ci-sample, using jenkins for CI and curl for creating the pipeline version and run
    • mnist-ci-sample, using cloudbuild for CI and the sdk client for creating the pipeline version and run
    • kaggle-ci-sample, using cloudbuild for CI and the sdk client for creating the pipeline version and run; uses the kaggle python API to download and submit data
  • Test of sdk client creating a batch of pipelines and versions



## Overview

This sample use cloud build to implement the continuous integration process of a simple pipeline that outputs "hello world" to the console. Once all set up, you can push your code to github repo, then the build process in cloud build will be triggered automatically, then a run will be created in kubeflow pipeline. You can view your pipeline and the run in kubeflow pipelines.


Suggested change
This sample use cloud build to implement the continuous integration process of a simple pipeline that outputs "hello world" to the console. Once all set up, you can push your code to github repo, then the build process in cloud build will be triggered automatically, then a run will be created in kubeflow pipeline. You can view your pipeline and the run in kubeflow pipelines.
This sample uses cloud build to implement the continuous integration process of a simple pipeline that outputs "hello world" to the console. Once all set up, you can push your code to github repo, then the build process in cloud build will be triggered automatically, then a run will be created in kubeflow pipeline. You can view your pipeline and the run in kubeflow pipelines.

Contributor Author

thanks
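As context for the overview above, here is a minimal sketch of the final CI step, assuming Cloud Build has already compiled the pipeline to pipeline.zip. It uses the plain package-upload path of the kfp SDK rather than the versioned-pipeline API these samples exercise, and the experiment and run names are illustrative.

import kfp

# Inside the cluster, kfp.Client() discovers the ml-pipeline host automatically.
client = kfp.Client()

# Create an experiment and start a run from the compiled package.
experiment = client.create_experiment('helloworld-ci')
run = client.run_pipeline(experiment.id, 'helloworld-ci-run', 'pipeline.zip')
print('Created run {}'.format(run.id))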

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--commit_id', help='Commit Id', type=str)


Maybe elaborate more in the help field?

Contributor Author

ok
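For instance, one possible elaboration (wording hypothetical):

parser.add_argument(
    '--commit_id',
    help='Commit SHA of the triggering push; used to tag and locate the '
         'images built by Cloud Build for this pipeline.',
    type=str)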

train = dsl.ContainerOp(
    name='mnist train',
    image=os.path.join(gcr_address, 'mnist_train:' + args.commit_id)
).apply(use_gcp_secret('user-gcp-sa'))


Is the user-gcp-sa secret still valid under a workload-identity-based deployment? @Bobgy
I remember the usage of user-gcp-sa in samples has been cleaned up.

Contributor

No, it won't be used.
Recommend removing its usage and putting a link to https://www.kubeflow.org/docs/gke/authentication-pipelines/ nearby about how to authenticate to GCP.

Contributor Author

Will do.

Contributor Author

Does it also mean that we don't need to create a 'user-gcp-sa' secret volume mounted on the cluster anymore?

Contributor

No, we won't need to. Use workload identity if we care about security, or use a full-scope cluster for convenience.
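A minimal sketch of the op with the secret dropped, assuming the cluster is set up per https://www.kubeflow.org/docs/gke/authentication-pipelines/ (the GCR address and commit id below are illustrative placeholders):

import os
import kfp.dsl as dsl

# With workload identity (or a full-scope cluster), the pod authenticates as
# the Google service account bound to its Kubernetes service account, so no
# 'user-gcp-sa' secret volume is needed.
gcr_address = 'gcr.io/my-project'  # illustrative
commit_id = 'abc123'               # illustrative
train = dsl.ContainerOp(
    name='mnist train',
    image=os.path.join(gcr_address, 'mnist_train:' + commit_id)
)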

@jingzhang36

/assign @jingzhang36
/assign @gaoning777
/assign @numerology

parser = argparse.ArgumentParser()
parser.add_argument('--version_name', help='Required. Name of the new version. Must be unique.', type=str)
parser.add_argument('--package_url', help='Required. pipeline package url', type=str)
parser.add_argument('--pipeline_id', help = 'Required. pipeline id',type=str)


nit: consistent style.

Suggested change
parser.add_argument('--pipeline_id', help = 'Required. pipeline id',type=str)
parser.add_argument('--pipeline_id', help='Required. pipeline id', type=str)

Contributor Author

thanks

else:
    client = kfp.Client()

print('your client configuration is :{}'.format(client.pipelines.api_client.configuration.__dict__))


Suggested change
print('your client configuration is :{}'.format(client.pipelines.api_client.configuration.__dict__))
print('Your client configuration is: {}'.format(client.pipelines.api_client.configuration.__dict__))

Contributor Author

thanks


print('your client configuration is :{}'.format(client.pipelines.api_client.configuration.__dict__))
print('Now in create_pipeline_version_and_run.py...')
print('your api_client host is:')


nit: maybe also use .format() as above?

Contributor Author

Will modify this part.
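For instance, a hedged sketch of that print with .format() (assuming the generated configuration object exposes a host attribute, as swagger clients typically do):

print('Your api_client host is: {}'.format(
    client.pipelines.api_client.configuration.host))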

    gcr_address: str
):
    import os
    train = dsl.ContainerOp(
Contributor

Let's try using an actual component instead of ad-hoc ContainerOp:

component.yaml:

name: Train on MNIST
implementation:
  container:
    image: helloworld-ci

and then

train_op = kfp.components.load_component_from_file('component.yaml')
...
train = train_op()

In cloudbuild.yaml we can replace the image name in component.yaml with the newly-built image:

sed "s|image: helloworld-ci|image: ${_GCR_PATH}/helloworld-ci:$COMMIT_SHA|"

    bucket_name: str
):
    import os
    stepDownloadData = dsl.ContainerOp(
Contributor

Same here. Let's componentize all these ops.
Each component will go into its own component.yaml file and the build script will replace the image versions inside those files, not in the pipeline.py.

Contributor Author

Thanks. Will do.

## Usage

* Substitute the constants in cloudbuild.yaml
* Fill in your kaggle_username and kaggle_key in the Dockerfiles to authenticate to Kaggle. You can get them from an API token created on your Kaggle account page.
@jingzhang36 commented Jan 7, 2020

Probably specify which Dockerfile, e.g., the Dockerfile under download_dataset, since there are multiple Dockerfiles in kaggle-ci-sample.
Or, if this replacement is needed in all Dockerfiles, please point that out.

Contributor Author

OK.
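For reference on the credentials mentioned in the Usage section above: the kaggle package reads KAGGLE_USERNAME and KAGGLE_KEY from the environment, so the Dockerfiles can export those instead of baking in a kaggle.json (values below are placeholders):

import os

# Placeholder values; real ones come from an API token on your Kaggle account page.
os.environ['KAGGLE_USERNAME'] = 'your-username'
os.environ['KAGGLE_KEY'] = 'your-api-key'

# The kaggle package authenticates using the variables above when imported.
import kaggle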



substitutions:
_CODE_PATH: /workspace/kaggle
Contributor

I believe the _CODE_PATH here should be something like "/workspace/samples/contrib/versioned-pipeline-ci-samples/kaggle-ci-sample" to match the Dockerfile locations. Please fix this one and the other _CODE_PATH values accordingly.

Contributor Author

Right. Will modify them.

test_blob.upload_from_filename('test.csv')

with open('train.txt', 'w') as f:
    f.write('gs://' + bucket_name + '/train.csv')
@jingzhang36 commented Jan 8, 2020

We probably want those data sets in a publicly available place instead of asking users to prepare them, e.g., gs://ml-pipeline-playground/shakespeare1.txt in the pipeline "[Sample] Basic - Parallel execution". That way, we wouldn't need the download-data step?

Contributor Author

I think this sample may be a little different from other samples. If we put the data in a public GCS bucket, users may not get the full process of completing a Kaggle competition, from download, to train, to submit.

"${_CODE_PATH}/download_dataset/Dockerfile",
]
id: "BuildDownloadDataImage"

Contributor

Do we want to push the image after we build it?

Contributor Author

Yes, the final step does that.

df_train = pd.read_csv(train_file_path)
sns.set()
cols = ['SalePrice', 'OverallQual', 'GrLivArea', 'GarageCars', 'TotalBsmtSF', 'FullBath', 'YearBuilt']
sns.pairplot(df_train[cols], size = 2.5)
Contributor

Also, it seems that the latest interface uses 'height' instead of 'size'.

Contributor Author

Yes, it is. Will modify it.
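For reference, the corrected call under newer seaborn releases (0.9 renamed pairplot's size parameter to height):

sns.pairplot(df_train[cols], height=2.5)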

@dldaisy changed the title from "Ci samples" to "[Sample] Demonstrate Continuous Integration" Jan 8, 2020
@jingzhang36

Also, would it be better to break this PR into several smaller ones? E.g., each sample gets its own PR? Review and merge would be easier and faster that way.

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign gaoning777
You can assign the PR to them by writing /assign @gaoning777 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dldaisy commented Jan 12, 2020

OK. I think I can break them into smaller PRs.

parser.add_argument('--package_url', help='Required. pipeline package url', type=str)
parser.add_argument('--pipeline_id', help = 'Required. pipeline id',type=str)
parser.add_argument('--gcr_address', help='Required. Your cloud registry path. For example, gcr.io/my-project', type=str)
parser.add_argument('--host', help='Host address of kfp.Client. Will be get from cluster automatically, type=str, default='')

Please forgive a comment from a passerby: I noticed this arg is missing a closing '.

Contributor Author

Thanks for the reply! Will modify it in a new pull request.
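For reference, the fixed line simply closes the help string (with its grammar touched up) so that type and default become real keyword arguments again:

parser.add_argument('--host', help='Host address of kfp.Client. Will be read from the cluster automatically.', type=str, default='')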

@k8s-ci-robot

@dldaisy: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| kubeflow-pipeline-sample-test | ab6faad | link | /test kubeflow-pipeline-sample-test |
| kubeflow-pipeline-upgrade-test | ab6faad | link | /test kubeflow-pipeline-upgrade-test |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@stale bot commented Jun 24, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Jun 24, 2020
@Bobgy closed this Jun 24, 2020
Labels: lifecycle/stale, ok-to-test, size/XXL

9 participants