[Sample] Demonstrate Continuous Integration #2784
Conversation
## Overview

This sample use cloud build to implement the continuous integration process of a simple pipeline that outputs "hello world" to the console. Once all set up, you can push your code to github repo, then the build process in cloud build will be triggered automatically, then a run will be created in kubeflow pipeline. You can view your pipeline and the run in kubeflow pipelines.
Suggested change: "This sample use cloud build" → "This sample uses cloud build".
thanks
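For reference, a minimal "hello world" pipeline of the kind the overview describes might look like the sketch below; the pipeline name, image, and output file name are illustrative, not taken from the sample:

```python
import kfp
from kfp import dsl


@dsl.pipeline(
    name='hello-world-ci',
    description='Minimal pipeline used to demonstrate continuous integration.'
)
def hello_world_pipeline():
    # Any small image that prints to stdout works here; this one is illustrative.
    dsl.ContainerOp(
        name='hello',
        image='bash:4.4',
        command=['echo', 'hello world'],
    )


if __name__ == '__main__':
    # Compile to a package that the CI build can upload as a pipeline version.
    kfp.compiler.Compiler().compile(hello_world_pipeline, 'pipeline.zip')
```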
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--commit_id', help='Commit Id', type=str)
Maybe elaborate more in the `help` field?
ok
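For instance, the help text could spell out where the value comes from and how it is used; the wording below is only a suggestion:

```python
parser.add_argument(
    '--commit_id',
    help='SHA of the git commit that triggered the build; used to tag the '
         'built images and the new pipeline version.',
    type=str,
)
```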
train = dsl.ContainerOp(
    name='mnist train',
    image = os.path.join(gcr_address, 'mnist_train:', args.commit_id)
).apply(use_gcp_secret('user-gcp-sa'))
Is the `user-gcp-sa` secret still valid under a workload identity based deployment? @Bobgy I remember the usage of `user-gcp-sa` in samples has been cleaned up.
No, it won't be used. I recommend removing its usage and putting a link to https://www.kubeflow.org/docs/gke/authentication-pipelines/ nearby explaining how to authenticate to GCP.
Will do.
Does it also mean that we don't need to create a 'user-gcp-sa' secret volume mounted on the cluster anymore?
No, we won't need to. Use workload identity if we care about security, or use a full-scope cluster for convenience.
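For illustration, the op above might then be written without the secret, assuming the cluster is set up for workload identity as the linked doc describes; this is a sketch, not the sample's final code, and it reuses `gcr_address` and `args.commit_id` from the snippet under review:

```python
from kfp import dsl

# No .apply(use_gcp_secret(...)) here: with workload identity, GCP credentials
# come from the Kubernetes service account the pipeline pods run as.
train = dsl.ContainerOp(
    name='mnist train',
    image='{}/mnist_train:{}'.format(gcr_address, args.commit_id),
)
```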
/assign @jingzhang36
parser = argparse.ArgumentParser()
parser.add_argument('--version_name', help='Required. Name of the new version. Must be unique.', type=str)
parser.add_argument('--package_url', help='Required. pipeline package url', type=str)
parser.add_argument('--pipeline_id', help = 'Required. pipeline id',type=str)
nit: consistent style.
Suggested change:
parser.add_argument('--pipeline_id', help='Required. pipeline id', type=str)
thanks
else:
    client = kfp.Client()

print('your client configuration is :{}'.format(client.pipelines.api_client.configuration.__dict__))
Suggested change:
print('Your client configuration is: {}'.format(client.pipelines.api_client.configuration.__dict__))
thanks
print('your client configuration is :{}'.format(client.pipelines.api_client.configuration.__dict__))
print('Now in create_pipeline_version_and_run.py...')
print('your api_client host is:')
nit: maybe also use `.format()` as above?
Will modify this part.
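For example, the last line might become something like the following; whether the host lives on the `configuration` object printed above is an assumption here:

```python
print('Your api_client host is: {}'.format(client.pipelines.api_client.configuration.host))
```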
    gcr_address: str
):
    import os
    train = dsl.ContainerOp(
Let's try using an actual component instead of an ad-hoc ContainerOp.

component.yaml:

    name: Train on MNIST
    implementation:
      container:
        image: helloworld-ci

and then:

    train_op = kfp.components.load_component_from_file('component.yaml')
    ...
    train = train_op()

In cloudbuild.yaml we can replace the image name in the component.yaml with the newly-built image:

    sed "s|image: helloworld-ci|image: ${_GCR_PATH}/helloworld-ci:$COMMIT_SHA|"
    bucket_name: str
):
    import os
    stepDownloadData = dsl.ContainerOp(
Same here. Let's componentize all these ops. Each component will go into its own `component.yaml` file, and the build script will replace the image versions inside those files, not in `pipeline.py`.
Thanks. Will do.
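A sketch of how the componentized pipeline could be wired up; the file layout, pipeline name, and input names below are assumptions for illustration, not the sample's actual structure:

```python
import kfp
from kfp import dsl

# Each step is defined in its own component.yaml; the CI build rewrites the
# `image:` line in those files (e.g. with sed) before compiling the pipeline.
download_op = kfp.components.load_component_from_file('download_dataset/component.yaml')
train_op = kfp.components.load_component_from_file('train/component.yaml')


@dsl.pipeline(name='kaggle-ci-sample')
def pipeline(bucket_name: str):
    # Assumes the download component declares a bucket_name input.
    download_task = download_op(bucket_name=bucket_name)
    train_op().after(download_task)
```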
## Usage

* Substitute the constants in cloudbuild.yaml
* Fill in your kaggle_username and kaggle_key in Dockerfiles to authenticate to kaggle. You can get them from an API token created from your kaggle account page.
Probably specify which Dockerfile, e.g., the Dockerfile under download_dataset, since there are multiple Dockerfiles in kaggle-ci-sample. Or, if this replacement is needed in all Dockerfiles, please point that out as well.
OK.
substitutions:
  _CODE_PATH: /workspace/kaggle
I believe the _CODE_PATH here should be something like "/workspace/samples/contrib/versioned-pipeline-ci-samples/kaggle-ci-sample" to match the Dockerfile locations. Please fix this one and the other _CODE_PATH values accordingly.
Right. Will modify them.
test_blob.upload_from_filename('test.csv')

with open('train.txt', 'w') as f:
    f.write('gs://'+bucket_name+'/train.csv')
We probably want those data sets in a publicly available place instead of asking users to prepare them, e.g., gs://ml-pipeline-playground/shakespeare1.txt in the pipeline "[Sample] Basic - Parallel execution". This way, we don't need the download data step?
I think this sample may be a little different from other samples. If we put the data in a public GCS bucket, users may not get the full process of completing a kaggle competition, from download, to train, to submit.

"${_CODE_PATH}/download_dataset/Dockerfile",
]
id: "BuildDownloadDataImage"
Do we want to push the image after we build it?
Yes, the final step will do that.
df_train = pd.read_csv(train_file_path)
sns.set()
cols = ['SalePrice', 'OverallQual', 'GrLivArea', 'GarageCars', 'TotalBsmtSF', 'FullBath', 'YearBuilt']
sns.pairplot(df_train[cols], size = 2.5)
Also, it seems that the latest interface uses 'height' instead of 'size'.
Yes, it is. Will modify it.
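With recent seaborn versions (0.9+ renamed the parameter), the call would presumably become:

```python
sns.pairplot(df_train[cols], height=2.5)
```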
Also, would it be better to break this PR into several smaller ones, e.g., with each example in its own PR? Review and merge could be easier and faster that way.
[APPROVALNOTIFIER] This PR is NOT APPROVED.

Needs approval from an approver in each of these files. The full list of commands accepted by this bot can be found here.
OK. I think I can break this into several smaller PRs.
parser.add_argument('--package_url', help='Required. pipeline package url', type=str)
parser.add_argument('--pipeline_id', help = 'Required. pipeline id',type=str)
parser.add_argument('--gcr_address', help='Required. Your cloud registry path. For example, gcr.io/my-project', type=str)
parser.add_argument('--host', help='Host address of kfp.Client. Will be get from cluster automatically, type=str, default='')
Please forgive a comment from a passerby: I noticed this arg is missing a closing `'`.
Thanks for the reply! Will modify it in a new pull request.
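For reference, the corrected line would presumably look like this, with the closing quote restored after the help text (the help wording is also lightly tidied here):

```python
parser.add_argument('--host', help='Host address of kfp.Client. Will be read from the cluster automatically.', type=str, default='')
```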
@dldaisy: The following tests failed.

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.