Welcome! This is an example project for demonstrating how to build an extensible, production-ready project on Vertex AI for image use cases. It is built specifically for the Mineral team but has some ideas that might be useful to share.
- Data labelling
- Data processing
- Training
- Workspace
Here is where you create your datasets and training pipelines.
- Pipelines as extensible Python classes
- Reusable pipeline factories
- Automated training via Cloud Build
MLOps, such as model monitoring, feature store, etc
Cloud Build packages your code and runs it on Google Cloud servers.
Note: Instead of using gcloud
, you can do all the below using the Google Cloud web console UI or even with client libraries.
The file to determine what commands are run are in the .cloud-build/run_pipelines_cloudbuild.yaml
file.
To run this on the cloud, run the following command:
gcloud builds submit --config=.cloud-build/run_pipelines_cloudbuild.yaml --substitutions=_GCP_BUCKET_NAME="gs://MY_BUCKET_NAME/pipeline_staging"
This will error out as you haven't given the correct permissions to the service account yet. To find out the service account, go the the Cloud Build logs listed in the terminal output for the command and find it under the "EXECUTION DETAILS" tab.
Alternatively, this command may give you the correct one:
gcloud projects get-iam-policy MY_PROJECT | grep cloudbuild
2. The Cloud Build service account needs access to the aiplatform.pipelineJobs.create
permission. Grant it access to the roles/aiplatform.user
role which has this permission.
gcloud projects add-iam-policy-binding MY_PROJECT \
--member='serviceAccount:MY_SERVICE_ACCOUNT@cloudbuild.gserviceaccount.com' \
--role='roles/aiplatform.user'
3. Also grant it the ability to use another service account by giving it the roles/iam.serviceAccountUser
role:
gcloud projects add-iam-policy-binding MY_PROJECT \
--member='serviceAccount:MY_SERVICE_ACCOUNT@cloudbuild.gserviceaccount.com' \
--role='roles/iam.serviceAccountUser'
gcloud builds submit --config=.cloud-build/run_pipelines_cloudbuild.yaml --substitutions=_GCP_BUCKET_NAME="gs://MY_BUCKET_NAME/pipeline_staging"
Instead of running the Cloud Build command from your local machine, you can set up triggers to automatically run your Cloud Build jobs.
For this repo, you might want to retrain all pipelines on the following conditions:
- Scheduled weekly
- Whenever someone creates a pull-request on your repositoy
- Whenever code is merged to a certain branch
Here are some examples:
gcloud beta builds triggers create github --name="run-pipelines" \
--repo-name vertex-ai-project-example \
--repo-owner ivanmkc \
--branch-pattern="^main$" \
--build-config .cloud-build/run_pipelines_cloudbuild.yaml \
--substitutions=_GCP_BUCKET_NAME="gs://MY_BUCKET_NAME/pipeline_staging"
Also see this for more options: https://cloud.google.com/sdk/gcloud/reference/beta/builds/triggers/create/github For more details, see: https://cloud.google.com/build/docs/automating-builds/build-repos-from-github
There doesn't seem to be a way to do this without using the Google Cloud web console UI, so please see instructions here: https://cloud.google.com/build/docs/automating-builds/create-scheduled-triggers