Skip to content

Commit

Permalink
feat: Enhance CI/CD (Build, E2E, Composite Action) with 1ES Migration…
Browse files Browse the repository at this point in the history
… and Phi-3 Integration (#428)

**Reason for Change**:
This PR migrates image build process 1ES.

It involves creating new pipeline for 1ES. Both 1ES and non-1ES
pipelines now use a shared composite action for building images.

Both pipelines also only use one ACR. Non-1ES pipeline uses same ACR it
builds to for E2E test. 1ES pipeline does not perform E2E test as its
tenant doesn't have access to resources to perform E2E test.

Finally this PR also introduces phi-3 images for testing new pipeline.

---------

Signed-off-by: Ishaan Sehgal <ishaanforthewin@gmail.com>
  • Loading branch information
ishaansehgal99 authored Jun 12, 2024
1 parent 4db58ce commit d38d87b
Show file tree
Hide file tree
Showing 15 changed files with 610 additions and 576 deletions.
162 changes: 162 additions & 0 deletions .github/actions/build-image-action/action.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,162 @@
name: preset-common-image-build

description: "A reusable workflow for building preset images"

inputs:
weights_dir:
description: "The directory for weights"
required: true
branch_name:
description: "Branch name"
required: true
image_name:
description: "Image name"
required: true
image_tag:
description: "Image tag"
required: true
acr_name:
description: "ACR name"
required: true
acr_username:
description: "ACR username"
required: true
acr_password:
description: "ACR password"
required: true
model_name:
description: "Model name"
required: true
model_type:
description: "Model type"
required: true
model_version:
description: "Model version"
required: true
model_runtime:
description: "Model runtime"
required: true
runs_on:
description: "The runner to use"
required: true

runs:
using: "composite"
steps:
- name: Checkout
uses: actions/checkout@v4.1.6
with:
submodules: true
fetch-depth: 0

- name: Install Azure CLI latest
run: |
if ! which az > /dev/null; then
echo "Azure CLI not found. Installing..."
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
else
echo "Azure CLI already installed."
fi
shell: bash

- name: Check Available Disk Space
run: |
echo "Initial disk usage:"
df -h
# Remove unused Docker resources
docker system prune -a -f --volumes
# Check Docker-related disk usage after cleanup
echo "Docker-related disk usage after cleanup:"
docker system df
# Check final disk usage
echo "Final disk usage:"
df -h
shell: bash

- name: Check and Create Kind Cluster
run: |
if ! kind get clusters | grep -q kind; then
echo "Creating directory for etcd storage"
sudo mkdir -p /mnt/storage/etcd
echo "Creating Kind cluster using kind-1es.yaml"
if ! kind create cluster --config .github/workflows/kind-cluster/kind-1es.yaml; then
echo "Failed to create the Kind cluster."
exit 1
fi
else
echo "Kind cluster already exists."
fi
kubectl cluster-info --context kind-kind
shell: bash

- name: Check if Image exists in target ACR
id: check_test_image
run: |
ACR_NAME=${{ inputs.acr_name }}
IMAGE_NAME=${{ inputs.image_name }}
TAG=${{ inputs.image_tag }}
# Use '|| true' to prevent script from exiting with an error if the repository is not found
TAGS=$(az acr repository show-tags -n $ACR_NAME --repository $IMAGE_NAME --output tsv || true)
if [[ -z "$TAGS" ]]; then
echo "Image $IMAGE_NAME:$TAG or repository not found in $ACR_NAME."
echo "IMAGE_EXISTS=false" >> $GITHUB_OUTPUT
else
if echo "$TAGS" | grep -q "^$TAG$"; then
echo "IMAGE_EXISTS=true" >> $GITHUB_OUTPUT
else
echo "IMAGE_EXISTS=false" >> $GITHUB_OUTPUT
echo "Image $IMAGE_NAME:$TAG not found in $ACR_NAME."
fi
fi
shell: bash

- name: Launch Python Script to Kickoff Build Jobs
if: steps.check_test_image.outputs.IMAGE_EXISTS == 'false'
id: launch_script
run: |
PR_BRANCH=${{ inputs.branch_name }} \
ACR_NAME=${{ inputs.acr_name }} \
ACR_USERNAME=${{ inputs.acr_username }} \
ACR_PASSWORD=${{ inputs.acr_password }} \
MODEL_NAME=${{ inputs.model_name }} \
MODEL_TYPE=${{ inputs.model_type }} \
MODEL_VERSION=${{ inputs.model_version }} \
MODEL_RUNTIME=${{ inputs.model_runtime }} \
MODEL_TAG=${{ inputs.image_tag }} \
WEIGHTS_DIR=${{ inputs.weights_dir }} \
python3 .github/workflows/kind-cluster/main.py
shell: bash

- name: Check Python Script Status
if: ${{ always() }}
run: |
if [[ "${{ steps.check_test_image.outputs.IMAGE_EXISTS }}" == "true" ]]; then
echo "Image already exists; skipping the status step."
elif [[ "${{ steps.launch_script.outcome }}" != "success" ]]; then
echo "Python script failed to execute successfully."
exit 1 # Fail the job due to script failure
else
echo "Python script executed successfully."
fi
shell: bash

- name: Cleanup
if: ${{ always() }}
run: |
echo "Cleaning up kindnet resources..."
kubectl delete clusterrole kindnet --ignore-not-found || true
kubectl delete clusterrolebinding kindnet --ignore-not-found || true
kubectl delete serviceaccount kindnet -n kube-system --ignore-not-found || true
kubectl delete daemonset kindnet -n kube-system --ignore-not-found || true
kind delete cluster
if [[ "${{ inputs.image_already_built }}" == "false" ]]; then
kubectl get job --no-headers -o custom-columns=":metadata.name" | grep "^docker-build-job-${{ inputs.model_name }}-[0-9]" | xargs -r kubectl delete job
fi
shell: bash
Loading

0 comments on commit d38d87b

Please sign in to comment.