Commit
docs: quick deploy using terraform (#634)
**Reason for Change**: Provision an AKS cluster and deploy KAITO using Terraform.

**Notes for Reviewers**: See ./terraform/README.md for more information on how to deploy.
Showing 11 changed files with 371 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# Local .terraform directories | ||
**/.terraform/* | ||
|
||
# .tfstate files | ||
*.tfstate | ||
*.tfstate.* | ||
|
||
# Crash log files | ||
crash.log | ||
crash.*.log | ||
|
||
# Exclude all .tfvars files, which are likely to contain sensitive data, such as | ||
# password, private keys, and other secrets. These should not be part of version | ||
# control as they are data points which are potentially sensitive and subject | ||
# to change depending on the environment. | ||
*.tfvars | ||
*.tfvars.json | ||
|
||
# Ignore override files as they are usually used to override resources locally and so | ||
# are not checked in | ||
override.tf | ||
override.tf.json | ||
*_override.tf | ||
*_override.tf.json | ||
|
||
# Include override files you do wish to add to version control using negated pattern | ||
# !example_override.tf | ||
|
||
# Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan | ||
# example: *tfplan* | ||
|
||
# Ignore CLI configuration files | ||
.terraformrc | ||
terraform.rc |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,81 @@ | ||
# Deploy KAITO on AKS using Terraform | ||
|
||
This sample shows how to deploy open-source KAITO on a new Azure Kubernetes Service (AKS) cluster using Terraform. It deploys the following resources: | ||
|
||
- Azure Kubernetes Service (AKS) cluster | ||
- Azure Container Registry (ACR) with a short-lived, repository-scoped token | ||
- Azure Managed Identity with a Federated Credential and Role Assignment for the GPU Provisioner | ||
- KAITO GPU Provisioner Helm chart | ||
- KAITO Workspace Helm chart | ||
- Kubernetes Secret for the ACR token | ||
|
||
## Prerequisites | ||
|
||
- Terraform 1.9.7 or later | ||
- Azure CLI 2.65.0 or later | ||
- kubectl 1.30.5 or later | ||
- Helm 3.16.2 or later | ||
|
||
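The minimum versions above can be checked with a small shell helper before starting. The sketch below is not part of this PR: `version_ge` is a hypothetical function name, and it assumes a `sort` that supports `-V` (GNU coreutils).

```bash
# Hypothetical helper (not in the PR): succeeds when version $1 >= version $2.
# Assumes GNU sort, which provides -V (version sort).
version_ge() {
  local have="${1#v}" want="${2#v}"   # drop a leading "v", as in Helm's v3.16.2
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}
```

One possible use is `version_ge "$(terraform version -json | jq -r .terraform_version)" 1.9.7`, which additionally assumes `jq` is installed.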
## Setup | ||
|
||
To deploy this sample, use the Azure CLI to log in to your Azure account and select the subscription you want to use, then use the Terraform CLI to provision the Azure resources and run the Helm installations for the KAITO operators. | ||
|
||
Log in to your Azure account and set the subscription you want to use. | ||
|
||
```bash | ||
az login | ||
az account set -s <subscription-id> | ||
``` | ||
|
||
Export the subscription ID for Terraform to use. | ||
|
||
```bash | ||
export ARM_SUBSCRIPTION_ID=$(az account show --query id -o tsv) | ||
``` | ||
|
||
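Version 4 of the azurerm provider expects a subscription ID to be configured, which is why the export above matters. A minimal guard, assuming a hypothetical function name `require_subscription`, can catch an empty value before `terraform init` or `terraform apply` is run:

```bash
# Hypothetical guard (not in the PR): fail early when the subscription ID
# that the azurerm provider reads from the environment is missing.
require_subscription() {
  if [ -z "${ARM_SUBSCRIPTION_ID:-}" ]; then
    echo "ARM_SUBSCRIPTION_ID is not set; run 'az login' and export it first" >&2
    return 1
  fi
}
```

Calling `require_subscription && terraform init` would then stop before Terraform runs if the export was skipped or `az account show` returned nothing.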
Initialize the Terraform providers. | ||
|
||
```bash | ||
terraform init | ||
``` | ||
|
||
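The defaults in `variables.tf` can be overridden without editing the file. One option is Terraform's `TF_VAR_<name>` environment variables, sketched here with example values; `eastus2` is only an illustrative region, and `0.3.1` simply repeats the default from this PR.

```bash
# Example overrides read by Terraform as input variables.
# The values are illustrative assumptions, not recommendations.
export TF_VAR_location="eastus2"
export TF_VAR_kaito_workspace_version="0.3.1"
```

Any variable left unset keeps the default declared in `variables.tf`.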
> [!NOTE] | ||
> The following variables in the [variables.tf](./variables.tf) file are available for customization: | ||
> | ||
> - `location` - The Azure region to deploy the resources. Be sure you have the necessary quota in the region. | ||
> - `kaito_gpu_provisioner_version` - The version of the KAITO GPU Provisioner. | ||
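> - `kaito_workspace_version` - The version of the KAITO Workspace. | ||
> - `registry_repository_name` - The ACR repository that the registry token is scoped to. | ||
|
||
Run the Terraform apply command and enter `yes` when prompted to deploy the Azure resources. | ||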
|
||
```bash | ||
terraform apply | ||
``` | ||
|
||
Log into the AKS cluster. | ||
|
||
```bash | ||
az aks get-credentials -g $(terraform output -raw rg_name) -n $(terraform output -raw aks_name) | ||
``` | ||
|
||
Verify installation of the KAITO operators. | ||
|
||
```bash | ||
helm list -n gpu-provisioner | ||
helm list -n kaito-workspace | ||
``` | ||
|
||
Check status of the KAITO pods. | ||
|
||
```bash | ||
kubectl get po -n gpu-provisioner | ||
kubectl get po -n kaito-workspace | ||
``` | ||
|
||
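Instead of polling `kubectl get po` by hand, the check can block until the pods report Ready. This is a sketch under the assumption that both operators have already created their pods; `wait_for_kaito` is a hypothetical name and the 300s default timeout is arbitrary.

```bash
# Hypothetical helper (not in the PR): wait for all pods in both KAITO
# namespaces to become Ready, failing on the first timeout.
wait_for_kaito() {
  local timeout="${1:-300s}" ns
  for ns in gpu-provisioner kaito-workspace; do
    kubectl wait --for=condition=Ready pod --all -n "$ns" --timeout="$timeout" || return 1
  done
}
```

Note that `kubectl wait` reports an error when a namespace has no pods yet, so this is best run after the `helm list` check above shows both releases.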
## Cleanup | ||
|
||
Run the Terraform destroy command and enter `yes` when prompted to delete the Azure resources. | ||
|
||
```bash | ||
terraform destroy | ||
``` |
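
To confirm the cleanup, the resource group name has to be captured before the destroy, because the `rg_name` output is gone afterwards. The following is a sketch with a hypothetical function name, `destroy_and_verify`; it relies on `az group exists`, which prints `true` or `false`.

```bash
# Hypothetical helper (not in the PR): remember the resource group name,
# destroy the deployment, then confirm Azure no longer reports the group.
destroy_and_verify() {
  local rg
  rg="$(terraform output -raw rg_name)" || return 1
  terraform destroy || return 1
  [ "$(az group exists --name "$rg")" = "false" ]
}
```

`terraform destroy` still prompts for `yes` when called this way.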
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
controller: | ||
  env: | ||
    - name: ARM_SUBSCRIPTION_ID | ||
      value: ${AZURE_SUBSCRIPTION_ID} | ||
    - name: LOCATION | ||
      value: ${LOCATION} | ||
    - name: AZURE_CLUSTER_NAME | ||
      value: ${AKS_NAME} | ||
    - name: AZURE_NODE_RESOURCE_GROUP | ||
      value: ${AKS_NRG_NAME} | ||
    - name: ARM_RESOURCE_GROUP | ||
      value: ${RG_NAME} | ||
    - name: LEADER_ELECT | ||
      value: "false" | ||
workloadIdentity: | ||
  clientId: ${KAITO_IDENTITY_CLIENT_ID} | ||
  tenantId: ${AZURE_TENANT_ID} | ||
settings: | ||
  azure: | ||
    clusterName: ${AKS_NAME} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
# Create managed identity that the gpu-provisioner will use to interact with Azure | ||
resource "azurerm_user_assigned_identity" "kaito" { | ||
resource_group_name = azurerm_resource_group.example.name | ||
location = azurerm_resource_group.example.location | ||
name = "kaitoprovisioner" | ||
} | ||
|
||
# Grant the managed identity the Contributor role to create new AKS nodes | ||
resource "azurerm_role_assignment" "kaito_aks_contributor" { | ||
principal_id = azurerm_user_assigned_identity.kaito.principal_id | ||
scope = azurerm_kubernetes_cluster.example.id | ||
role_definition_name = "Contributor" | ||
skip_service_principal_aad_check = true | ||
} | ||
|
||
# Create a federated identity credential for the managed identity to be used by the gpu-provisioner via workload identity | ||
resource "azurerm_federated_identity_credential" "kaito" { | ||
resource_group_name = azurerm_resource_group.example.name | ||
parent_id = azurerm_user_assigned_identity.kaito.id | ||
name = "kaitoprovisioner" | ||
issuer = azurerm_kubernetes_cluster.example.oidc_issuer_url | ||
audience = ["api://AzureADTokenExchange"] | ||
subject = "system:serviceaccount:gpu-provisioner:gpu-provisioner" | ||
} | ||
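
The `subject` above follows the `system:serviceaccount:<namespace>:<name>` pattern, so it has to match the namespace and service account that the gpu-provisioner chart actually uses. A small sketch (the function name `workload_identity_subject` is hypothetical) makes the pattern explicit:

```bash
# Hypothetical helper (not in the PR): build the federated credential
# subject for a Kubernetes service account from its namespace and name.
workload_identity_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}
```

Comparing `workload_identity_subject gpu-provisioner gpu-provisioner` with the value in the resource, and checking that the service account exists with `kubectl get sa gpu-provisioner -n gpu-provisioner`, is one way to rule out a mismatch if token exchange fails.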
|
||
# Install the gpu-provisioner chart | ||
resource "helm_release" "gpu_provisioner" { | ||
name = "gpu-provisioner" | ||
chart = "https://raw.githubusercontent.com/Azure/kaito/refs/heads/gh-pages/charts/kaito/gpu-provisioner-${var.kaito_gpu_provisioner_version}.tgz" | ||
namespace = "gpu-provisioner" | ||
create_namespace = true | ||
|
||
values = [ | ||
templatefile("${path.module}/gpu-provisioner-values.tmpl", | ||
{ | ||
AZURE_TENANT_ID = data.azurerm_client_config.current.tenant_id | ||
AZURE_SUBSCRIPTION_ID = data.azurerm_client_config.current.subscription_id | ||
RG_NAME = azurerm_resource_group.example.name | ||
LOCATION = azurerm_resource_group.example.location | ||
AKS_NAME = azurerm_kubernetes_cluster.example.name | ||
AKS_NRG_NAME = azurerm_kubernetes_cluster.example.node_resource_group | ||
KAITO_IDENTITY_CLIENT_ID = azurerm_user_assigned_identity.kaito.client_id | ||
} | ||
) | ||
] | ||
} | ||
|
||
# Install the kaito-workspace chart | ||
resource "helm_release" "kaito_workspace" { | ||
name = "kaito-workspace" | ||
chart = "https://raw.githubusercontent.com/Azure/kaito/refs/heads/gh-pages/charts/kaito/workspace-${var.kaito_workspace_version}.tgz" | ||
namespace = "kaito-workspace" | ||
create_namespace = true | ||
} | ||
|
||
# Create a secret to store the Azure Container Registry credentials for the workspace to refer to when pushing and pulling images from the registry | ||
resource "kubernetes_secret" "example" { | ||
metadata { | ||
name = "myregistrysecret" | ||
} | ||
|
||
type = "kubernetes.io/dockerconfigjson" | ||
|
||
data = { | ||
".dockerconfigjson" = jsonencode({ | ||
auths = { | ||
"${azurerm_container_registry.example.login_server}" = { | ||
"username" = azurerm_container_registry_token.example.name | ||
"password" = azurerm_container_registry_token_password.example.password1 | ||
} | ||
} | ||
}) | ||
} | ||
} |
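
After an apply, the generated pull secret can be inspected to confirm that it holds the registry login server and token name. This sketch assumes the secret lands in the `default` namespace, since the resource above sets no namespace; `show_registry_secret` is a hypothetical name.

```bash
# Hypothetical helper (not in the PR): print the decoded .dockerconfigjson
# of the registry secret. The output contains the token password, so treat
# it as sensitive.
show_registry_secret() {
  local name="${1:-myregistrysecret}" ns="${2:-default}"
  kubectl get secret "$name" -n "$ns" \
    -o jsonpath='{.data.\.dockerconfigjson}' | base64 --decode
}
```

The decoded JSON should contain an `auths` entry keyed by the ACR login server.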
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
resource "azurerm_kubernetes_cluster" "example" { | ||
resource_group_name = azurerm_resource_group.example.name | ||
location = azurerm_resource_group.example.location | ||
name = "aks-${local.random_name}" | ||
dns_prefix = "aks-${local.random_name}" | ||
oidc_issuer_enabled = true | ||
workload_identity_enabled = true | ||
|
||
default_node_pool { | ||
name = "default" | ||
node_count = 1 | ||
vm_size = "Standard_D2_v2" | ||
|
||
upgrade_settings { | ||
drain_timeout_in_minutes = 0 | ||
max_surge = "10%" | ||
node_soak_duration_in_minutes = 0 | ||
} | ||
} | ||
|
||
identity { | ||
type = "SystemAssigned" | ||
} | ||
} | ||
|
||
resource "azurerm_role_assignment" "aks_acr_pull" { | ||
principal_id = azurerm_kubernetes_cluster.example.kubelet_identity[0].object_id | ||
scope = azurerm_container_registry.example.id | ||
role_definition_name = "AcrPull" | ||
skip_service_principal_aad_check = true | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
locals { | ||
random_name = "kaitodemo${random_integer.example.result}" | ||
} |
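
For orientation, `local.random_name` feeds the resource group, AKS, and ACR names used elsewhere in this change. The sketch below (hypothetical function name `kaito_demo_names`) prints the names that result from a given random integer.

```bash
# Hypothetical helper (not in the PR): print the resource names derived
# from local.random_name for a given random_integer result (10-99).
kaito_demo_names() {
  local n="$1" base
  base="kaitodemo${n}"
  printf 'rg-%s\naks-%s\nacr%s\n' "$base" "$base" "$base"
}
```

With only 90 possible suffixes and ACR names being globally unique, an occasional name collision on the registry seems possible; re-running the apply after tainting `random_integer.example` would be one way around it.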
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
terraform { | ||
required_providers { | ||
azurerm = { | ||
source = "hashicorp/azurerm" | ||
version = "=4.5.0" | ||
} | ||
|
||
random = { | ||
source = "hashicorp/random" | ||
version = "=3.6.3" | ||
} | ||
|
||
kubernetes = { | ||
source = "hashicorp/kubernetes" | ||
version = "=2.33.0" | ||
} | ||
|
||
helm = { | ||
source = "hashicorp/helm" | ||
version = "=2.16.1" | ||
} | ||
} | ||
} | ||
|
||
provider "azurerm" { | ||
features { | ||
resource_group { | ||
prevent_deletion_if_contains_resources = false | ||
} | ||
} | ||
} | ||
|
||
provider "kubernetes" { | ||
  host                   = azurerm_kubernetes_cluster.example.kube_config[0].host | ||
  username               = azurerm_kubernetes_cluster.example.kube_config[0].username | ||
  password               = azurerm_kubernetes_cluster.example.kube_config[0].password | ||
  client_certificate     = base64decode(azurerm_kubernetes_cluster.example.kube_config[0].client_certificate) | ||
  client_key             = base64decode(azurerm_kubernetes_cluster.example.kube_config[0].client_key) | ||
  cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.example.kube_config[0].cluster_ca_certificate) | ||
} | ||
|
||
provider "helm" { | ||
  kubernetes { | ||
    host                   = azurerm_kubernetes_cluster.example.kube_config[0].host | ||
    username               = azurerm_kubernetes_cluster.example.kube_config[0].username | ||
    password               = azurerm_kubernetes_cluster.example.kube_config[0].password | ||
    client_certificate     = base64decode(azurerm_kubernetes_cluster.example.kube_config[0].client_certificate) | ||
    client_key             = base64decode(azurerm_kubernetes_cluster.example.kube_config[0].client_key) | ||
    cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.example.kube_config[0].cluster_ca_certificate) | ||
  } | ||
} | ||
|
||
data "azurerm_client_config" "current" {} | ||
|
||
resource "random_integer" "example" { | ||
min = 10 | ||
max = 99 | ||
} | ||
|
||
resource "azurerm_resource_group" "example" { | ||
name = "rg-${local.random_name}" | ||
location = var.location | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
output "rg_name" { | ||
value = azurerm_resource_group.example.name | ||
} | ||
|
||
output "aks_name" { | ||
value = azurerm_kubernetes_cluster.example.name | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
resource "azurerm_container_registry" "example" { | ||
resource_group_name = azurerm_resource_group.example.name | ||
location = azurerm_resource_group.example.location | ||
name = "acr${local.random_name}" | ||
sku = "Standard" | ||
admin_enabled = false | ||
anonymous_pull_enabled = false | ||
} | ||
|
||
resource "azurerm_container_registry_scope_map" "example" { | ||
name = "default" | ||
container_registry_name = azurerm_container_registry.example.name | ||
resource_group_name = azurerm_resource_group.example.name | ||
|
||
actions = [ | ||
"repositories/${var.registry_repository_name}/content/read", | ||
"repositories/${var.registry_repository_name}/content/write" | ||
] | ||
} | ||
|
||
resource "azurerm_container_registry_token" "example" { | ||
name = "default" | ||
container_registry_name = azurerm_container_registry.example.name | ||
resource_group_name = azurerm_resource_group.example.name | ||
scope_map_id = azurerm_container_registry_scope_map.example.id | ||
} | ||
|
||
resource "azurerm_container_registry_token_password" "example" { | ||
container_registry_token_id = azurerm_container_registry_token.example.id | ||
|
||
password1 { | ||
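expiry = timeadd(timestamp(), "168h") # 7 days from apply time; timestamp() is re-evaluated on every run, so later plans will likely show this value as changed | ||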
} | ||
} |
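
The `168h` offset corresponds to 7 days (168 / 24). To see which expiry a given apply time produces, the same arithmetic can be reproduced in the shell. This is a sketch that assumes GNU `date`; `expiry_after_hours` is a hypothetical name.

```bash
# Hypothetical helper (not in the PR): mirror timeadd(timestamp(), "<N>h")
# by adding N hours to a Unix epoch and printing an RFC 3339 UTC timestamp.
# Assumes GNU date (for -d "@<epoch>").
expiry_after_hours() {
  local start_epoch="$1" hours="$2"
  date -u -d "@$(( start_epoch + hours * 3600 ))" '+%Y-%m-%dT%H:%M:%SZ'
}
```

For example, `expiry_after_hours "$(date +%s)" 168` prints the expiry a token password would get if the apply ran now.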
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
variable "location" { | ||
type = string | ||
default = "brazilsouth" | ||
description = "Azure region to deploy the resources into" | ||
} | ||
|
||
variable "kaito_gpu_provisioner_version" { | ||
type = string | ||
default = "0.2.0" | ||
description = "kaito gpu provisioner version" | ||
} | ||
|
||
variable "kaito_workspace_version" { | ||
type = string | ||
default = "0.3.1" | ||
description = "kaito workspace version" | ||
} | ||
|
||
variable "registry_repository_name" { | ||
type = string | ||
default = "fine-tuned-adapters/kubernetes" | ||
description = "container registry repository name" | ||
} |