Merge branch 'main' into aman/issue-96
amanpruthi authored Mar 5, 2024
2 parents 398b1ff + dddfd4f commit d4c7fd7
Showing 12 changed files with 150 additions and 11 deletions.
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,13 @@

All notable changes to this project will be documented in this file.

## [1.24.0](https://github.com/wandb/terraform-google-wandb/compare/v1.23.3...v1.24.0) (2024-03-04)


### Features

* Add operator helm release ([#98](https://github.com/wandb/terraform-google-wandb/issues/98)) ([e3916a7](https://github.com/wandb/terraform-google-wandb/commit/e3916a76b47ea2afc2cc5b3dfae8b0e0bffd5dd7)), closes [#92](https://github.com/wandb/terraform-google-wandb/issues/92) [#101](https://github.com/wandb/terraform-google-wandb/issues/101) [#101](https://github.com/wandb/terraform-google-wandb/issues/101) [#102](https://github.com/wandb/terraform-google-wandb/issues/102)

### [1.23.3](https://github.com/wandb/terraform-google-wandb/compare/v1.23.2...v1.23.3) (2024-03-01)


5 changes: 4 additions & 1 deletion README.md
@@ -62,6 +62,7 @@ resources that lack official modules.
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | ~> 1.0 |
| <a name="requirement_google"></a> [google](#requirement\_google) | ~> 4.82 |
| <a name="requirement_helm"></a> [helm](#requirement\_helm) | ~> 2.10 |
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | ~> 2.23 |

## Providers
@@ -78,11 +79,13 @@ No providers.
| <a name="module_gke_app"></a> [gke\_app](#module\_gke\_app) | wandb/wandb/kubernetes | 1.13.0 |
| <a name="module_kms_default_bucket"></a> [kms\_default\_bucket](#module\_kms\_default\_bucket) | ./modules/kms | n/a |
| <a name="module_kms_default_sql"></a> [kms\_default\_sql](#module\_kms\_default\_sql) | ./modules/kms | n/a |

| <a name="module_networking"></a> [networking](#module\_networking) | ./modules/networking | n/a |
| <a name="module_project_factory_project_services"></a> [project\_factory\_project\_services](#module\_project\_factory\_project\_services) | terraform-google-modules/project-factory/google//modules/project_services | ~> 13.0 |
| <a name="module_redis"></a> [redis](#module\_redis) | ./modules/redis | n/a |
| <a name="module_service_accounts"></a> [service\_accounts](#module\_service\_accounts) | ./modules/service_accounts | n/a |
| <a name="module_storage"></a> [storage](#module\_storage) | ./modules/storage | n/a |
| <a name="module_wandb"></a> [wandb](#module\_wandb) | wandb/wandb/helm | 1.2.0 |

## Resources

@@ -105,6 +108,7 @@ No resources.
| <a name="input_deletion_protection"></a> [deletion\_protection](#input\_deletion\_protection) | If the instance should have deletion protection enabled. The database / Bucket can't be deleted when this value is set to `true`. | `bool` | `true` | no |
| <a name="input_disable_code_saving"></a> [disable\_code\_saving](#input\_disable\_code\_saving) | Boolean indicating if code saving is disabled | `bool` | `false` | no |
| <a name="input_domain_name"></a> [domain\_name](#input\_domain\_name) | Domain for accessing the Weights & Biases UI. | `string` | `null` | no |
| <a name="input_enable_operator"></a> [enable\_operator](#input\_enable\_operator) | Boolean indicating if the new operator should be enabled | `bool` | `false` | no |
| <a name="input_force_ssl"></a> [force\_ssl](#input\_force\_ssl) | Enforce SSL through the usage of the Cloud SQL Proxy (cloudsql://) in the DB connection string | `bool` | `false` | no |
| <a name="input_gke_machine_type"></a> [gke\_machine\_type](#input\_gke\_machine\_type) | Specifies the machine type to be allocated for the database | `string` | `"n1-standard-4"` | no |
| <a name="input_gke_node_count"></a> [gke\_node\_count](#input\_gke\_node\_count) | n/a | `number` | `2` | no |
@@ -154,4 +158,3 @@ No resources.
| <a name="output_standardized_size"></a> [standardized\_size](#output\_standardized\_size) | n/a |
| <a name="output_url"></a> [url](#output\_url) | The URL to the W&B application |
<!-- END_TF_DOCS -->
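
The new `enable_operator` input defaults to `false`, so existing deployments are unaffected until a caller sets it. A minimal sketch of opting in from a root module follows. It assumes the module is consumed from the Terraform Registry as `wandb/wandb/google` (an assumption; the example in this commit uses a relative path), that release 1.24.0 from the CHANGELOG hunk carries this input, and that `var.namespace`, `var.license`, and `var.domain_name` are declared by the caller. Other inputs are omitted.

```hcl
# Hypothetical root-module call; only enable_operator is new in this commit.
module "wandb" {
  source  = "wandb/wandb/google" # assumed registry address
  version = "1.24.0"             # assumed to be the first release with this input

  namespace   = var.namespace
  license     = var.license
  domain_name = var.domain_name

  # Opt in to the operator-managed Helm release (default: false).
  enable_operator = true
}
```

With the flag set, the main.tf hunks in this commit pass `wandb_replicas = 0` to `gke_app`, and the root `address` output switches to the operator's global address.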

9 changes: 9 additions & 0 deletions examples/public-dns-with-cloud-dns/main.tf
@@ -18,6 +18,14 @@ provider "kubernetes" {
token = data.google_client_config.current.access_token
}

provider "helm" {
kubernetes {
host = "https://${module.wandb.cluster_endpoint}"
cluster_ca_certificate = base64decode(module.wandb.cluster_ca_certificate)
token = data.google_client_config.current.access_token
}
}

# Spin up all required services
module "wandb" {
source = "../../"
@@ -33,6 +41,7 @@ module "wandb" {
wandb_version = var.wandb_version
wandb_image = var.wandb_image


create_redis = var.create_redis
use_internal_queue = true
force_ssl = var.force_ssl
94 changes: 87 additions & 7 deletions main.tf
@@ -166,9 +166,9 @@ module "database" {
}

module "redis" {
- count = var.create_redis ? 1 : 0
- source = "./modules/redis"
- namespace = var.namespace
+ count = var.create_redis ? 1 : 0
+ source = "./modules/redis"
+ namespace = var.namespace
### here we set the default to 6gb, which is = setting for "small" standard size
memory_size_gb = coalesce(try(local.deployment_size[var.size].cache, 6))
network = local.network
@@ -190,7 +190,7 @@ locals {

module "gke_app" {
source = "wandb/wandb/kubernetes"
- version = "1.13.0"
+ version = "1.14.1"

license = var.license

@@ -208,11 +208,13 @@ module "gke_app" {
local_restore = var.local_restore
other_wandb_env = merge({
"GORILLA_DISABLE_CODE_SAVING" = var.disable_code_saving,
- "GORILLA_CUSTOMER_SECRET_STORE_SOURCE" = local.secret_store_source
+ "GORILLA_CUSTOMER_SECRET_STORE_SOURCE" = local.secret_store_source,
+ "GORILLA_GLUE_LIST" = var.enable_operator
}, var.other_wandb_env)

- wandb_image = var.wandb_image
- wandb_version = var.wandb_version
+ wandb_image = var.wandb_image
+ wandb_version = var.wandb_version
+ wandb_replicas = var.enable_operator ? 0 : 1

resource_limits = var.resource_limits
resource_requests = var.resource_requests
@@ -226,3 +228,81 @@ module "gke_app" {
module.app_gke
]
}

module "wandb" {
source = "wandb/wandb/helm"
version = "1.2.0"

spec = {
values = {
global = {
host = local.url
license = var.license

extraEnv = merge({
"GORILLA_DISABLE_CODE_SAVING" = var.disable_code_saving,
"GORILLA_CUSTOMER_SECRET_STORE_SOURCE" = local.secret_store_source,
"TAG_CUSTOMER_NS" = var.namespace
"OIDC_ISSUER" = var.oidc_issuer
"OIDC_CLIENT_ID" = var.oidc_client_id
"OIDC_AUTH_METHOD" = var.oidc_auth_method
}, var.other_wandb_env)

bucket = {
provider = "gcs"
name = local.bucket
}

mysql = {
name = module.database.database_name
user = module.database.username
password = module.database.password
database = module.database.database_name
host = module.database.private_ip_address
port = 3306
}

redis = var.create_redis ? {
password = module.redis.0.auth_string
host = module.redis.0.host
port = module.redis.0.port
caCert = module.redis.0.ca_cert
params = {
tls = true
ttlInSeconds = 604800
caCertPath = "/etc/ssl/certs/redis_ca.pem"
}
} : null
}

app = {
extraEnvs = {
"GORILLA_GLUE_LIST" = !var.enable_operator
}
}

ingress = {
annotations = {
"kubernetes.io/ingress.class" = "gce"
"kubernetes.io/ingress.global-static-ip-name" = module.app_lb.address_operator_name
"ingress.gcp.kubernetes.io/pre-shared-cert" = module.app_lb.certificate
}
}

redis = { install = false }
mysql = { install = false }
# weave = { install = false }
}
}

operator_chart_version = "1.1.0"
controller_image_tag = "1.10.1"

# Added `depends_on` to ensure old infrastructure is provisioned first. This addresses a critical scheduling challenge
# where the Datadog DaemonSet could fail to provision due to CPU constraints. Ensuring the old infrastructure has priority
# mitigates the risk of "insufficient CPU" errors by facilitating controlled pod scheduling across nodes.
# TODO: Remove `depends_on` for phase 3
depends_on = [
module.gke_app
]
}
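
The `redis` value in the new `module "wandb"` block indexes the counted Redis module with the attribute-style form (`module.redis.0.host`). A sketch of the same conditional written with the more common bracket indexing is shown here; `redis_settings` is a hypothetical local name that is not part of this commit, and the `params` map is left out for brevity.

```hcl
locals {
  # Hypothetical local; mirrors the connection fields passed to the Helm chart.
  # module.redis uses count, so it is only safe to index when create_redis is true.
  redis_settings = var.create_redis ? {
    password = module.redis[0].auth_string
    host     = module.redis[0].host
    port     = module.redis[0].port
    caCert   = module.redis[0].ca_cert
  } : null
}
```

Both spellings resolve to the same instance, so this is a readability choice rather than a behavior change.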
3 changes: 3 additions & 0 deletions modules/app_lb/https/outputs.tf
@@ -0,0 +1,3 @@
output "certificate" {
value = google_compute_managed_ssl_certificate.default.name
}
4 changes: 4 additions & 0 deletions modules/app_lb/main.tf
@@ -2,6 +2,10 @@ resource "google_compute_global_address" "default" {
name = "${var.namespace}-address"
}

resource "google_compute_global_address" "operator" {
name = "${var.namespace}-operator-address"
}

# Create a URL map that points to the GKE service
module "url_map" {
source = "./url_map"
14 changes: 13 additions & 1 deletion modules/app_lb/outputs.tf
@@ -1,3 +1,15 @@
output "address" {
value = google_compute_global_address.default.address
- }
+ }
+
+ output "address_operator" {
+ value = google_compute_global_address.operator.address
+ }
+
+ output "address_operator_name" {
+ value = google_compute_global_address.operator.name
+ }
+
+ output "certificate" {
+ value = module.https[0].certificate
+ }
7 changes: 7 additions & 0 deletions modules/redis/outputs.tf
@@ -10,3 +10,10 @@ output "auth_string" {
value = google_redis_instance.default.auth_string
}

output "host" {
value = google_redis_instance.default.host
}

output "port" {
value = google_redis_instance.default.port
}
4 changes: 4 additions & 0 deletions modules/storage/bucket/outputs.tf
@@ -1,3 +1,7 @@
output "bucket_name" {
value = google_storage_bucket.file_storage.name
}

output "bucket_region" {
value = google_storage_bucket.file_storage.location
}
2 changes: 1 addition & 1 deletion outputs.tf
@@ -1,5 +1,5 @@
output "address" {
- value = module.app_lb.address
+ value = var.enable_operator ? module.app_lb.address_operator : module.app_lb.address
}
output "bucket_name" {
value = local.bucket
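
The root `address` output now selects between two reserved global addresses depending on `enable_operator`. A hedged sketch of how a caller might point a Cloud DNS record at whichever address is active is shown here. `var.fqdn` and `var.dns_zone_name` are hypothetical caller variables that do not appear in this commit, and the record assumes the module is instantiated as `module.wandb`.

```hcl
# Hypothetical caller-side record; follows the active load balancer address.
resource "google_dns_record_set" "wandb" {
  name         = "${var.fqdn}."     # Cloud DNS record names end with a dot
  managed_zone = var.dns_zone_name  # name of an existing managed zone
  type         = "A"
  ttl          = 300
  rrdatas      = [module.wandb.address]
}
```

Because the record reads the output rather than a specific address resource, toggling `enable_operator` would update the record on the next apply without further changes on the caller's side.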
6 changes: 6 additions & 0 deletions variables.tf
@@ -263,3 +263,9 @@ variable "size" {
type = string
default = null
}

variable "enable_operator" {
type = bool
description = "Boolean indicating if the new operator should be enabled"
default = false
}
6 changes: 5 additions & 1 deletion versions.tf
@@ -9,5 +9,9 @@ terraform {
source = "hashicorp/kubernetes"
version = "~> 2.23"
}
+ helm = {
+ source = "hashicorp/helm"
+ version = "~> 2.10"
+ }
}
- }
+ }
