Commit

Update monitoring terraform module to support multiple albs (saml-auth-proxy support) (#17631)
rfairburn authored Mar 14, 2024
1 parent 5349403 commit c10c75c
Showing 6 changed files with 128 additions and 56 deletions.
19 changes: 13 additions & 6 deletions terraform/addons/monitoring/.header.md
@@ -16,23 +16,30 @@ Some of the for_each and counts in this module cannot pre-determine the numbers

You will need to run `terraform apply -target module.main` prior to applying monitoring, assuming the use of a configuration matching the example at https://github.com/fleetdm/fleet/blob/main/terraform/example/main.tf.

Multiple alb support was added in order to allow monitoring `saml-auth-proxy`. See https://github.com/fleetdm/fleet/tree/main/terraform/addons/saml-auth-proxy

# Example configuration

This assumes your fleet module is `main` and is configured with its default documentation.

See https://github.com/fleetdm/fleet/blob/main/terraform/example/main.tf for details.
https://github.com/fleetdm/fleet/blob/main/terraform/example/main.tf for details.


```
module "monitoring" {
source = "github.com/fleetdm/fleet//terraform/addons/monitoring?ref=tf-mod-addon-monitoring-v1.1.0"
customer_prefix = local.customer
fleet_ecs_service_name = module.main.byo-vpc.byo-db.byo-ecs.service.name
fleet_min_containers = module.main.byo-vpc.byo-db.byo-ecs.service.desired_count
alb_name = module.main.byo-vpc.byo-db.alb.lb_dns_name
alb_target_group_name = module.main.byo-vpc.byo-db.alb.target_group_names[0]
alb_target_group_arn_suffix = module.main.byo-vpc.byo-db.alb.target_group_arn_suffixes[0]
alb_arn_suffix = module.main.byo-vpc.byo-db.alb.lb_arn_suffix
albs = [
{
name = module.main.byo-vpc.byo-db.alb.lb_dns_name,
target_group_name = module.main.byo-vpc.byo-db.alb.target_group_names[0]
target_group_arn_suffix = module.main.byo-vpc.byo-db.alb.target_group_arn_suffixes[0]
arn_suffix = module.main.byo-vpc.byo-db.alb.lb_arn_suffix
ecs_service_name = module.main.byo-vpc.byo-db.byo-ecs.service.name
min_containers = module.main.byo-vpc.byo-db.byo-ecs.appautoscaling_target.min_capacity
},
]
# Only publish alerts for items in this map
sns_topic_arns_map = {
alb_httpcode_5xx = [var.sns_topic_arn]
27 changes: 15 additions & 12 deletions terraform/addons/monitoring/README.md
@@ -16,22 +16,29 @@ Some of the for\_each and counts in this module cannot pre-determine the numbers

You will need to run `terraform apply -target module.main` prior to applying monitoring, assuming the use of a configuration matching the example at https://github.com/fleetdm/fleet/blob/main/terraform/example/main.tf.

Multiple alb support was added in order to allow monitoring `saml-auth-proxy`. See https://github.com/fleetdm/fleet/tree/main/terraform/addons/saml-auth-proxy

# Example configuration

This assumes your fleet module is `main` and is configured with its default documentation.

See https://github.com/fleetdm/fleet/blob/main/terraform/example/main.tf for details.
https://github.com/fleetdm/fleet/blob/main/terraform/example/main.tf for details.

```
module "monitoring" {
source = "github.com/fleetdm/fleet//terraform/addons/monitoring?ref=tf-mod-addon-monitoring-v1.1.0"
customer_prefix = local.customer
fleet_ecs_service_name = module.main.byo-vpc.byo-db.byo-ecs.service.name
fleet_min_containers = module.main.byo-vpc.byo-db.byo-ecs.service.desired_count
alb_name = module.main.byo-vpc.byo-db.alb.lb_dns_name
alb_target_group_name = module.main.byo-vpc.byo-db.alb.target_group_names[0]
alb_target_group_arn_suffix = module.main.byo-vpc.byo-db.alb.target_group_arn_suffixes[0]
alb_arn_suffix = module.main.byo-vpc.byo-db.alb.lb_arn_suffix
albs = [
{
name = module.main.byo-vpc.byo-db.alb.lb_dns_name,
target_group_name = module.main.byo-vpc.byo-db.alb.target_group_names[0]
target_group_arn_suffix = module.main.byo-vpc.byo-db.alb.target_group_arn_suffixes[0]
arn_suffix = module.main.byo-vpc.byo-db.alb.lb_arn_suffix
ecs_service_name = module.main.byo-vpc.byo-db.byo-ecs.service.name
min_containers = module.main.byo-vpc.byo-db.byo-ecs.appautoscaling_target.min_capacity
},
]
# Only publish alerts for items in this map
sns_topic_arns_map = {
alb_httpcode_5xx = [var.sns_topic_arn]
@@ -131,15 +138,11 @@ No modules.
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_acm_certificate_arn"></a> [acm\_certificate\_arn](#input\_acm\_certificate\_arn) | n/a | `string` | `null` | no |
| <a name="input_alb_arn_suffix"></a> [alb\_arn\_suffix](#input\_alb\_arn\_suffix) | n/a | `string` | `null` | no |
| <a name="input_alb_name"></a> [alb\_name](#input\_alb\_name) | n/a | `string` | `null` | no |
| <a name="input_alb_target_group_arn_suffix"></a> [alb\_target\_group\_arn\_suffix](#input\_alb\_target\_group\_arn\_suffix) | n/a | `string` | `null` | no |
| <a name="input_alb_target_group_name"></a> [alb\_target\_group\_name](#input\_alb\_target\_group\_name) | n/a | `string` | `null` | no |
| <a name="input_cron_monitoring"></a> [cron\_monitoring](#input\_cron\_monitoring) | n/a | <pre>object({<br> mysql_host = string<br> mysql_database = string<br> mysql_user = string<br> mysql_password_secret_name = string<br> vpc_id = string<br> subnet_ids = list(string)<br> rds_security_group_id = string<br> delay_tolerance = string<br> run_interval = string <br> })</pre> | `null` | no |
| <a name="input_albs"></a> [albs](#input\_albs) | n/a | <pre>list(object({<br> name = string<br> arn_suffix = string<br> target_group_name = string<br> target_group_arn_suffix = string<br> min_containers = optional(string, 1)<br> ecs_service_name = string<br> }))</pre> | `[]` | no |
| <a name="input_cron_monitoring"></a> [cron\_monitoring](#input\_cron\_monitoring) | n/a | <pre>object({<br> mysql_host = string<br> mysql_database = string<br> mysql_user = string<br> mysql_password_secret_name = string<br> vpc_id = string<br> subnet_ids = list(string)<br> rds_security_group_id = string<br> delay_tolerance = string<br> run_interval = string<br> log_retention_in_days = optional(number, 7)<br> })</pre> | `null` | no |
| <a name="input_customer_prefix"></a> [customer\_prefix](#input\_customer\_prefix) | n/a | `string` | `"fleet"` | no |
| <a name="input_default_sns_topic_arns"></a> [default\_sns\_topic\_arns](#input\_default\_sns\_topic\_arns) | n/a | `list(string)` | `[]` | no |
| <a name="input_fleet_ecs_service_name"></a> [fleet\_ecs\_service\_name](#input\_fleet\_ecs\_service\_name) | n/a | `string` | `null` | no |
| <a name="input_fleet_min_containers"></a> [fleet\_min\_containers](#input\_fleet\_min\_containers) | n/a | `number` | `1` | no |
| <a name="input_mysql_cluster_members"></a> [mysql\_cluster\_members](#input\_mysql\_cluster\_members) | n/a | `list(string)` | `[]` | no |
| <a name="input_redis_cluster_members"></a> [redis\_cluster\_members](#input\_redis\_cluster\_members) | n/a | `list(string)` | `[]` | no |
| <a name="input_sns_topic_arns_map"></a> [sns\_topic\_arns\_map](#input\_sns\_topic\_arns\_map) | n/a | `map(list(string))` | `{}` | no |
42 changes: 27 additions & 15 deletions terraform/addons/monitoring/main.tf
@@ -36,31 +36,36 @@ resource "aws_db_event_subscription" "default" {

}

locals {
alb_map = {for k, v in var.albs: k => v}
}


// ECS Alarms
resource "aws_cloudwatch_metric_alarm" "alb_healthyhosts" {
count = var.alb_target_group_arn_suffix == null || var.alb_arn_suffix == null ? 0 : 1
alarm_name = "backend-healthyhosts-${var.customer_prefix}"
for_each = local.alb_map
alarm_name = "backend-healthyhosts-${var.customer_prefix}-${each.value.name}"
comparison_operator = "LessThanThreshold"
evaluation_periods = "1"
metric_name = "HealthyHostCount"
namespace = "AWS/ApplicationELB"
period = "60"
statistic = "Minimum"
threshold = var.fleet_min_containers
alarm_description = "This alarm indicates the number of Healthy Fleet hosts is lower than expected. Please investigate the load balancer \"${var.alb_name}\" or the target group \"${var.alb_target_group_name}\" and the fleet backend service \"${var.fleet_ecs_service_name}\""
threshold = each.value.min_containers
alarm_description = "This alarm indicates the number of Healthy Fleet hosts is lower than expected. Please investigate the load balancer \"${each.value.name}\" or the target group \"${each.value.target_group_name}\" and the fleet backend service \"${each.value.ecs_service_name}\""
actions_enabled = "true"
alarm_actions = lookup(var.sns_topic_arns_map, "alb_helthyhosts", var.default_sns_topic_arns)
ok_actions = lookup(var.sns_topic_arns_map, "alb_helthyhosts", var.default_sns_topic_arns)
dimensions = {
TargetGroup = var.alb_target_group_arn_suffix
LoadBalancer = var.alb_arn_suffix
TargetGroup = each.value.target_group_arn_suffix
LoadBalancer = each.value.arn_suffix
}
}

// alarm for target response time (anomaly detection)
resource "aws_cloudwatch_metric_alarm" "target_response_time" {
count = var.alb_target_group_arn_suffix == null || var.alb_arn_suffix == null ? 0 : 1
alarm_name = "backend-target-response-time-${var.customer_prefix}"
for_each = local.alb_map
alarm_name = "backend-target-response-time-${var.customer_prefix}-${each.value.name}"
comparison_operator = "GreaterThanUpperThreshold"
evaluation_periods = "2"
threshold_metric_id = "e1"
@@ -87,19 +92,26 @@ resource "aws_cloudwatch_metric_alarm" "target_response_time" {
unit = "Count"

dimensions = {
TargetGroup = var.alb_target_group_arn_suffix
LoadBalancer = var.alb_arn_suffix
TargetGroup = each.value.target_group_arn_suffix
LoadBalancer = each.value.arn_suffix
}
}
}
}

locals {
http_5xx_alert_names = ["HTTPCode_ELB_5XX_Count", "HTTPCode_Target_5XX_Count"]
http_5xx_alerts_list = flatten([for alert in local.http_5xx_alert_names : [for alb in var.albs : merge(alb, { "alert" : alert })]])
http_5xx_alerts = {for k, v in local.http_5xx_alerts_list : k => v}
}


resource "aws_cloudwatch_metric_alarm" "lb" {
for_each = var.alb_target_group_arn_suffix == null ? toset([]) : toset(["HTTPCode_ELB_5XX_Count", "HTTPCode_Target_5XX_Count"])
alarm_name = "${var.customer_prefix}-lb-${each.key}"
for_each = local.http_5xx_alerts
alarm_name = "${var.customer_prefix}-lb-${each.value.name}-${each.value.alert}"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = "1"
metric_name = each.key
metric_name = each.value.alert
namespace = "AWS/ApplicationELB"
period = "120"
statistic = "Sum"
@@ -109,7 +121,7 @@ resource "aws_cloudwatch_metric_alarm" "lb" {
ok_actions = lookup(var.sns_topic_arns_map, "alb_httpcode_5xx", var.default_sns_topic_arns)
treat_missing_data = "notBreaching"
dimensions = {
LoadBalancer = var.alb_arn_suffix
LoadBalancer = each.value.arn_suffix
}
}
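
Review note: a minimal standalone sketch of how the new `http_5xx_alerts` locals expand, assuming two hypothetical ALBs. The names `fleet-alb` and `saml-alb`, their ARN suffixes, and the `http_5xx_alert_keys` output are made up for illustration, and the object type is trimmed to the two fields used here. It needs no providers and can be inspected with `terraform console` or `terraform apply`.

```hcl
variable "albs" {
  type = list(object({
    name       = string
    arn_suffix = string
  }))
  # Hypothetical sample values, for illustration only.
  default = [
    { name = "fleet-alb", arn_suffix = "app/fleet-alb/0123456789abcdef" },
    { name = "saml-alb", arn_suffix = "app/saml-alb/fedcba9876543210" },
  ]
}

locals {
  http_5xx_alert_names = ["HTTPCode_ELB_5XX_Count", "HTTPCode_Target_5XX_Count"]

  # One object per (alert, alb) pair. The outer loop runs over alerts,
  # so every ALB for the first alert comes before any for the second.
  http_5xx_alerts_list = flatten([
    for alert in local.http_5xx_alert_names : [
      for alb in var.albs : merge(alb, { alert = alert })
    ]
  ])

  # Iterating a list with two symbols binds the first to the index,
  # so the resulting keys are "0", "1", "2", "3".
  http_5xx_alerts = { for k, v in local.http_5xx_alerts_list : k => v }
}

# Hypothetical helper output that shows which pair sits under each key:
#   "0" = "fleet-alb/HTTPCode_ELB_5XX_Count"
#   "1" = "saml-alb/HTTPCode_ELB_5XX_Count"
#   "2" = "fleet-alb/HTTPCode_Target_5XX_Count"
#   "3" = "saml-alb/HTTPCode_Target_5XX_Count"
output "http_5xx_alert_keys" {
  value = { for k, v in local.http_5xx_alerts : k => "${v.name}/${v.alert}" }
}
```

Index keys remain known at plan time even when `name` comes from a load balancer that does not exist yet, which is probably why they are used here. The trade-off is that reordering or removing entries in `albs` shifts the keys, so Terraform would plan to update or replace the alarms whose key now maps to a different ALB.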

@@ -280,7 +292,7 @@ resource "null_resource" "cron_monitoring_build" {
go_mod_changes = filesha256("${path.module}/lambda/go.mod")
go_sum_changes = filesha256("${path.module}/lambda/go.sum")
# Make sure to always have a unique trigger if the file doesn't exist
binary_exists = fileexists(local.cron_lambda_binary) ? true : timestamp()
binary_exists = fileexists(local.cron_lambda_binary) ? true : timestamp()
}
provisioner "local-exec" {
working_dir = "${path.module}/lambda"
33 changes: 10 additions & 23 deletions terraform/addons/monitoring/variables.tf
@@ -8,29 +8,16 @@ variable "fleet_ecs_service_name" {
default = null
}

variable "fleet_min_containers" {
type = number
default = 1
}

variable "alb_name" {
type = string
default = null
}

variable "alb_target_group_name" {
type = string
default = null
}

variable "alb_target_group_arn_suffix" {
type = string
default = null
}

variable "alb_arn_suffix" {
type = string
default = null
variable "albs" {
type = list(object({
name = string
arn_suffix = string
target_group_name = string
target_group_arn_suffix = string
min_containers = optional(string, 1)
ecs_service_name = string
}))
default = []
}

variable "default_sns_topic_arns" {
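
Review note: `min_containers` is declared as `optional(string, 1)`, so an entry may leave it out. The sketch below is a standalone config, not part of the module, that should show the value each entry ends up with. The sample entry and the `min_containers` output are hypothetical. The expectation is that Terraform converts the numeric default to the string `"1"`, and that the alarm's numeric `threshold` argument accepts it through Terraform's automatic string-to-number conversion; treat both as assumptions to confirm with `terraform plan`.

```hcl
terraform {
  # optional() with a default value requires Terraform 1.3 or later.
  required_version = ">= 1.3.0"
}

variable "albs" {
  type = list(object({
    name                    = string
    arn_suffix              = string
    target_group_name       = string
    target_group_arn_suffix = string
    min_containers          = optional(string, 1)
    ecs_service_name        = string
  }))
  # Hypothetical entry that leaves min_containers unset.
  default = [
    {
      name                    = "saml-alb"
      arn_suffix              = "app/saml-alb/fedcba9876543210"
      target_group_name       = "saml-tg"
      target_group_arn_suffix = "targetgroup/saml-tg/0123456789abcdef"
      ecs_service_name        = "saml-auth-proxy"
    },
  ]
}

# Hypothetical output: the value the healthy-hosts alarm would use as its threshold.
output "min_containers" {
  value = [for alb in var.albs : alb.min_containers]
}
```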
58 changes: 58 additions & 0 deletions terraform/addons/saml-auth-proxy/README.md
@@ -0,0 +1,58 @@
## Requirements

No requirements.

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | 5.17.0 |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_saml_auth_proxy_alb"></a> [saml\_auth\_proxy\_alb](#module\_saml\_auth\_proxy\_alb) | terraform-aws-modules/alb/aws | 8.2.1 |

## Resources

| Name | Type |
|------|------|
| [aws_cloudwatch_log_group.saml_auth_proxy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_group) | resource |
| [aws_ecs_service.saml_auth_proxy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/ecs_service) | resource |
| [aws_ecs_task_definition.saml_auth_proxy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/ecs_task_definition) | resource |
| [aws_iam_policy.saml_auth_proxy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_secretsmanager_secret.saml_auth_proxy_cert](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/secretsmanager_secret) | resource |
| [aws_security_group.saml_auth_proxy_alb](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group) | resource |
| [aws_security_group.saml_auth_proxy_service](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group) | resource |
| [aws_iam_policy_document.saml_auth_proxy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_alb_target_group_arn"></a> [alb\_target\_group\_arn](#input\_alb\_target\_group\_arn) | n/a | `string` | n/a | yes |
| <a name="input_base_url"></a> [base\_url](#input\_base\_url) | n/a | `string` | n/a | yes |
| <a name="input_cookie_max_age"></a> [cookie\_max\_age](#input\_cookie\_max\_age) | n/a | `string` | `"1h"` | no |
| <a name="input_customer_prefix"></a> [customer\_prefix](#input\_customer\_prefix) | customer prefix to use to namespace all resources | `string` | `"fleet"` | no |
| <a name="input_ecs_cluster"></a> [ecs\_cluster](#input\_ecs\_cluster) | n/a | `string` | n/a | yes |
| <a name="input_ecs_execution_iam_role_arn"></a> [ecs\_execution\_iam\_role\_arn](#input\_ecs\_execution\_iam\_role\_arn) | n/a | `string` | n/a | yes |
| <a name="input_ecs_iam_role_arn"></a> [ecs\_iam\_role\_arn](#input\_ecs\_iam\_role\_arn) | n/a | `string` | n/a | yes |
| <a name="input_idp_metadata_url"></a> [idp\_metadata\_url](#input\_idp\_metadata\_url) | n/a | `string` | n/a | yes |
| <a name="input_logging_options"></a> [logging\_options](#input\_logging\_options) | n/a | <pre>object({<br> awslogs-group = string<br> awslogs-region = string<br> awslogs-stream-prefix = string<br> })</pre> | n/a | yes |
| <a name="input_proxy_containers"></a> [proxy\_containers](#input\_proxy\_containers) | n/a | `number` | `1` | no |
| <a name="input_saml_auth_proxy_image"></a> [saml\_auth\_proxy\_image](#input\_saml\_auth\_proxy\_image) | n/a | `string` | `"itzg/saml-auth-proxy:1.12.0@sha256:ddff17caa00c1aad64d6c7b2e1d5eb93d97321c34d8ad12a25cfd8ce203db723"` | no |
| <a name="input_security_groups"></a> [security\_groups](#input\_security\_groups) | n/a | `list(string)` | n/a | yes |
| <a name="input_subnets"></a> [subnets](#input\_subnets) | n/a | `list(string)` | n/a | yes |
| <a name="input_vpc_id"></a> [vpc\_id](#input\_vpc\_id) | n/a | `string` | n/a | yes |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_fleet_extra_execution_policies"></a> [fleet\_extra\_execution\_policies](#output\_fleet\_extra\_execution\_policies) | n/a |
| <a name="output_lb"></a> [lb](#output\_lb) | n/a |
| <a name="output_lb_target_group_arn"></a> [lb\_target\_group\_arn](#output\_lb\_target\_group\_arn) | Keep for legacy support for now |
| <a name="output_name"></a> [name](#output\_name) | n/a |
| <a name="output_secretsmanager_secret_id"></a> [secretsmanager\_secret\_id](#output\_secretsmanager\_secret\_id) | n/a |
5 changes: 5 additions & 0 deletions terraform/addons/saml-auth-proxy/outputs.tf
@@ -8,10 +8,15 @@ output "name" {
value = "${var.customer_prefix}-saml-auth-proxy"
}

# Keep for legacy support for now
output "lb_target_group_arn" {
value = module.saml_auth_proxy_alb.target_group_arns[0]
}

output "lb" {
value = module.saml_auth_proxy_alb
}

output "secretsmanager_secret_id" {
value = aws_secretsmanager_secret.saml_auth_proxy_cert.id
}
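
Review note: a hedged sketch of how the new `lb` output could feed the monitoring module's `albs` list, so that both the Fleet ALB and the saml-auth-proxy ALB get alarms. It is a config fragment rather than a complete configuration. It assumes the addon is instantiated as `module "saml_auth_proxy"` (a hypothetical name), that the `lb` output exposes the same attribute names the README example reads from the Fleet ALB (`lb_dns_name`, `target_group_names`, `target_group_arn_suffixes`, `lb_arn_suffix`), and that the proxy's ECS service is named after the `name` output. None of these is confirmed by this commit.

```hcl
module "monitoring" {
  # No ref is pinned here; pin a tag or commit that contains this change.
  source          = "github.com/fleetdm/fleet//terraform/addons/monitoring"
  customer_prefix = local.customer

  albs = [
    # Fleet's own ALB, copied from the README example in this commit.
    {
      name                    = module.main.byo-vpc.byo-db.alb.lb_dns_name
      target_group_name       = module.main.byo-vpc.byo-db.alb.target_group_names[0]
      target_group_arn_suffix = module.main.byo-vpc.byo-db.alb.target_group_arn_suffixes[0]
      arn_suffix              = module.main.byo-vpc.byo-db.alb.lb_arn_suffix
      ecs_service_name        = module.main.byo-vpc.byo-db.byo-ecs.service.name
      min_containers          = module.main.byo-vpc.byo-db.byo-ecs.appautoscaling_target.min_capacity
    },
    # saml-auth-proxy ALB, read through the new `lb` output (assumed attribute names).
    {
      name                    = module.saml_auth_proxy.lb.lb_dns_name
      target_group_name       = module.saml_auth_proxy.lb.target_group_names[0]
      target_group_arn_suffix = module.saml_auth_proxy.lb.target_group_arn_suffixes[0]
      arn_suffix              = module.saml_auth_proxy.lb.lb_arn_suffix
      # Assumption: the ECS service shares the value of the `name` output.
      ecs_service_name        = module.saml_auth_proxy.name
      # min_containers is omitted, so the variable's default applies.
    },
  ]

  # Only publish alerts for items in this map
  sns_topic_arns_map = {
    alb_httpcode_5xx = [var.sns_topic_arn]
  }
}
```

Because each alarm name now includes `each.value.name`, the two entries should produce separate alarms instead of colliding on one name.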
