test(multicloud): Add EKS module, demo stack and tests (#1390)
# Description

* Create EKS module
* Create EKS example
* Create EKS unit and integration test with retina
* Create live/retina-eks to demo multi-cloud
* Update docs
* Update diagrams
* Update Makefile for this sub-project test/multicloud

## Related Issue

#1267 

## Checklist

- [x] I have read the [contributing
documentation](https://retina.sh/docs/Contributing/overview).
- [x] I signed and signed-off the commits (`git commit -S -s ...`). See
[this
documentation](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification)
on signing commits.
- [x] I have correctly attributed the author(s) of the code.
- [x] I have tested the changes locally.
- [x] I have followed the project's style guidelines.
- [x] I have updated the documentation, if necessary.
- [x] I have added tests, if applicable.

## Screenshots (if applicable) or Testing Completed

Grafana Hubble DNS dashboard for EKS cluster

![Screenshot_26-2-2025_141028_srodi grafana
net](https://github.com/user-attachments/assets/d5e43699-83f9-429f-b7df-127a6e238859)

EKS cluster showing AWS nodes and retina logs

![Screenshot 2025-02-26
131742](https://github.com/user-attachments/assets/2bb9ec2c-7b13-40af-b10e-607e02467ffa)

SRodi authored Feb 27, 2025
1 parent 6883d41 commit 3fdf4a5
Showing 34 changed files with 2,991 additions and 555 deletions.
12 changes: 12 additions & 0 deletions test/multicloud/Makefile
@@ -11,7 +11,15 @@ apply:
	cd live/$(STACK_NAME) && \
		tofu apply --auto-approve

check-env-vars:
	@if [ -z "$(GRAFANA_AUTH)" ]; then echo "GRAFANA_AUTH is not set"; exit 1; fi
	@if [ -z "$(STACK_NAME)" ]; then echo "STACK_NAME is not set"; exit 1; fi
	@if [ "$(STACK_NAME)" = "retina-gke" ] && [ -z "$(GOOGLE_APPLICATION_CREDENTIALS)" ]; then echo "GOOGLE_APPLICATION_CREDENTIALS is not set"; exit 1; fi
	@if [ "$(STACK_NAME)" = "retina-eks" ] && [ -z "$(AWS_SECRET_ACCESS_KEY)" ]; then echo "AWS_SECRET_ACCESS_KEY is not set"; exit 1; fi
	@if [ "$(STACK_NAME)" = "retina-eks" ] && [ -z "$(AWS_ACCESS_KEY_ID)" ]; then echo "AWS_ACCESS_KEY_ID is not set"; exit 1; fi

quick:
	@make check-env-vars
	@make plan
	@make apply

@@ -23,6 +31,10 @@ aks: export STACK_NAME=$(PREFIX)-aks
aks:
	@make quick

eks: export STACK_NAME=$(PREFIX)-eks
eks:
	@make quick

kind: export STACK_NAME=$(PREFIX)-kind
kind:
	@make quick
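
For readers who want to try the pre-flight check outside of Make, the `check-env-vars` logic in the hunk above can be sketched as a plain shell function. This is an illustrative rewrite, not code from the commit: the function name `check_env_vars` is hypothetical, and the stack names `retina-gke` and `retina-eks` are taken from the string comparisons in the Makefile (which suggests `PREFIX` resolves to `retina`, although that assignment is not shown in this hunk).

```shell
# Standalone sketch of the check-env-vars logic (hypothetical helper, not part of the commit).
# Returns 0 when every variable required for the selected stack is set,
# otherwise prints the first missing name and returns 1.
check_env_vars() {
  if [ -z "${GRAFANA_AUTH:-}" ]; then echo "GRAFANA_AUTH is not set"; return 1; fi
  if [ -z "${STACK_NAME:-}" ]; then echo "STACK_NAME is not set"; return 1; fi
  case "$STACK_NAME" in
    retina-gke)
      if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
        echo "GOOGLE_APPLICATION_CREDENTIALS is not set"; return 1
      fi
      ;;
    retina-eks)
      # Same order as the Makefile: secret key first, then key ID.
      if [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "AWS_SECRET_ACCESS_KEY is not set"; return 1
      fi
      if [ -z "${AWS_ACCESS_KEY_ID:-}" ]; then
        echo "AWS_ACCESS_KEY_ID is not set"; return 1
      fi
      ;;
  esac
  return 0
}
```

Calling `check_env_vars` with `STACK_NAME=retina-eks` and no AWS credentials set should report the missing secret key and return a non-zero status, which mirrors how `make quick` stops before `plan` and `apply`.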
33 changes: 30 additions & 3 deletions test/multicloud/README.md
@@ -10,6 +10,7 @@ An example Hubble UI visualization on GKE dataplane v1 (no Cilium). [See GKE net

* [aks](./modules/aks/): Deploy Azure Kubernetes Service cluster.
* [gke](./modules/gke/): Deploy Google Kubernetes Engine cluster.
* [eks](./modules/eks/): Deploy Elastic Kubernetes Service cluster.
* [kind](./modules/kind/): Deploy KIND cluster.
* [helm-release](./modules/helm-release/): Deploy a Helm Chart, used to deploy Retina and Prometheus.
* [kubernetes-lb](./modules/kubernetes-lb/): Create a Kubernetes Service of type Load Balancer, used to expose Prometheus.
@@ -48,6 +49,19 @@ An example Hubble UI visualization on GKE dataplane v1 (no Cilium). [See GKE net
export GOOGLE_APPLICATION_CREDENTIALS=/Users/srodi/src/retina/test/multicloud/live/retina-gke/service-key.json
```

* EKS:
1. Create an AWS account
2. Create a user and assign required policies to create VPC, Subnets, Security Groups, IAM roles, EKS and workers
3. [Install AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
4. Create required `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` for the new user

To deploy an EKS cluster export `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` as env variables.

```sh
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
```

* Grafana

1. Set up a [Grafana Cloud free account](https://grafana.com/pricing/) and start an instance.
@@ -85,6 +99,12 @@ Format code, initialize OpenTofu, plan and apply the stack to create infra and d
make gke
```

* EKS:

```sh
make eks
```

* Kind:

```sh
@@ -93,13 +113,13 @@ Format code, initialize OpenTofu, plan and apply the stack to create infra and d

### Clean up

-To destroy the cluster specify the `STACK_NAME` and run `make clean`.
+To destroy the cluster specify the `STACK_NAME` and run `make destroy`.

```sh
# destroy AKS and cleanup local state files
# set a different stack as needed (i.e. retina-gke, retina-kind)
export STACK_NAME=retina-aks
-make clean
+make destroy
```

### Test
@@ -116,6 +136,7 @@ Resources documentation:

* [GKE](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster)
* [AKS](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster)
* [EKS](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_cluster)
* [Kind](https://registry.terraform.io/providers/tehcyx/kind/latest/docs/resources/cluster)
* [Helm Release](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release)
* [Kubernetes LB Service](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/service)
@@ -132,12 +153,18 @@ Here is an example on how to import resources for `modules/gke`:
# i.e. examples/gke
tofu import module.gke.google_container_cluster.gke europe-west2/test-gke-cluster
tofu import module.gke.google_service_account.default projects/mc-retina/serviceAccounts/test-gke-service-account@mc-retina.iam.gserviceaccount.com
# i.e. examples/eks
tofu import module.eks.aws_eks_node_group.node_group mc-test-aks:mc-test-node-group
tofu import module.eks.aws_iam_role.eks_node_group_role mc-test-eks-node-group-role
tofu import module.eks.aws_iam_role_policy_attachment.eks_node_group_AmazonEKS_CNI_Policy "mc-test-eks-node-group-role/arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
tofu import module.eks.aws_iam_role_policy_attachment.eks_node_group_AmazonEKSWorkerNodePolicy "mc-test-eks-node-group-role/arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
```

>Note: each resource documentation contains a section on how to import resources into the State. [Example for google_container_cluster resource](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster#import).
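
The EKS import commands above follow two ID formats from the AWS provider documentation: `aws_eks_node_group` is imported as `<cluster_name>:<node_group_name>`, and `aws_iam_role_policy_attachment` as `<role_name>/<policy_arn>`. A minimal sketch of building those IDs in shell, assuming hypothetical helper names that are not part of the repository:

```shell
# Hypothetical helpers for composing OpenTofu import IDs (illustrative names only).

# aws_eks_node_group import ID: <cluster_name>:<node_group_name>
eks_node_group_import_id() {
  printf '%s:%s\n' "$1" "$2"
}

# aws_iam_role_policy_attachment import ID: <role_name>/<policy_arn>
role_policy_attachment_import_id() {
  printf '%s/%s\n' "$1" "$2"
}
```

The result can then be passed to the import command, for example `tofu import module.eks.aws_eks_node_group.node_group "$(eks_node_group_import_id "$CLUSTER" "$NODE_GROUP")"`, where `CLUSTER` and `NODE_GROUP` are placeholders for the names used in your stack.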

## Multi-Cloud

-The [live/](./live/) directory contains the multi-cloud / multi-cluster stacks to deploy clusters, install Retina, install Prometheus, expose all Prometheus using load blanaces, and configure a Grafana Cloud instance to consume prometheus data sources to visualize multiple cluster in a single Grafana dashboard.
+The [live/](./live/) directory contains multi-cloud / multi-cluster stacks to deploy cloud infrastructure, install Retina, install Prometheus, expose Prometheus instance using a load balancer, and configure a Grafana Cloud instance to consume Prometheus data sources to visualize Retina metrics from multiple clusters in a single Grafana dashboard.

![Architecture Diagram](./diagrams/diagram-mc.svg)