This guide provides detailed, step-by-step instructions for installing and configuring Obsrv on AWS, utilizing Terraform, Terragrunt, and Helm.
- CPU Requirements:
  - Minimum: 19 CPUs.
  - Optimal configuration: 5 nodes with 4 cores each, totaling 80GB of RAM.
The installation package includes both lakehouse and real-time OLAP storage by default. If the lakehouse component is not required, only the real-time OLAP storage can be installed, reducing requirements to 16 CPUs and 64GB of RAM.
In this case, we recommend using 2 nodes with 8 cores each, totaling 64GB of RAM, by selecting the t2.2xlarge AWS instance type.
- Availability Zones: All instances should be within the same availability zone to minimize cross-zone data transfer costs. The Obsrv installer will automatically create the EKS (Elastic Kubernetes Service) cluster for you.
- CIDR Block: Use a /23 CIDR range (512 IPs) for your environment.
  - Example: A VPC with 10.0.0.0/23 provides IPs from 10.0.0.0 to 10.0.1.255.
- Subnets: Ensure subnets are created in all availability zones within your AWS region.
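If you are reusing an existing VPC rather than letting the installer create one, a quick AWS CLI check confirms that its subnets cover every availability zone (the VPC ID below is a placeholder):

aws ec2 describe-subnets --filters "Name=vpc-id,Values=<vpc-id>" --query "Subnets[].{Subnet:SubnetId,AZ:AvailabilityZone,CIDR:CidrBlock}" --output table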
Before beginning the installation, make sure the following tools are installed on your Linux-based system:
| Tool | Version | Installation Command | Official Documentation |
| --- | --- | --- | --- |
| Terraform | 1.5.x or earlier | `curl "https://releases.hashicorp.com/terraform/1.5.2/terraform_1.5.2_linux_amd64.zip" -o terraform.zip && unzip terraform.zip && sudo mv terraform /usr/local/bin/ && rm terraform.zip` | Terraform Install |
| Terragrunt | 0.48 or later | `curl -OL https://github.com/gruntwork-io/terragrunt/releases/download/v0.49.0/terragrunt_linux_amd64 && sudo mv terragrunt_linux_amd64 /usr/local/bin/terragrunt && sudo chmod +x /usr/local/bin/terragrunt` | Terragrunt Install |
| Helm | 3.10.2 or later | `curl https://get.helm.sh/helm-v3.10.2-linux-amd64.tar.gz -o helm.tar.gz && tar -zxvf helm.tar.gz && sudo mv linux-amd64/helm /usr/local/bin/` | Helm Install |
| AWS CLI | 2.10 or later | `curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && unzip awscliv2.zip && sudo ./aws/install` | AWS CLI Install |
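Once the tools are installed, a quick sanity check confirms each one is on the PATH at a supported version:

terraform version
terragrunt --version
helm version
aws --version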
Start by cloning the Obsrv automation repository and checking out either the latest release tag or master.
git clone https://github.com/Sunbird-Obsrv/obsrv-automation.git
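If you want the latest release rather than master, one way to find and check out the most recent tag (the tag name below is a placeholder):

cd obsrv-automation
git tag --sort=-creatordate | head -n 5   # list the newest tags first
git checkout <latest-release-tag>         # or stay on master
cd ..                                     # return to the parent directory so later paths resolve as written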
Execute the following steps to bring up the Kubernetes cluster in your configured AWS region.
- Navigate to the Configuration Directory:
cd ./obsrv-automation/terraform/aws/vars
- Update Configuration Files: Open cluster_overides.tf and modify the configuration values to match your environment.
building_block = "obsrv"
env = "dev"
region = "us-east-2"
availability_zones = ["us-east-2a", "us-east-2b", "us-east-2c"]
timezone = "UTC"
create_kong_ingress = "true"
create_vpc = "true"
create_velero_user = "true"
eks_node_group_instance_type = ["t2.xlarge"] # Choose depending on your requirements by considering the CPU requirements
eks_node_group_capacity_type = "ON_DEMAND"
eks_node_group_scaling_config = { desired_size = 5, max_size = 5, min_size = 1 } # Choose depending on your requirements by considering the CPU requirements
eks_node_disk_size = 100
- Configure S3 for Cluster State: Open obsrv.conf and update your AWS credentials and bucket names (a note on creating the state bucket follows this list).
AWS_ACCESS_KEY_ID=<your_access_key_id>
AWS_SECRET_ACCESS_KEY=<your_secret_access_key>
AWS_DEFAULT_REGION="us-east-2"
KUBE_CONFIG_PATH="$HOME/.kube/obsrv-kube-config.yaml"
AWS_TERRAFORM_BACKEND_BUCKET_NAME="obsrv-tfstate"
AWS_TERRAFORM_BACKEND_BUCKET_REGION="us-east-2"
- Make the Script Executable:
chmod +x ./obsrv.sh
- Run the Installation: To start the installation, run the script:
./obsrv.sh install --config ./obsrv.conf --install_dependencies false
If you want the installer to automatically handle dependencies, set install_dependencies=true.
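A note on the Terraform state bucket referenced in obsrv.conf above: if the bucket named in AWS_TERRAFORM_BACKEND_BUCKET_NAME does not already exist, it can be created ahead of the install with the AWS CLI (a hedged sketch using the example names above; check whether your version of obsrv.sh already creates it for you):

aws s3api create-bucket --bucket obsrv-tfstate --region us-east-2 --create-bucket-configuration LocationConstraint=us-east-2
aws s3api put-bucket-versioning --bucket obsrv-tfstate --versioning-configuration Status=Enabled   # keep prior state versions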
Once the installation completes, verify that your Kubernetes cluster is up and running:
kubectl get nodes
This should show the nodes in your Kubernetes cluster.
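If you prefer to block until every node is schedulable, kubectl wait can be used against the kubeconfig written by the installer (path as configured in obsrv.conf):

export KUBECONFIG="$HOME/.kube/obsrv-kube-config.yaml"
kubectl wait --for=condition=Ready nodes --all --timeout=300s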
cd ./obsrv-automation/helmcharts/
Modify global-cloud-values-aws.yaml with the appropriate values for your environment:
global:
  cloud_storage_provider: "aws"
  cloud_store_provider: "s3"
  cloud_storage_region: "<region>"
  dataset_api_cloud_bucket: "<dataset_bucket_name>"
  config_api_cloud_bucket: "<config_bucket_name>"
  postgresql_backup_cloud_bucket: "<backup_bucket_name>"
  redis_backup_cloud_bucket: "<redis_backup_bucket_name>"
  velero_backup_cloud_bucket: "<velero_backup_bucket_name>"
  cloud_storage_bucket: "<storage_bucket_name>"
  hudi_metadata_bucket: "s3a://<hudi_bucket_name>/hudi"
  cloud_storage_config: |
    '{"identity":"<access-key>","credential":"<secret-key>","region":"<region-name>"}'
  storage_class_name: "gp2"
  checkpoint_bucket: "s3://<checkpoint-bucket-name>"
  s3_access_key: "<aws-access-key>"
  s3_secret_key: "<aws-secret-key>"
  kong_annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-eip-allocations: "<elastic-ip>"
    service.beta.kubernetes.io/aws-load-balancer-subnets: "<subnet-id>"
  service_accounts:
    enabled: true
    secor:
      eks.amazonaws.com/role-arn: "<role-arn>"
    dataset_api:
      eks.amazonaws.com/role-arn: "<role-arn>"
    config_api:
      eks.amazonaws.com/role-arn: "<role-arn>"
    druid_raw:
      eks.amazonaws.com/role-arn: "<role-arn>"
    flink:
      eks.amazonaws.com/role-arn: "<role-arn>"
    postgresql_backup:
      eks.amazonaws.com/role-arn: "<role-arn>"
    redis_backup:
      eks.amazonaws.com/role-arn: "<role-arn>"
    s3_exporter:
      eks.amazonaws.com/role-arn: "<role-arn>"
    spark:
      eks.amazonaws.com/role-arn: "<role-arn>"
velero-backup:
  credentials:
    useSecret: true
    secretContents:
      cloud: |
        [default]
        aws_access_key_id="<aws-access-key>"
        aws_secret_access_key="<aws-secret-key>"
trino:
  additionalCatalogs:
    lakehouse: |-
      connector.name=hudi
      hive.metastore.uri=thrift://hudi-hms.hms.svc:9083
      hive.s3.aws-access-key=<aws-access-key>
      hive.s3.aws-secret-key=<aws-secret-key>
      hive.s3.ssl.enabled=false
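The kong_annotations section above expects an Elastic IP allocation ID and a subnet ID. If you need to look these up, the AWS CLI can list the Elastic IP allocations in your account (the query is illustrative; adjust the region as needed):

aws ec2 describe-addresses --region <aws-region> --query "Addresses[].{AllocationId:AllocationId,PublicIp:PublicIp}" --output table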
In global-values.yaml, replace <domain> with your actual domain or Elastic IP:
domain: "<domain>.sslip.io"
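sslip.io resolves any hostname that embeds an IP address back to that address, so no DNS record is required when using the Elastic IP. For example, using a documentation IP in place of your own:

dig +short 203.0.113.10.sslip.io   # resolves to 203.0.113.10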
Follow these steps to generate and configure the private and public keys for the web console and dataset API:
Run the following command to generate a private key for the web console:
openssl genpkey -algorithm RSA -out private_key.pem -pkeyopt rsa_keygen_bits:2048
- Open the generated private_key.pem file.
- Copy its contents and update the USER_TOKEN_PRIVATE_KEY field in the following file:
obsrv-automation/helmcharts/services/web-console/values.yaml
Example:
USER_TOKEN_PRIVATE_KEY: |-
<paste-private-key-here>
Using the private key generated above, create a public key with the following command:
openssl rsa -pubout -in private_key.pem -out public_key.pem
- Open the generated public_key.pem file.
- Copy its contents and update the user_token_public_key field in the following file:
obsrv-automation/helmcharts/services/dataset-api/values.yaml
Example:
user_token_public_key: <paste-public-key-here>
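To confirm that the public key you pasted really corresponds to the private key configured for the web console, re-derive it and compare; diff prints nothing when the two match:

openssl rsa -in private_key.pem -pubout | diff - public_key.pem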
Make the script executable, set the environment variables, and run the installation:
export cloud_env=aws
export AWS_ACCESS_KEY_ID=<aws-access-key>
export AWS_SECRET_ACCESS_KEY=<aws-secret-key>
export AWS_DEFAULT_REGION=<aws-region>
export KUBE_CONFIG_PATH="$HOME/.kube/obsrv-kube-config.yaml"
export KUBECONFIG="$HOME/.kube/obsrv-kube-config.yaml"
chmod +x ./kitchen/install.sh
./kitchen/install.sh all
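When the script finishes, the deployed Helm releases can be listed to confirm everything installed (this uses the kubeconfig exported above):

helm list -A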
After completing the installation, follow these steps to verify that all components are running correctly:
- Verify all pods are running:
kubectl get pods -A
All pods should be in the Running state. Common namespaces to check:
  - flink: Core Pipeline
  - monitoring: Monitoring stack
  - dataset-api: Dataset APIs
  - web-console: Dataset Management console
- Check Services:
kubectl get svc -A
Verify that essential services have external IPs assigned, particularly the Kong service.
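To print just the load balancer endpoint that fronts the installation, a jsonpath query can be used (a hedged example; the Kong service name and namespace may differ in your release):

kubectl get svc <kong-service-name> -n <kong-namespace> -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'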
If any component fails these checks, refer to the component-specific logs:
kubectl logs -f <pod-name> -n <namespace>
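kubectl describe and the namespace events often explain pending or crash-looping pods before the logs do:

kubectl describe pod <pod-name> -n <namespace>
kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp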
- Pull the Latest Code:
cd ./obsrv-automation
git pull
cd ./automation-scripts/infra-setup
- Update Configurations: Review and update configuration values as needed.
- Run Terraform for Upgrade:
./obsrv.sh install --config ./obsrv.conf --install_dependencies false
- Upgrade with Updated Cloud Values:
export cloud_env=aws
export AWS_ACCESS_KEY_ID=<aws-access-key>
export AWS_SECRET_ACCESS_KEY=<aws-secret-key>
export AWS_DEFAULT_REGION=<aws-region>
export KUBE_CONFIG_PATH="$HOME/.kube/obsrv-kube-config.yaml"
export KUBECONFIG="$HOME/.kube/obsrv-kube-config.yaml"
chmod +x ./kitchen/install.sh
./kitchen/install.sh all
By following these steps, you will ensure a successful installation and configuration of Obsrv on AWS.