The AWS control plane contains a rich set of information that can be operationally very useful! Have you ever found yourself wishing you had the output of an AWS SDK command to enrich your dashboards or alerts?
This project aims to make that easy by using a simple YAML configuration file to create a container that can be scraped by Prometheus, producing metrics from AWS SDK command outputs!
Some common use cases:
- Use the `DescribeSubnets` command to retrieve the remaining number of IP addresses in a subnet.
- Use the `DescribeFileSystems` command for FSx for Lustre to determine the current state of a file system.
- Use the `ListServiceQuotas` command to determine how many ECS Tasks you're permitted to run in your account.
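Each of these use cases boils down to a short YAML entry. As a sketch of the first one (field names are taken from the configuration example later in this guide; the included `subnet-remaining-ips.yaml` may differ in detail):

```yaml
metricPrefix: awssdk
metrics:
  - frequency: 1                       # polling frequency (the included examples use 1)
    metricName: "remaining_ips"
    metricHelp: Remaining IP Addresses for Subnet
    sdkLibrary: "@aws-sdk/client-ec2"  # JavaScript SDK v3 package for EC2
    sdkClientName: "EC2Client"
    sdkCommand: "DescribeSubnetsCommand"
    # AvailableIpAddressCount is the per-subnet field DescribeSubnets returns
    gaugeValue: "$.Subnets[*].AvailableIpAddressCount"
```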
The first option is to simply build a container image that you can deploy in whatever manner you like.
This container image will start a Node/Express server listening on port 4000 and will publish Prometheus metrics to the /metrics endpoint.
This provides flexibility to use anything from EKS to ECS, to whatever container platform you choose!
Note that the container will need access to an AWS IAM Role, either through metadata or environment variables, to function!
The second option is full deployment to an EKS cluster.
This option uses the AWS CDK to deploy an IAM Role that your EKS Pods can use via IRSA (IAM Roles for Service Accounts).
Finally, it will create the Kubernetes Deployment on your behalf using the Kubernetes SDK.
All the required development libraries and tools are available in an AWS Cloud9 developer environment.
To get started with Cloud9, follow this getting started guide here.
Cloud9 is the easiest path forward for those who prefer not to install and configure these tools locally!
If you wish to configure your local environment, then continue below.
Install the AWS CLI as outlined here.
Install Node.js. Any version between 14 and 18 should work fine. You can find instructions here.
Install the `make` command. Running `make -v` should produce output.
Install a Docker client suitable for your operating system. Running `docker -v` should produce output.
Install the `kubectl` command suitable for your operating system. You can find instructions here.
Install required libraries by running this command:
npm install
Configure the AWS CLI with credentials for your AWS Account (you can skip this step in Cloud9). Instructions here
Configure the kubectl command to point to your EKS Cluster. Instructions here
Follow the steps outlined here:
- Create a Cloud9 Environment following all steps in the EKS Workshop section 'Start the Workshop...' here: https://www.eksworkshop.com/020_prerequisites/
- Create an EKS Cluster following all steps in the EKS Workshop section 'Launch using eksctl' section here: https://www.eksworkshop.com/030_eksctl/
- Configure / Install Amazon Managed Prometheus (AMP) on the cluster above by following the quickstart guide here: https://aws.amazon.com/blogs/mt/getting-started-amazon-managed-service-for-prometheus/
- Configure Amazon Managed Grafana with the AMP environment above as a data-source. Quickstart guide here: https://docs.aws.amazon.com/grafana/latest/userguide/getting-started-with-AMG.html
Once you've completed the setup steps outlined in Setup Our Build Environment, we can start building!
Let's begin by using one of the included configuration files and run it locally to test that things work.
Run the command:
make config=subnet-remaining-ips.yaml runLocal
This will parse the configuration file and ensure it is well-formed. It will install the required SDK libraries and run a Node server locally on port 4000 that will begin producing metrics!
Your output will look something like this:
...
Verifying values in our configuration file are correct. . .
No configuration errors present in 'subnet-remaining-ips.yaml'! Recommend verifying using runLocal before deployment.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ Running the node.js server locally. Connect on http://localhost:4000/metrics @@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
node build/executor/lib/index.js
server running on port 4000
After about a minute, you'll see a message like this:
Successfully executed command for metric remaining_ips. Converting to Prometheus metric values.
Now open a browser to http://localhost:4000/metrics to see the output!
This will look something like this:
# HELP awscli_remaining_ips Remaining IP Addresses for Subnet
# TYPE awscli_remaining_ips gauge
awscli_remaining_ips{subnet_id="subnet-abcd1234",availability_zone="us-east-1f",availability_zone_id="use1-az5"} 8126
awscli_remaining_ips{subnet_id="subnet-efgh5678",availability_zone="us-east-1c",availability_zone_id="use1-az6"} 8187
You should always test your configuration file before building or deploying a container. This verifies your configuration is well-formed and working!
We've verified that our configuration file runs and collects the metrics we expect! Now let's build a Docker image.
make config=subnet-remaining-ips.yaml docker
Your output will look something like this:
...
=> [2/5] WORKDIR /app 0.1s
=> [3/5] COPY ./build/ /app/ 0.1s
=> [4/5] COPY ./config/ /app/config/ 0.0s
=> [5/5] RUN npm install 35.7s
=> exporting to image 4.8s
=> => exporting layers 4.8s
=> => writing image sha256:f5e09affe74392864c61ecab1c775910151778f793af546e65fea8cc2a2e12f9 0.0s
 => => naming to docker.io/library/aws-sdk-prom-metrics:latest
A Docker image will exist locally named `aws-sdk-prom-metrics:latest` that you can now publish to a Docker registry for deployment!
If you choose to use EKS, continue with deploy to EKS. If you're using another container orchestration environment, you will need to run the built container in that system. It will require access to an AWS principal (role/user) with permissions to run the SDK calls you've configured. Once started, Prometheus will need to scrape metrics over HTTP on port 4000 at the `/metrics` URL.
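For a non-EKS deployment, the scrape itself needs nothing special. A minimal Prometheus scrape configuration might look like the sketch below (the job name and target host are placeholders, not part of this project):

```yaml
scrape_configs:
  - job_name: aws-sdk-prom-metrics   # illustrative job name
    metrics_path: /metrics           # the path the container serves
    static_configs:
      - targets:
          - my-container-host:4000   # replace with wherever your container runs
```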
Make sure you've tested your configuration file and built a Docker container.
Create an ECR repository in the region and account where you're running your EKS cluster. You can choose any name you like as long as you update your configuration file below. For our example, let's make a new one called `aws-sdk-prom-metrics`. Instructions are here.
Now sign your Docker environment into your private registry following these instructions.
After pushing your image, the container image we built above will be available in your registry.
The commands below will do this for you. Update the CONFIG_FILE variable to match the configuration file you're building and pushing.
export CONFIG_FILE=subnet-remaining-ips.yaml
aws ecr create-repository --repository-name aws-sdk-prom-metrics
export REPOSITORY_URI=$(aws ecr describe-repositories --repository-names aws-sdk-prom-metrics | jq -r '.repositories[0].repositoryUri')
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin ${REPOSITORY_URI}
make config=${CONFIG_FILE} docker
docker tag aws-sdk-prom-metrics:latest ${REPOSITORY_URI}:latest
docker push ${REPOSITORY_URI}
The included example configuration files in the `config` directory all include commented-out information relating to deployment.
Uncomment / add the deployment section of the configuration like so:
deploymentConfig:
# Create / Determine your OIDC provider. See: https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html
oidcProvider: oidc.eks.[region].amazonaws.com/id/[identifier]
# The namespace that will be used for our metric gathering container. It will be created if it doesn't exist.
namespace: aws-sdk-prom-metrics
# Your AWS Account ID. Used while constructing the roles trust information.
awsAccountId: "012345678910"
  # The URI of the container image you built with 'make docker' and pushed to your registry.
imageUri: 012345678910.dkr.ecr.[region].amazonaws.com/aws-sdk-prom-metrics:label
# The Service Account Name that will be created in the EKS cluster.
serviceAccountName: aws-sdk-prom-metrics
You must update `oidcProvider`, `awsAccountId`, and `imageUri`. Ensure `awsAccountId` remains in quotes even though it's numeric!
Uncomment / add the required permissions for each of the metrics you're collecting. These permissions are used to create the role. For example:
iamPermissions:
actions:
- ec2:DescribeSubnets
resources:
# You can limit this scope to specific Subnet resource ARNs if they are known up-front
- '*'
In YAML the `*` needs to be quoted as shown: `'*'`.
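For reference, the EFS example built later in this guide would need a permissions block along these lines (a sketch; note that many EC2/EFS Describe and List APIs only support a wildcard resource):

```yaml
iamPermissions:
  actions:
    - elasticfilesystem:DescribeFileSystems
  resources:
    # Most Describe*/List* actions don't support resource-level scoping,
    # so the quoted wildcard '*' is usually required here.
    - '*'
```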
This tool can use the AWS CDK to deploy a correctly formed role and policy for you automatically.
NOTE: the default role name is `aws-sdk-prom-metrics-role` and the default CloudFormation stack name is `aws-sdk-prom-metrics-role-stack`.
- To specify a different role name, add `iamRoleName` with your chosen value to the `deploymentConfig` in your configuration file.
- To specify a different CloudFormation stack name, add `stackName` with your chosen value to the `deploymentConfig` in your configuration file.
Deploy the role by running:
make config=subnet-remaining-ips.yaml deployRole
You should see output similar to this showing that CloudFormation (by way of the CDK) is deploying a role to your account!
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ Using the CDK to deploy the IRSA IAM Role @@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
cdk deploy --require-approval=never
✨ Synthesis time: 6.36s
aws-sdk-prom-metrics-role-stack: deploying...
aws-sdk-prom-metrics-role-stack: creating CloudFormation changeset...
✅ aws-sdk-prom-metrics-role-stack
✨ Deployment time: 34.1s
A benefit of using the CDK here is that updates to your configuration file can be handled via a stack update rather than needing to destroy and rebuild.
If you add more metrics, or make changes that require an update to your IAM permissions, simply re-run the `deployRole` command as shown above!
This will create a new 'Deployment' in the namespace you've specified in your configuration file, using the IRSA role we created earlier.
Run the command:
make config=subnet-remaining-ips.yaml deployEks
You'll see output similar to this:
...
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ Using the kubectl API to deploy the namespace and metric-gather deployment @@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
npm run deployEks
> aws-sdk-prom-metrics@0.0.1 deployEks
> node build/deployer/lib/deployEks.js
Creating namespace 'aws-sdk-prom-metrics'
Creating Service Account named 'aws-sdk-prom-metrics'.
Creating Deployment named 'aws-sdk-prom-metrics-collector-deployment'
After a few moments, list the resources in the namespace and you should see everything running!
kubectl get all -n aws-sdk-prom-metrics
Which should produce an output like:
NAME READY STATUS RESTARTS AGE
pod/aws-sdk-prom-metrics-collector-deployment-879dbccb7-z6fsr 1/1 Running 0 43s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/aws-sdk-prom-metrics-collector-deployment 1/1 1 1 11m
NAME DESIRED CURRENT READY AGE
replicaset.apps/aws-sdk-prom-metrics-collector-deployment-879dbccb7 1 1 1 11m
You can also verify by port-forwarding to port 4000 that your `/metrics` URL produces results!
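One way to check, assuming the namespace and Deployment names shown in the `deployEks` output above, is to port-forward and curl the endpoint:

```shell
# Forward local port 4000 to the collector Deployment created earlier
kubectl port-forward -n aws-sdk-prom-metrics \
  deployment/aws-sdk-prom-metrics-collector-deployment 4000:4000 &

# Give the forward a moment to establish, then fetch the metrics
sleep 2
curl -s localhost:4000/metrics | head
```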
Search within Prometheus for metrics starting with `awssdk` and you should see your new metric flowing.
If you have a novel use-case that exists outside the examples, you will need to create your own configuration file.
Once you know the command you need to run, you'll need to identify the name of the JavaScript AWS SDK v3 library to use.
Start with the documentation here
The Clients list in the sidebar gives the value to use for `sdkLibrary` in the configuration file.
As an example, if I were going to query an EFS file system to determine the provisioned throughput, I'd use the value `@aws-sdk/client-efs`.
Clicking on the SDK Library above will bring you to a details page. Like this for EFS.
The name of the client library used is our next parameter. The SDK libraries usually list two in the top-right sidebar.
One client is usually a 'v2'-compatible style, and the other is 'v3'. We want the 'v3' one for our configuration file; it usually ends in `Client`.
Continuing with our EFS example, we can see two clients: one called `EFS` and the other called `EFSClient`.
In this case we would use the value `EFSClient` for `sdkClientName` in our configuration file!
Now we need to find the SDK command to run. The commands are listed in the sidebar. We can ignore the ones that end in `Input` and `Output` and focus on the ones that end in `Command`.
Since I want to describe my file systems, I search for 'describefilesystems' and find the `DescribeFileSystemsCommand`.
In this example I would use `DescribeFileSystemsCommand` for `sdkCommand` in our configuration file.
Clicking the `DescribeFileSystemsCommand` link, I arrive at its details page.
Here I have links to `DescribeFileSystemsCommandInput` and `DescribeFileSystemsCommandOutput`. I click the link for `DescribeFileSystemsCommandOutput`.
Here I see the response will contain metadata and a `FileSystemDescription[]`. Clicking the link to `FileSystemDescription`, I finally see what I'm after: `ProvisionedThroughputInMibps`.
Clicking the link to `ProvisionedThroughputInMibps`, I can see the value's type is `undefined | number`, meaning I can use it as a Gauge value.
Sometimes this can be challenging, but using the AWS CLI can be helpful.
aws efs describe-file-systems --output json
This produces the same JSON document that our command will be interrogating, which can make it a little easier to visualize:
{
"FileSystems": [
{
"OwnerId": "01234567890",
"CreationToken": "abcd-efgh",
"FileSystemId": "fs-abcdefgh",
"FileSystemArn": "arn:aws:elasticfilesystem:us-east-2:01234567890:file-system/fs-abcdefgh",
"CreationTime": "2022-06-02T09:33:06-04:00",
"LifeCycleState": "available",
"NumberOfMountTargets": 1,
"SizeInBytes": {
"Value": 28435683328,
"Timestamp": "2022-07-18T10:20:58-04:00",
"ValueInIA": 0,
"ValueInStandard": 28435683328
},
"PerformanceMode": "maxIO",
"Encrypted": false,
"ThroughputMode": "provisioned",
"ProvisionedThroughputInMibps": 1024.0,
"Tags": []
}
]
}
There are JSONPath tutorials and 'testers' on the web that will let you paste your JSON output and test expressions until you get the data you want. Just be careful not to paste any sensitive information into these sites!
I arrive at the following JSONPath expression to get the value I'm interested in: `$.FileSystems[*].ProvisionedThroughputInMibps`.
The value of `$.FileSystems[*].ProvisionedThroughputInMibps` will be used as my `gaugeValue` in my configuration file.
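If you have jq installed (it's already used in the ECR steps above), you can sanity-check the extraction locally. The jq filter below is my rough analogue of that JSONPath expression, run against a trimmed sample of the output shown above:

```shell
# A trimmed sample of the DescribeFileSystems output from above
json='{"FileSystems":[{"FileSystemId":"fs-abcdefgh","ProvisionedThroughputInMibps":1024.0}]}'

# jq analogue of $.FileSystems[*].ProvisionedThroughputInMibps
echo "$json" | jq '.FileSystems[].ProvisionedThroughputInMibps'
```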
Now the configuration file I've built looks like this:
metricPrefix: awssdk
metrics:
- frequency: 1
metricName: "efs_provisioned_throughput"
metricHelp: Provisioned Throughput of our EFS Filesystems in MiB/s
sdkLibrary: "@aws-sdk/client-efs"
sdkClientName: "EFSClient"
sdkCommand: "DescribeFileSystemsCommand"
gaugeValue: "$.FileSystems[*].ProvisionedThroughputInMibps"
I write this to a file named `testing.yaml` in my `config` directory and test it as outlined under Testing above.
make config=testing.yaml runLocal
...
npm run validate
> aws-sdk-prom-metrics@0.0.1 validate
> node build/validator/lib/validate.js
Verifying values in our configuration file are correct. . .
No configuration errors present in 'testing.yaml'! Recommend verifying using runLocal before deployment.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ Running the node.js server locally. Connect on http://localhost:4000/metrics @@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
node build/executor/lib/index.js
server running on port 4000
Running command for metric efs_provisioned_throughput
Successfully executed command for metric efs_provisioned_throughput. Converting to Prometheus metric values.
Verify we got some data from `/metrics`:
curl localhost:4000/metrics
...
# HELP awssdk_efs_provisioned_throughput Provisioned Throughput of our EFS Filesystems in MiB/s
# TYPE awssdk_efs_provisioned_throughput gauge
awssdk_efs_provisioned_throughput 1024
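Once the metric is flowing, you can alert on it like any other gauge. A hypothetical Prometheus alerting rule (the group name, alert name, and threshold are all illustrative):

```yaml
groups:
  - name: aws-sdk-prom-metrics-examples    # illustrative group name
    rules:
      - alert: EfsThroughputBelowExpected  # hypothetical alert
        # Fire if provisioned throughput drops below an expected 1024 MiB/s
        expr: awssdk_efs_provisioned_throughput < 1024
        for: 5m
        annotations:
          summary: "EFS provisioned throughput is below 1024 MiB/s"
```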
During testing you will get parsing / validation errors if your JSONPath expression isn't correctly formed and/or the SDKs you've referenced don't exist. The tool does its very best to help you arrive at a syntax that works!
It's worth noting that some commands show their input as optional (i.e. `string | undefined`); however, when executed they will error out stating they require an input. In this case the configuration file will pass validation but will fail testing with `runLocal`. Define the correct `sdkCommandInput` in the configuration file to address this.
Please create a pull request with your commented configuration file that describes your use-case for others to benefit from!
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.