feat: automate performance benchmarking #2

Closed
wants to merge 15 commits into from
122 changes: 122 additions & 0 deletions .github/workflows/perf.yml
@@ -0,0 +1,122 @@
name: libp2p perf test

# How to configure a repository for running this workflow:
# 1. Run 'make ssh-keygen' in 'perf' to generate a new SSH key pair named 'user' in 'perf/terraform/region/files'
# 2. Export AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for the account of your choice
Owner

Suggested change
# 2. Export AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for the account of your choice
# 2. Configure your AWS credentials, e.g. by exporting AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY or writing `~/.aws/credentials` for the account of your choice

Either should still work, right?

Author

Yes, of course. I'll link to https://registry.terraform.io/providers/hashicorp/aws/latest/docs#authentication-and-configuration instead, which lists all the options.

# 3. Run 'terraform apply' in 'perf/terraform' to create the AWS resources
# 4. Run 'terraform output' in 'perf/terraform' to get the bucket name
# 5. Go to https://console.aws.amazon.com/iamv2/home?#/users/details/perf?section=security_credentials
# 6. Click 'Create access key' to get the access key ID and secret access key
# 7. Go to https://github.com/libp2p/test-plans/settings/secrets/actions
# 8. Click 'New repository secret', set the name to 'PERF_AWS_SECRET_ACCESS_KEY', and paste the secret access key from step 6
# 9. Click 'New repository secret', set the name to 'PERF_SSH_PRIVATE_KEY', and paste the private key from step 1
# 10. Go to https://github.com/libp2p/test-plans/settings/variables/actions
# 11. Click 'New repository variable', set the name to 'PERF_AWS_ACCESS_KEY_ID', and paste the access key ID from step 6
# 12. Click 'New repository variable', set the name to 'PERF_AWS_BUCKET', and paste the bucket name from step 4

on:
  workflow_dispatch:
  workflow_call:
    # Example:
    #   uses: libp2p/test-plans/.github/workflows/perf.yml@master
    #   with:
    #     aws-access-key-id: ${{ vars.PERF_AWS_ACCESS_KEY_ID }}
    #     aws-bucket: ${{ vars.PERF_AWS_BUCKET }}
    #     ref: master
    #   secrets:
    #     PERF_AWS_SECRET_ACCESS_KEY: ${{ secrets.PERF_AWS_SECRET_ACCESS_KEY }}
    #     PERF_SSH_PRIVATE_KEY: ${{ secrets.PERF_SSH_PRIVATE_KEY }}
    inputs:
      aws-access-key-id:
        type: string
        required: true
        description: The AWS access key ID of 'aws_iam_user.perf'
      aws-bucket:
        type: string
        required: true
        description: The AWS bucket as output by 'output.bucket_name'
      ref:
        type: string
        required: false
        description: The ref of the test-plans repo to use (defaults to 'master')
        default: master
    secrets:
      PERF_AWS_SECRET_ACCESS_KEY: # The AWS secret access key of 'aws_iam_user.perf'
        required: true
      PERF_SSH_PRIVATE_KEY: # The SSH private key for 'aws_key_pair.perf'
        required: true

jobs:
  perf:
    name: Perf
    runs-on: ubuntu-latest
    timeout-minutes: 40
    defaults:
      run:
        shell: bash
        working-directory: perf
    env:
      AWS_ACCESS_KEY_ID: ${{ inputs.aws-access-key-id || vars.PERF_AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.PERF_AWS_SECRET_ACCESS_KEY }}
    steps:
      - name: Configure SSH
        uses: webfactory/ssh-agent@d4b9b8ff72958532804b70bbe600ad43b36d5f2e # v0.8.0
        with:
          ssh-private-key: ${{ secrets.PERF_SSH_PRIVATE_KEY }}
Comment on lines +62 to +65
Owner

I guess with the move to AWS launch templates there is no easy way to use ephemeral SSH keys? Ephemeral keys would eliminate the need to manage long-lived credentials. Please ignore this if ephemeral keys would add more complexity.

Author

We could keep generating keys on the fly. We would still have to manage the long-lived AWS credentials, and if SSH keys are ephemeral, those AWS credentials have to be allowed to create AWS key pairs.

I'll add instructions on what it'd take to switch between the two next to the aws_key_pair resource and we can decide then. I don't feel strongly either way.

      - name: Checkout test-plans
        uses: actions/checkout@v3
        with:
          repository: libp2p/test-plans
          ref: ${{ inputs.ref || github.ref }}
      - id: server
        name: Provision server
        run: echo "id=$(make provision-server | tail -n 1)" >> $GITHUB_OUTPUT
      - id: client
        name: Provision client
        run: echo "id=$(make provision-client | tail -n 1)" >> $GITHUB_OUTPUT
      - id: ip
        name: Wait for client/server IP
        env:
          SERVER_ID: ${{ steps.server.outputs.id }}
          CLIENT_ID: ${{ steps.client.outputs.id }}
        run: |
          read SERVER_IP CLIENT_IP <<< $(make wait SERVER_ID=$SERVER_ID CLIENT_ID=$CLIENT_ID | tail -n 1)
          echo "server=$SERVER_IP" >> $GITHUB_OUTPUT
          echo "client=$CLIENT_IP" >> $GITHUB_OUTPUT
      - name: Download dependencies
        run: npm ci
        working-directory: perf/runner
      - name: Run tests
        env:
          SERVER_IP: ${{ steps.ip.outputs.server }}
          CLIENT_IP: ${{ steps.ip.outputs.client }}
        run: npm run start -- --client-public-ip $CLIENT_IP --server-public-ip $SERVER_IP
        working-directory: perf/runner
      - name: Archive results
        uses: actions/upload-artifact@v2
        with:
          name: results
          path: perf/runner/benchmark-results.json
      - id: s3
        name: Upload results
        env:
          AWS_BUCKET: ${{ inputs.aws-bucket || vars.PERF_AWS_BUCKET }}
          AWS_BUCKET_PATH: ${{ github.repository }}/${{ github.run_id }}/${{ github.run_attempt }}/benchmark-results.json
        run: |
          aws s3 cp benchmark-results.json s3://$AWS_BUCKET/$AWS_BUCKET_PATH --acl public-read --region us-west-2
          echo "url=https://$AWS_BUCKET.s3.amazonaws.com/$AWS_BUCKET_PATH" >> $GITHUB_OUTPUT
        working-directory: perf/runner
Comment on lines +95 to +108
Owner

I don't think these steps are necessary. See comment above.

Author

Sure! I'll replace it with a push.
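
For reference, here is a minimal Python sketch of how the "Upload results" step composes its destination, assuming the key layout `<repository>/<run_id>/<run_attempt>/benchmark-results.json` taken from `AWS_BUCKET_PATH`. The function name `results_location` and the bucket name `example-bucket` are hypothetical and not part of this PR.

```python
from typing import Tuple


def results_location(
    bucket: str,
    repository: str,
    run_id: int,
    run_attempt: int,
    filename: str = "benchmark-results.json",
) -> Tuple[str, str]:
    """Return (s3_uri, https_url) for the results file of one workflow run."""
    # Mirrors AWS_BUCKET_PATH: repository / run id / run attempt / file name.
    key = f"{repository}/{run_id}/{run_attempt}/{filename}"
    s3_uri = f"s3://{bucket}/{key}"
    https_url = f"https://{bucket}.s3.amazonaws.com/{key}"
    return s3_uri, https_url


# Hypothetical values, for illustration only.
s3_uri, https_url = results_location("example-bucket", "libp2p/test-plans", 123, 1)
print(s3_uri)
print(https_url)
```

Including the run attempt in the key means a re-run of the same workflow run writes to a new object instead of overwriting the earlier result.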

      - name: Set summary
        env:
          URL: ${{ steps.s3.outputs.url }}
        run: echo "$URL" >> $GITHUB_STEP_SUMMARY
      - name: Deprovision client
        if: always() && steps.client.outputs.id != ''
        env:
          CLIENT_ID: ${{ steps.client.outputs.id }}
        run: make deprovision-client CLIENT_ID=$CLIENT_ID
      - name: Deprovision server
        if: always() && steps.server.outputs.id != ''
        env:
          SERVER_ID: ${{ steps.server.outputs.id }}
        run: make deprovision-server SERVER_ID=$SERVER_ID
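
The workflow pipes `make` output through `tail -n 1` because `make` echoes each recipe command before running it, so only the last line carries the value of interest. The following Python sketch shows the same parsing for the `wait` target under that assumption. The function name `parse_wait_output` is hypothetical, and the IP addresses come from documentation-only ranges. Unlike `tail -n 1` plus `read`, this sketch skips trailing blank lines and rejects a last line that does not contain exactly two fields.

```python
from typing import Tuple


def parse_wait_output(output: str) -> Tuple[str, str]:
    """Extract (server_ip, client_ip) from the output of `make wait`."""
    # Keep only non-empty lines; the last one is expected to hold both IPs.
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("make wait produced no output")
    fields = lines[-1].split()
    if len(fields) != 2:
        raise ValueError(f"expected two IP addresses, got: {lines[-1]!r}")
    server_ip, client_ip = fields
    return server_ip, client_ip


# Hypothetical output: echoed recipe lines followed by the two addresses.
sample = (
    'aws ec2 wait instance-running --region us-west-2 --instance-ids "i-0abc"\n'
    'aws ec2 wait instance-running --region us-east-1 --instance-ids "i-0def"\n'
    "203.0.113.10 198.51.100.20\n"
)
print(parse_wait_output(sample))
```

Failing early on a malformed last line avoids passing an empty or partial address to the runner.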
26 changes: 26 additions & 0 deletions perf/Makefile
@@ -0,0 +1,26 @@
ssh-keygen:
	ssh-keygen -t ed25519 -f ./terraform/region/files/user -N ''

ssh-add:
	ssh-add ./terraform/region/files/user

provision-server:
	aws ec2 run-instances --region=us-west-2 --launch-template LaunchTemplateName=perf-node --count 1 --query 'Instances[0].InstanceId' --output text

provision-client:
	aws ec2 run-instances --region=us-east-1 --launch-template LaunchTemplateName=perf-node --count 1 --query 'Instances[0].InstanceId' --output text

wait:
	aws ec2 wait instance-running --region us-west-2 --instance-ids "$(SERVER_ID)"
	aws ec2 wait instance-running --region us-east-1 --instance-ids "$(CLIENT_ID)"
	echo "$$(aws ec2 describe-instances --region us-west-2 --instance-ids "$(SERVER_ID)" --query 'Reservations[0].Instances[0].PublicIpAddress' --output text)" "$$(aws ec2 describe-instances --region us-east-1 --instance-ids "$(CLIENT_ID)" --query 'Reservations[0].Instances[0].PublicIpAddress' --output text)"

deprovision-server:
	aws ec2 terminate-instances --region=us-west-2 --instance-ids "$(SERVER_ID)"

deprovision-client:
	aws ec2 terminate-instances --region=us-east-1 --instance-ids "$(CLIENT_ID)"

# https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html
scale-down:
	sam local invoke ScaleDown --template terraform/common/files/scale_down.yml --event terraform/common/files/scale_down.json
Comment on lines +24 to +26
Owner

Where is this Make target called?

Author

I used it for testing the lambda code. I'll move this next to where all the other lambda files are and add a more descriptive comment.

38 changes: 34 additions & 4 deletions perf/README.md
@@ -10,21 +10,51 @@ Benchmark results can be visualized with https://observablehq.com/@mxinden-works

## Provision infrastructure

1. `cd terraform`
2. Save your public SSH key as the file `./user.pub`.
### Bootstrap
Owner

Suggested change
### Bootstrap
### Bootstrap (long-lived resources)


1. Save your public SSH key as the file `./terraform/region/files/user.pub`; or generate a new key pair with `make ssh-keygen` and add it to your SSH agent with `make ssh-add`.
2. `cd terraform`
3. `terraform init`
4. `terraform apply`

#### [OPTIONAL] Limited AWS credentials

If you want to limit the AWS credentials used by subsequent steps, you can create access keys for the `perf` user that Terraform created.

1. Go to https://console.aws.amazon.com/iamv2/home?#/users/details/perf?section=security_credentials
2. Create access key
3. Download `perf_accessKeys.csv`
4. Configure AWS CLI to use the credentials. For example:
```bash
export AWS_ACCESS_KEY_ID=$(cat perf_accessKeys.csv | tail -n 1 | cut -d, -f1)
export AWS_SECRET_ACCESS_KEY=$(cat perf_accessKeys.csv | tail -n 1 | cut -d, -f2)
```
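
The shell snippet above takes the last line of `perf_accessKeys.csv` and reads the access key ID from the first column and the secret access key from the second. A minimal Python sketch of the same extraction follows, assuming that column order. The function name `read_access_key` and the sample values are hypothetical.

```python
import csv
import io
from typing import Tuple


def read_access_key(csv_text: str) -> Tuple[str, str]:
    """Return (access_key_id, secret_access_key) from the last CSV row."""
    # Skip empty rows so a trailing newline does not hide the data row.
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        raise ValueError("CSV contains no rows")
    last = rows[-1]
    if len(last) < 2:
        raise ValueError("expected at least two columns in the last row")
    return last[0].strip(), last[1].strip()


# Made-up content in the assumed layout: one header row, one data row.
sample_csv = "Access key ID,Secret access key\nAKIAEXAMPLEID,exampleSecretValue\n"
key_id, secret = read_access_key(sample_csv)
print(key_id)
```

Using the `csv` module instead of splitting on commas keeps the parsing correct if a field is ever quoted.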

### Nodes
Owner

Suggested change
### Nodes
### Start nodes (short-lived resources)


1. `SERVER_ID=$(make provision-server | tail -n 1)`
2. `CLIENT_ID=$(make provision-client | tail -n 1)`
3. `read SERVER_IP CLIENT_IP <<< $(make wait SERVER_ID=$SERVER_ID CLIENT_ID=$CLIENT_ID | tail -n 1)`

## Build and run implementations

_WARNING_: Running the perf tests might take a while.

1. `cd runner`
2. `npm ci`
3. `npm run start -- --client-public-ip $(terraform output -raw -state ../terraform/terraform.tfstate client_public_ip) --server-public-ip $(terraform output -raw -state ../terraform/terraform.tfstate server_public_ip)`
3. `npm run start -- --client-public-ip $CLIENT_IP --server-public-ip $SERVER_IP`

## Deprovision infrastructure

### Nodes

1. `make deprovision-client CLIENT_ID=$CLIENT_ID`
2. `make deprovision-server SERVER_ID=$SERVER_ID`

### Bootstrap

1. `cd terraform`
3. `terraform destroy`
2. `terraform destroy`

## Adding a new implementation

2 changes: 0 additions & 2 deletions perf/terraform/.gitignore
@@ -32,5 +32,3 @@ override.tf.json
# Ignore CLI configuration files
.terraformrc
terraform.rc

*.pub
72 changes: 55 additions & 17 deletions perf/terraform/.terraform.lock.hcl

2 changes: 2 additions & 0 deletions perf/terraform/common/files/.gitignore
@@ -0,0 +1,2 @@
# generated ZIP for AWS Lambda
scale_down.zip
1 change: 1 addition & 0 deletions perf/terraform/common/files/scale_down.json
@@ -0,0 +1 @@
{}
42 changes: 42 additions & 0 deletions perf/terraform/common/files/scale_down.py
@@ -0,0 +1,42 @@
import boto3
import os
import json
import datetime

regions = json.loads(os.environ['REGIONS'])  # Assuming this is a JSON array
tags = json.loads(os.environ['TAGS'])  # Assuming this is a JSON object
max_age_minutes = int(os.environ['MAX_AGE_MINUTES'])  # Assuming this is an integer

def lambda_handler(event, context):
    # iterate over all regions
    for region in regions:
        ec2 = boto3.client('ec2', region_name=region)

        now = datetime.datetime.now(datetime.timezone.utc)

        filters = [{'Name': 'instance-state-name', 'Values': ['running']}]
        filters = filters + [{
            'Name': 'tag:' + k,
            'Values': [v]
        } for k, v in tags.items()]

        response = ec2.describe_instances(Filters=filters)

        instances = []

        for reservation in response['Reservations']:
            for instance in reservation['Instances']:
                launch_time = instance['LaunchTime']
                instance_id = instance['InstanceId']

                print(
                    f'Instance ID: {instance_id} has been running since {launch_time}.')

                if launch_time < now - datetime.timedelta(minutes=max_age_minutes):
                    print(
                        f'Instance ID: {instance_id} has been running for more than {max_age_minutes} minutes.')
                    instances.append(instance_id)

        if instances:
            ec2.terminate_instances(InstanceIds=instances)
            print(f'Terminating instances: {instances}')
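
The age check in `lambda_handler` can be exercised without AWS access by pulling it into a pure function. The sketch below is one way to do that and is not part of this PR. The function name `expired_instance_ids` and the sample instance IDs are hypothetical; the input mirrors only the `Reservations` / `Instances` / `InstanceId` / `LaunchTime` fields that the handler reads.

```python
import datetime
from typing import Any, Dict, List


def expired_instance_ids(
    reservations: List[Dict[str, Any]],
    now: datetime.datetime,
    max_age_minutes: int,
) -> List[str]:
    """Return IDs of instances launched more than max_age_minutes before now."""
    cutoff = now - datetime.timedelta(minutes=max_age_minutes)
    return [
        instance["InstanceId"]
        for reservation in reservations
        for instance in reservation["Instances"]
        # Same strict comparison as the handler: launched before the cutoff.
        if instance["LaunchTime"] < cutoff
    ]


# Made-up data: one instance past the limit, one at the cutoff, one recent.
now = datetime.datetime(2023, 1, 1, 12, 0, tzinfo=datetime.timezone.utc)
reservations = [
    {"Instances": [
        {"InstanceId": "i-old", "LaunchTime": now - datetime.timedelta(minutes=60)},
        {"InstanceId": "i-edge", "LaunchTime": now - datetime.timedelta(minutes=30)},
    ]},
    {"Instances": [
        {"InstanceId": "i-new", "LaunchTime": now - datetime.timedelta(minutes=5)},
    ]},
]
print(expired_instance_ids(reservations, now, 30))
```

Both timestamps must be timezone-aware, as they are in the handler, because Python refuses to compare aware and naive datetimes.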
19 changes: 19 additions & 0 deletions perf/terraform/common/files/scale_down.yml
@@ -0,0 +1,19 @@
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: An AWS Lambda application.

Resources:
  ScaleDown:
    Type: AWS::Serverless::Function
    Properties:
      Handler: scale_down.lambda_handler
      Runtime: python3.9
      CodeUri: .
      Environment:
        Variables:
          REGIONS: '["us-west-2", "us-east-1"]'
          TAGS: '{"Project":"perf", "Name":"node"}'
          MAX_AGE_MINUTES: '30'
      Policies:
        - AmazonEC2FullAccess
      Timeout: 30
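
The environment variables in this template are plain strings that `scale_down.py` decodes with `json.loads` and `int`. As a hedged illustration, the Python sketch below decodes the same three variables with explicit type checks and builds the EC2 filter list the handler uses. The helper names `load_config` and `build_filters` are hypothetical and do not appear in this PR.

```python
import json
from typing import Dict, List, Mapping, Tuple


def load_config(env: Mapping[str, str]) -> Tuple[List[str], Dict[str, str], int]:
    """Decode REGIONS, TAGS and MAX_AGE_MINUTES, rejecting unexpected shapes."""
    regions = json.loads(env["REGIONS"])
    if not isinstance(regions, list) or not all(isinstance(r, str) for r in regions):
        raise ValueError("REGIONS must be a JSON array of strings")
    tags = json.loads(env["TAGS"])
    if not isinstance(tags, dict) or not all(isinstance(v, str) for v in tags.values()):
        raise ValueError("TAGS must be a JSON object with string values")
    max_age_minutes = int(env["MAX_AGE_MINUTES"])
    if max_age_minutes <= 0:
        raise ValueError("MAX_AGE_MINUTES must be positive")
    return regions, tags, max_age_minutes


def build_filters(tags: Dict[str, str]) -> List[dict]:
    """Build the filter list: running instances that carry every given tag."""
    filters = [{"Name": "instance-state-name", "Values": ["running"]}]
    filters += [{"Name": "tag:" + k, "Values": [v]} for k, v in tags.items()]
    return filters


# Values copied from the template above.
env = {
    "REGIONS": '["us-west-2", "us-east-1"]',
    "TAGS": '{"Project":"perf", "Name":"node"}',
    "MAX_AGE_MINUTES": "30",
}
regions, tags, max_age_minutes = load_config(env)
print(build_filters(tags))
```

Validating the shapes up front turns a misconfigured variable into a clear error at load time instead of a failure in the middle of a region loop.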