Added initial CloudWatch Dashboard for RFS #1147

Merged
chelma merged 6 commits into opensearch-project:main from the MIGRATIONS-2132 branch on Nov 19, 2024

Conversation

@chelma chelma commented Nov 19, 2024

Description

  • Added a CloudWatch Dashboard for RFS that gets created when the RFS feature is enabled
  • The Target Cluster section relies on metrics from an Amazon OpenSearch Service domain to populate. The user must also supply the name of their domain in the Dashboard.
  • One quirk is how we represent the number of "active" RFS Workers. We don't have a good way to represent the number of *desired* workers in a steady-state sense, so we instead report how many RFS Workers were alive in a given period by counting the separate instances of a CPU metric reported by our ECS containers. During steady-state operation, we'd expect that graph to closely match the user's desired worker count. However, if workers are dying frequently (for example, when there's no work), new workers will constantly start up, report in, and die. Each such worker adds an entry to the sample count, so the number skews higher than the desired count.
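
To make the skew concrete, here is a minimal sketch of the counting behavior described above. It is a simplified model, not the dashboard's actual query: the function names, the worker records, and the 20-second lifetime are all hypothetical, and it assumes every worker alive at any point in a period contributes exactly one entry to that period's count.

```javascript
// Hypothetical model: a worker "reports in" for a period if its lifetime
// overlaps that period at all, and each such worker counts once.
function workersReportingIn(workers, periodStart, periodEnd) {
  return workers.filter(
    (w) => w.startedAt < periodEnd && w.stoppedAt > periodStart
  ).length;
}

// Build a fleet where each of `desired` slots is refilled every
// `lifetimeSeconds`, so `desired` workers are alive at any instant.
function churningWorkers(desired, lifetimeSeconds, totalSeconds) {
  const workers = [];
  for (let slot = 0; slot < desired; slot++) {
    for (let t = 0; t < totalSeconds; t += lifetimeSeconds) {
      workers.push({
        id: `slot${slot}-t${t}`,
        startedAt: t,
        stoppedAt: t + lifetimeSeconds,
      });
    }
  }
  return workers;
}

// Steady state: 3 desired workers that live through the whole period.
const steadyWorkers = churningWorkers(3, 600, 600);
const steadyCount = workersReportingIn(steadyWorkers, 0, 60);

// Churn: still 3 desired workers, but each one dies after 20 seconds.
const churnCount = workersReportingIn(churningWorkers(3, 20, 60), 0, 60);

console.log({ steadyCount, churnCount });
```

In this model the steady-state count equals the desired count of 3, while the churning fleet reports 9 for the same 60-second period because each replacement worker adds its own entry.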

Issues Resolved

Testing

  • Created the dashboard in my account and used it to visualize a few simple migrations:

Screenshot 2024-11-19 at 9 13 26 AM

Screenshot 2024-11-19 at 9 13 42 AM

Check List

  • New functionality includes testing
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Chris Helma <chelma+github@amazon.com>
Signed-off-by: Chris Helma <chelma+github@amazon.com>

codecov bot commented Nov 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.78%. Comparing base (892990a) to head (c9df41b).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #1147   +/-   ##
=========================================
  Coverage     80.78%   80.78%           
  Complexity     2947     2947           
=========================================
  Files           399      399           
  Lines         15089    15089           
  Branches       1017     1017           
=========================================
  Hits          12190    12190           
  Misses         2288     2288           
  Partials        611      611           
Flag Coverage Δ
unittests 80.78% <ø> (ø)

break;
}
}
console.log(`returning ${JSON.stringify(variables)}`);
Collaborator

probably don't need this anymore

"inputType": "input",
"id": "ACCOUNT_ID",
"label": "ACCOUNT_ID",
"defaultValue": "ACCOUNT_ID",
Collaborator

minor, doesn't follow the pattern of the other placeholder values

}
}
},
{
Collaborator

nit: can we run a JSON formatter here? it looks like it's a bit off

for (const variable of variables) {
if (variable.id === variableName) {
variable.defaultValue = defaultValue;
console.log(`changing ${variable.defaultValue} to ${defaultValue}`);
Collaborator

delete this line too?
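
For reference, a minimal sketch of what the loop might look like with both debug logs removed. The function name `setVariableDefault`, its parameters, and the sample values are hypothetical; only the loop body mirrors the excerpt above. Note that in the excerpt the log runs after the assignment, so both interpolated values would already be the same.

```javascript
// Hypothetical helper: set the default value of the dashboard variable whose
// id matches `variableName`. Mutates and returns the same array.
function setVariableDefault(variables, variableName, defaultValue) {
  for (const variable of variables) {
    if (variable.id === variableName) {
      variable.defaultValue = defaultValue;
      break; // ids are assumed unique, so stop at the first match
    }
  }
  return variables;
}

// Usage with made-up placeholder data.
const sampleVariables = [
  { id: "ACCOUNT_ID", defaultValue: "ACCOUNT_ID" },
  { id: "TC_DOMAIN_NAME", defaultValue: "placeholder-domain" },
];
const updatedVariables = setVariableDefault(
  sampleVariables,
  "ACCOUNT_ID",
  "123456789012"
);
console.log(JSON.stringify(updatedVariables));
```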

Signed-off-by: Chris Helma <chelma+github@amazon.com>
"property": "DomainName",
"inputType": "input",
"id": "TC_DOMAIN_NAME",
"label": "Target Cluster Domain Name",
Member

Thoughts on adding a search here so this becomes a dropdown instead of a free-text input?

            "search": "{AWS/ES,ClientId,DomainName} MetricName=CPUUtilization",
            "populateFrom": "DomainName"
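
A rough sketch of how the suggested fields could be merged into the existing variable definition. The helper `toSearchDropdown` is hypothetical, and switching `inputType` to `"select"` is an assumption about how CloudWatch dashboard variables render a dropdown; the `search` and `populateFrom` values are copied from the suggestion above.

```javascript
// Variable fields as they appear in the diff excerpt above.
const domainNameVariable = {
  property: "DomainName",
  inputType: "input",
  id: "TC_DOMAIN_NAME",
  label: "Target Cluster Domain Name",
};

// Hypothetical helper: return a copy of the variable that is populated from a
// metric search instead of free-text input. The original is left unchanged.
function toSearchDropdown(variable, search, populateFrom) {
  return {
    ...variable,
    inputType: "select", // assumption: "select" is the dropdown input type
    search,
    populateFrom,
  };
}

const domainNameDropdown = toSearchDropdown(
  domainNameVariable,
  "{AWS/ES,ClientId,DomainName} MetricName=CPUUtilization",
  "DomainName"
);
console.log(JSON.stringify(domainNameDropdown, null, 2));
```

With this shape the user would pick from domains that already publish metrics rather than typing the name, at the cost of the list being empty until the domain has reported at least once.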

Signed-off-by: Chris Helma <chelma+github@amazon.com>
Signed-off-by: Chris Helma <chelma+github@amazon.com>
"view": "timeSeries",
"stacked": false,
"metrics": [
[ { "expression": "METRICS()/1000", "id": "e1", "region": "REGION" } ],
Member

Do we need to apply the same / PERIOD(m1)*60 here?
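
As a worked example of the arithmetic behind that expression: in CloudWatch metric math, PERIOD(m1) returns the metric's period in seconds, so dividing by it and multiplying by 60 rescales a per-period value to a per-60-seconds value. The sketch below assumes the value was accumulated over the whole period; the function name and sample numbers are hypothetical.

```javascript
// Rescale a value accumulated over `periodSeconds` to a per-60-seconds figure,
// mirroring the arithmetic of `value / PERIOD(m1) * 60`.
function perSixtySeconds(value, periodSeconds) {
  if (!(periodSeconds > 0)) {
    throw new RangeError("periodSeconds must be a positive number");
  }
  return (value / periodSeconds) * 60;
}

// 1500 units over a 300-second period is 300 units per 60 seconds.
const fiveMinutePeriod = perSixtySeconds(1500, 300);
// With a 60-second period the value is unchanged.
const oneMinutePeriod = perSixtySeconds(1500, 60);

console.log({ fiveMinutePeriod, oneMinutePeriod });
```

Without this rescaling, widening the widget's period would inflate a summed value even though the underlying rate is the same, which is why the label's "per 60 seconds" wording depends on it.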

"stacked": false,
"metrics": [
[ { "expression": "METRICS()/1000", "id": "e1", "region": "REGION" } ],
[ "AWS/ES", "IndexingRate", "DomainName", "TC_DOMAIN_NAME", "ClientId", "ACCOUNT_ID", { "region": "region", "label": "Document Ingested per 60 seconds - MIN: ${MIN}, MAX: ${MAX}, AVG: ${AVG}", "id": "m1", "visible": false } ]
Member

Let's add "(including replicas)" to the label here

"period": 60,
"region": "REGION",
"stacked": false,
"title": "RFS Workers Reporting in During Period",
Member

nit: should we remove "During Period" from the title here?

Signed-off-by: Chris Helma <chelma+github@amazon.com>
@chelma chelma merged commit fd11653 into opensearch-project:main Nov 19, 2024
17 checks passed
@chelma chelma deleted the MIGRATIONS-2132 branch November 19, 2024 19:54