-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for kube_scheduler SLI metrics #15731
Add support for kube_scheduler SLI metrics #15731
Conversation
b5f71d1
to
f43d0be
Compare
Codecov Report
Flags with carried forward coverage won't be shown. Click here to find out more. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changelog approved by docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far, just a couple of comments on my side
except Exception as e: | ||
self.log.debug("Unable to collect query slis endpoint: %s", e) | ||
return False | ||
self._slis_available = r.status_code != 404 and r.status_code != 403 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function is supposed to be called, then I think we want to ignore a 404, but at the very least log an error for a 403, as that means that their agent or environment is not properly configured and they should be made aware of that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few small comments!
|
||
def assert_metric(name, **kwargs): | ||
# Wrapper to keep assertions < 120 chars | ||
aggregator.assert_metric(NAMESPACE + name, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aggregator.assert_metric(NAMESPACE + name, **kwargs) | |
aggregator.assert_metric(f"{NAMESPACE}.{name}", **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we run tests in Python 2 I couldn't use f-strings, but I used format
instead
assert_metric('.slis.kubernetes_healthcheck', value=1, tags=['name:ping', 'type:healthz']) | ||
assert_metric( | ||
'.slis.kubernetes_healthchecks_total', value=2450, tags=['name:ping', 'status:success', 'type:healthz'] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_metric('.slis.kubernetes_healthcheck', value=1, tags=['name:ping', 'type:healthz']) | |
assert_metric( | |
'.slis.kubernetes_healthchecks_total', value=2450, tags=['name:ping', 'status:success', 'type:healthz'] | |
) | |
assert_metric('slis.kubernetes_healthcheck', value=1, tags=['name:ping', 'type:healthz']) | |
assert_metric( | |
'slis.kubernetes_healthchecks_total', value=2450, tags=['name:ping', 'status:success', 'type:healthz'] | |
) |
CHECK_NAME = 'kube_scheduler' | ||
NAMESPACE = 'kube_scheduler' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: personally I think you can just remove NAMESPACE
and use CHECK_NAME
.
|
||
@pytest.fixture() | ||
def mock_metrics(): | ||
f_name = os.path.join(os.path.dirname(__file__), 'fixtures', 'metrics_slis_1.27.3.txt') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of os.path.dirname(__file__)
, you can call get_here()
. Example:
integrations-core/kong/tests/common.py
Line 11 in 3567074
HERE = get_here() |
The |
1 similar comment
The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for agent integ!
4b479a2
to
563d9ef
Compare
What does this PR do?
This PR introduces two new scheduler metrics:
kube_scheduler.slis.kubernetes_healthcheck
andkube_scheduler.slis.kubernetes_healthchecks_total
.Motivation
Kubernetes v1.26 exposed a new
/metrics/slis
endpoint (reference here). This PR adds support for capturing the new metrics exposed for the scheduler:kubernetes_healthcheck
andkubernetes_healthcheck
.Additional Notes
Review checklist (to be filled by reviewers)
qa/skip-qa
label if the PR doesn't need to be tested during QA.