Describe the bug
We upgraded kube-prometheus-stack from 62.6.0 to 68.4.5 and observed that the API server uptime metric (apiserver_request:availability30d) became completely implausible.
What's your helm version?
Deployed via ArgoCD, which internally uses Helm 3.15.4
What's your kubectl version?
Irrelevant
Which chart?
kube-prometheus-stack
What's the chart version?
68.4.5
What happened?
We observed apiserver_request:availability30d go from roughly 99.999% to well beyond 100%; for example, apiserver_request:availability30d{verb="all"} is currently at 1.6425321904704488 on one cluster and 2.225346243637766 on another.
Looking at the metric in the Prometheus UI, we can spot the exact time we initiated the upgrade.
If we roll back the upgrade, the metrics return to normal.
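For anyone trying to confirm the same behaviour, a minimal check in the Prometheus UI (assuming the default recording rule names shipped with the chart) is to graph the availability series over the upgrade window:

  apiserver_request:availability30d{verb="all"}
  apiserver_request:availability30d{verb="read"}
  apiserver_request:availability30d{verb="write"}

Any sample above 1 (i.e. above 100%) means the recording rules are producing impossible values rather than reflecting real availability.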
What did you expect to happen?
No response
How to reproduce it?
No response
Enter the changed values of values.yaml?
No response
Enter the command that you execute that is failing/misfunctioning.
No command necessary
Anything else we need to know?
ClusterVersion is v1.29.2
Getting the same issue on version 68.1.1. The API server dashboard seems off, with values going way above 100% after upgrading to 68.1.1.
I am also getting KubeAPIErrorBudgetBurn alerts after the upgrade, just like in this issue; it seems like the burn-rate queries such as apiserver_request:burnrate1d may have problems too.
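A quick way to check whether the alerts are driven by the same broken data (a sketch assuming the default kubernetes-mixin rule names; exact alert thresholds depend on the chart version) is to graph the burn-rate recording rules directly:

  apiserver_request:burnrate5m{verb="read"}
  apiserver_request:burnrate1h{verb="read"}
  apiserver_request:burnrate1d{verb="read"}

If these jump to nonsensical values at the exact time of the upgrade, the KubeAPIErrorBudgetBurn alerts are a symptom of the same recording-rule problem rather than a real error-budget burn.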
Same here with the latest version of kube-prometheus-stack (69.2.0).
zeritti changed the title from [prometheus-kube-stack] API server metrics broken after upgrade to [kube-prometheus-stack] API server metrics broken after upgrade on Feb 9, 2025
Getting the same issue on the latest chart version. The recording rule apiserver_request:availability30d{verb="read"} is not working, seemingly because sum by (cluster) (cluster_verb_scope_le:apiserver_request_sli_duration_seconds_bucket:increase30d{le=~"30(\\.0)?",scope="cluster",verb=~"LIST|GET"}) returns no data. It looks like the le="30" bucket is missing.
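To verify the missing-bucket theory (a sketch assuming the standard apiserver SLI histogram exposed by recent Kubernetes versions), listing the le labels that actually exist on the source histogram and on the 30d increase rule makes the gap visible:

  count by (le) (apiserver_request_sli_duration_seconds_bucket{verb=~"LIST|GET", scope="cluster"})
  count by (le) (cluster_verb_scope_le:apiserver_request_sli_duration_seconds_bucket:increase30d{verb=~"LIST|GET", scope="cluster"})

If neither query returns a series with le="30", the availability and burn-rate rules that rely on that bucket will miscompute, which would match the impossible values reported above.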