feat: k3s pod dashboard #60
This dashboard gives an overview of the k3s cluster as a whole and a collapsible, repeatable section for each pod. The CPU metric was identified as instantaneous CPU time in ns for a given second. This makes the metric a bit tricky to work with, as it does not play nicely with Grafana/Prometheus rate intervals, but each value can be converted to a usage figure by dividing it by 1bn (the number of ns in a second).
0594833 to 7937d85
},
"editorMode": "code",
"exemplar": false,
"expr": "k3s_pod_cpu{instance=~\"$instance\"} / 1000000000",
@Supporterino For your consideration. The pod CPU metric does seem to be in ns, but it seems to be an instantaneous measure for the current second at the time of submission. Since it's a gauge and not a counter, we can't collect the changes over the Grafana $__rate_interval, so instead we're just dividing the value by the number of ns in a second to get the instantaneous cpu % usage.
I don't have a huge variety of workloads to stress this value locally, but I did upload a bunch of pictures to Immich to trigger the ML container and was able to capture examples of the CPU hitting about 80%.
This seems to be the clearest and most reasonable measure from what I've seen.
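As a rough illustration of the conversion described above, here is a minimal Python sketch. It assumes, as the comment suggests, that each `k3s_pod_cpu` sample is the CPU time in ns consumed during the sampled second. The function name and the sample values are hypothetical and are not taken from the exporter or the dashboard.

```python
# Assumption: each gauge sample is CPU time in ns used during one second.
NS_PER_SECOND = 1_000_000_000


def cpu_usage_fraction(cpu_ns: float) -> float:
    """Convert ns of CPU time used in one second to a fraction of one core."""
    return cpu_ns / NS_PER_SECOND


# Hypothetical sample values, for illustration only.
samples_ns = [250_000_000, 800_000_000, 1_500_000_000]

for ns in samples_ns:
    # A fraction above 1.0 would mean more than one core was busy.
    print(f"{ns} ns -> {cpu_usage_fraction(ns):.0%} of one core")
```

Under that assumption, a sample of 800,000,000 ns corresponds to the roughly 80% reading mentioned above, which is the same arithmetic the `/ 1000000000` in the panel expression performs.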
This change makes the variable values refresh on time range change so old pods don't show up anymore.
@Reanmachine The dashboard looks good to me. Just one thing: could you rename the CPU graphs to "usage", since that is what you are converting the metric to?
This fix standardizes the names to `cpu usage`, as that's the measurement we're showing. Also noticed the CPU gauge had the old calculation, so aligned it with the others and added the truenas tag.
LGTM. Thanks for your submission.
Following #57, this PR adds dashboards for the k3s metrics produced by TrueNAS' metrics exporter. This dashboard gives an overview of the k3s cluster as a whole and a collapsible, repeatable section for each pod.