-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ Expose client-go metrics #233
✨ Expose client-go metrics #233
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: DirectXMan12 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/kind feature |
250f34b
to
d06e3fb
Compare
@brancz btw, this first is why I'm hesitant to call the current client-go setup easily usable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is quite a mouthful. Nonetheless the way you’re using the metrics registry here it could just as easily be injected and then this could just be a package in client-go.
pkg/metrics/client_go_adapter.go
Outdated
|
||
// registerClientMetrics sets up the client latency metrics from client-go | ||
func registerClientMetrics() { | ||
var () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a leftover
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good to me. I have couple of questions.
|
||
// this section contains adapters, implementations, and other sundry organic, artisinally | ||
// hand-crafted syntax trees required to convince client-go that it actually wants to let | ||
// someone use its metrics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
pkg/metrics/client_go_adapter.go
Outdated
|
||
// NB(directxman12): these are changed to MustRegister from Register. It's not clear why they weren't | ||
// MustRegister in the first place, except maybe to not bring down the controller if the metrics fail | ||
// to register (which shouldn't happen unless there's a duplicate metric). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
||
// registerWorkQueueMetrics sets up workqueue (other reconcile) metrics | ||
func registerWorkqueueMetrics() { | ||
workqueuemetrics.SetProvider(workqueueMetricsProvider{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this means metrics from multiple controllers will get combined ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, if you look at how the provider is implemented, there's a label for the controller name
pkg/metrics/client_go_adapter.go
Outdated
@@ -0,0 +1,266 @@ | |||
/* | |||
Copyright 2016 The Kubernetes Authors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/2016/2018 (someone should build github suggestion bot for this :))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
}) | ||
Registry.MustRegister(retries) | ||
return retries | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These workqueue metrics have some overlap with the existing metrics. Shall we remove the higher level controller_runtime metrics ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's things that look like duplicates, but it's not clear to me if they actually are completely duplicate. Lets evaluate a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to kill QueueLength in favor of depth, although the fact that the client-go metrics don't expose queue names as labels is strange to me. The other two I think we want to keep.
@brancz do you have any insight on the subsystem vs label decision there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on this subject: kubernetes/kubernetes#71165
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack, will on the comments, will fix a bit later today
pkg/metrics/client_go_adapter.go
Outdated
|
||
// registerClientMetrics sets up the client latency metrics from client-go | ||
func registerClientMetrics() { | ||
var () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it is
pkg/metrics/client_go_adapter.go
Outdated
@@ -0,0 +1,266 @@ | |||
/* | |||
Copyright 2016 The Kubernetes Authors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
|
||
// registerWorkQueueMetrics sets up workqueue (other reconcile) metrics | ||
func registerWorkqueueMetrics() { | ||
workqueuemetrics.SetProvider(workqueueMetricsProvider{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, if you look at how the provider is implemented, there's a label for the controller name
}) | ||
Registry.MustRegister(retries) | ||
return retries | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's things that look like duplicates, but it's not clear to me if they actually are completely duplicate. Lets evaluate a bit.
1dfbd22
to
cd73474
Compare
cd73474
to
4c17676
Compare
Oh sure. I think this particular model is just super clunky ;-) Injected registry + refactor in client-go would definitely address this. |
Do you think you could take that on as a follow up, to contribute that to client-go? This is the second time I am reviewing a PR that implements more or less the same thing twice (first time was Prometheus implementing these metrics itself for its Kubernetes service discovery). I imagine many more want to do the same thing. Preferably even first in client-go and then move forward with this PR, but I understand if you first want to make progress with this. |
I can take a follow up to move some of this in client-go. I'd prefer to get this merged first, then I'll fix up client-go a bit, and then port it back over (it makes it easier since we might not immediately pull in the latest client-go release). |
96b2df6
to
a696eb8
Compare
This exposes the client-go metrics (client, workqueue, reflector) metrics in the controller-runtime registry.
This marks tests that have a "// TODO: write this" or similar as pending tests, so that we can keep track of them.
The QueueLength metric is now a duplicate of the workqueue "depth" metric, so it's no longer needed.
a696eb8
to
ed6a18c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments.
Other LGTM.
|
||
longestRunning = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | ||
Subsystem: workQueueSubsystem, | ||
Name: "longest_running_processor_microseconds", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
microseconds?
It seems this is not following the convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, but the current code requires it atm. it's fixed in future releases
return retries.WithLabelValues(name) | ||
} | ||
|
||
func (workqueueMetricsProvider) NewLongestRunningProcessorMicrosecondsMetric(name string) workqueuemetrics.SettableGaugeMetric { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Microseconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the current code requires it atm (see above)
/lgtm |
This exposes the client-go metrics (client, workqueue, reflector)
metrics in the controller-runtime registry.
It also fixes a typo in the tests, and fixes our Gopkg.toml to prune some unnecessary requires (since I had to do a
dep ensure
anyway, and it was taking forever without the changes).