-
Notifications
You must be signed in to change notification settings - Fork 103
Conversation
main.go
Outdated
mux := http.NewServeMux() | ||
mux.Handle("/metrics", exporter) | ||
srv := &http.Server{ | ||
Addr: ":9999", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@enisoc I want to get your opinion on how we should configure this. My initial guess would be to just use an environment variable, but I wanted to discuss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a flag to configure this.
"os" | ||
"os/signal" | ||
"sync" | ||
"syscall" | ||
"time" | ||
|
||
"github.com/0xRLG/ocworkqueue" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a benefit to going via OpenCensus instead of reusing the Prometheus provider from core k8s? Whenever possible, I prefer to reuse code and patterns from the core since they're well-exercised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't even know that existed I've been using this package for a long time now and I've only just made it open source. There isn't a huge difference between the two plain old prometheus and opencensus. Both result in Prometheus metrics being exposed on the debug endpoint. I think that OpenCensus ends up being cheaper to collect if you don't report it anywhere but we don't do that in this PR.
To be honest I think the fact that this project is in GoogleCloudPlatform and OpenCensus is being driven by Google people the two together made sense to me.
I really don't mind updating the PR to use the prometheus exporter from client-go
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright that makes sense. Let's just keep OpenCensus. This PR should be good to go then whenever you have a chance to make the other changes below.
main.go
Outdated
@@ -42,6 +48,7 @@ import ( | |||
var ( | |||
discoveryInterval = flag.Duration("discovery-interval", 30*time.Second, "How often to refresh discovery cache to pick up newly-installed resources") | |||
informerRelist = flag.Duration("cache-flush-interval", 30*time.Minute, "How often to flush local caches and relist objects from the API server") | |||
debugAddr = flag.String("debug-addr", "localhost:9999", "The address to bind the debug http endpoints") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Binding to localhost
by default means it won't be accessible from outside the Pod, doesn't it? Is that intentional? It looks like earlier you had just ":9999"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think you're right I'll change it.
main.go
Outdated
Handler: mux, | ||
} | ||
go func() { | ||
glog.Fatalf("cannot serve debug endpoint: %v", srv.ListenAndServe()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should be logged but not fatal? The debug endpoint isn't strictly necessary for Metacontroller to perform its main duties. If I'm relying on Metacontroller to manage my app, I wouldn't want it to give up just because, for example, the metrics endpoint couldn't bind due to a port conflict.
In any case, we should probably ignore ErrServerClosed
here because we call Shutdown()
below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right I remember fighting with this log line quite a bit I should have paid more attention. I'll change this to an Errorf
a87d998
to
527886a
Compare
This CL adds an http endpoint with one handler "/metrics" that serves Prometheus metrics exported by MetaController. The metrics are collected using OpenCensus and exported using the Prometheus HTTP exposition format. This CL also configures the Kubernetes workqueue package to collect its statistics and expose them via the Prometheus http endpoint. We collect and expose all of the workqueue metrics (Depth, Adds, Latency, WorkDuration, Retries) tagged with the name of the queue.
I addressed the comments and rebased this off master. I think the I ran: $ git checkout origin/master Gopkg.lock
$ rm -rf vendor/
$ dep ensure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
This CL adds an http endpoint with one handler "/metrics" that serves
Prometheus metrics exported by MetaController. The metrics are collected
using OpenCensus and exported using the Prometheus HTTP exposition
format.
This CL also configures the Kubernetes workqueue package to collect its
statistics and expose them via the Prometheus http endpoint. We collect
and expose all of the workqueue metrics (Depth, Adds, Latency,
WorkDuration, Retries) tagged with the name of the queue.