Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: query: Expose query API request counts #1420

Closed
wants to merge 5 commits into from

Conversation

kakkoyun
Copy link
Member

@kakkoyun kakkoyun commented Aug 15, 2019

Thanos Query API now exposes thanos_query_api_instant_query_total, thanos_query_api_instant_query_failures_total, thanos_query_api_range_query_total and thanos_query_api_range_query_failures_total metrics to track error rates of query API.

It also attempts to change the behavior of thanos_query_api_instant_query_duration_seconds and thanos_query_api_range_query_duration_seconds, to track the duration of failed queries as well. (Still not sure about this one, since it'd change existing behavior.

Changes

  • Expose new query API metrics: thanos_query_api_instant_query_total, thanos_query_api_instant_query_failures_total, thanos_query_api_range_query_total and thanos_query_api_range_query_failures_total.

Verification

None yet.

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
@kakkoyun kakkoyun force-pushed the new_querier_metrics branch from 87e215d to 0a2dba6 Compare August 15, 2019 13:37
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
@kakkoyun kakkoyun force-pushed the new_querier_metrics branch from 00ac560 to 96daafb Compare August 15, 2019 13:42
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
How this relates to standard HTTP metrics defined for each API call here:

ins := extpromhttp.NewInstrumentationMiddleware(reg)

Wonder if it is not duplication bit ): If it is definitely there is some work on docs / example alerts (:

@kakkoyun
Copy link
Member Author

@bwplotka You're definitely right. Apparently, I have overlooked it. In this sense, all the metrics in this component can be replaced by middleware metrics.

http_requests_total{job="thanos-querier", handler="query"}
http_requests_total{job="thanos-querier", handler="query"}
http_request_duration_seconds_bucket{job="thanos-querier", handler="query"}
http_request_duration_seconds_bucket{job="thanos-querier", handler="query_range"}

If you don't have backwards compatibility concerns, could I remove thanos_query_api_instant_query_duration_seconds and thanos_query_api_range_query_duration_second, since they are also duplicates?

Maybe it's not the right place when I was checking code base for this PR, I have realized we have similar metrics for receive component

requestDuration: prometheus.NewHistogramVec(
.
Should we use
func NewInstrumentationMiddleware(reg *prometheus.Registry) InstrumentationMiddleware {
for receive as well?

I could also help with documentation of metrics and dashboards if it's needed. I recently worked on https://github.com/metalmatze/kube-thanos/tree/master/jsonnet/thanos-mixin, I can help while my caches are hot.

cc @brancz

@brancz
Copy link
Member

brancz commented Aug 16, 2019

All of the above sounds good to me! (no compatibility concerns)

@bwplotka
Copy link
Member

Fully agree here. We are still 0.x releases so it's actually the best moment to define the proper consistent approach here! Makes sense @kakkoyun

@kakkoyun
Copy link
Member Author

Cool, then I'll close this one and focus on consistent instrumentation in another PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants