-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow setting granularity for summary metrics #290
allow setting granularity for summary metrics #290
Conversation
Can you explain more why you need to tweak this? In general, I would rather encourage using histograms than add more features for summaries. With histograms, these decisions can be made at query time, with more options for aggregation too. |
Sure. This is for a metric for the response size of a request, where we sometimes get a very high response size for a few requests, and then a smaller one for others. Sometimes there can be enough requests with a large enough size that they dominate the graph for the whole 10 minutes that the max age is set to by default, which is not what we want. The summary metrics instead of histograms is necessary because the values don't fit into histogram buckets nicely because there is quite a big spread from the smallest to the biggest values. |
You can add more/larger histogram buckets … but I see your point. I'm okay with adding this, but I would like to make it clearer in the mapper config that these settings are only relevant to summaries. What do you think about moving them into a What happens if they are not set? Can you please add tests that include these settings? Just that they get parsed from the mapper correctly, even in mixed configurations. |
I'm fine with moving them into a But if moving them into a
If they are not set, they will be assigned the default settings here, which matches what happens currently.
Sure, I'll try to add some tests. |
ah good point about the existing quantiles setting … now I'm thinking about consistency and moving those into sub structures too … the options I see are
I am willing to break compatibility, but I don't want to do that repeatedly, so I would either keep compatibility (1.i,1.ii) or go all the way and restructure these settings all at once (2.ii). In the latter case, we could also use the presence of summary/histogram settings to decide which one the user prefers. And if we're breaking that we may as well make histograms the default … As a user, how would you want the mapping to look? |
Thanks for writing this down. I can't think of any other options either.
I think I would prefer either option
Not entirely sure, but I don't think it would be too much work. I'm happy to give that a try if that's what we decide we want to go for. I think compatibility here is somewhat important, possibly more than whether histograms or summary is the default type, as I can't see a good transition that doesn't involve updating the config and the binary at the same type for this change (but I think it can be done for histogram vs summary switchovers).
Some hours I think. I would like to do the right thing here, that we and the users are going to be happy about. |
is updating the config and binary at the same time a problem? I was working under the assumption that that's the easy way 😅 |
I think it might be tricky in some cases, especially if there are multiple teams involved that might have their own metrics. I think for our usecase it might actually be possible, though not trivial. Not sure how others manage the config/binary though. |
I implemented option The code could still use some cleanup, and I didn't exactly make this a nice series of commits. Also the docs still need to be updated for the new hierarchies. I'm happy to do all that, but would first like to get some feedback on this approach. If you're happy with this option I'll clean it up a bit, including the docs. |
Thank you! I will try to go through this in detail either today or tomorrow.
…On Wed, Jan 29, 2020 at 5:41 PM Thomas Gummerer ***@***.***> wrote:
I implemented option 2.ii now, making it backwards compatible as well,
with a deprecation warning. I started off implementing option 1.i, but
then I tried implementing warnings for using the wrong options for the
wrong timer_type, and it seemed to get hard to explain to users. So I
think separate hierarchies might actually be best.
The code could still use some cleanup, and I didn't exactly make this a
nice series of commits. Also the docs still need to be updated for the new
hierarchies. I'm happy to do all that, but would first like to get some
feedback on this approach. If you're happy with this option I'll clean it
up a bit, including the docs.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#290?email_source=notifications&email_token=AABAEBU3D355FVZRJDALNW3RAG5TDA5CNFSM4KMDFVT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKIC54Q#issuecomment-579874546>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABAEBRDJHHFZD2E4MHTHETRAG5TDANCNFSM4KMDFVTQ>
.
|
@matthiasr friendly ping? I updated the docs as well now, and I'm happy to turn this into a nicer series of commits if you prefer that. |
I'm sorry, work was a bit much this week and I didn't have enough focus to review this. I didn't forget about it! |
No worries, totally understand that. I just wanted to make sure it's not forgotten, but it's no problem if you need to hold of on taking a look at this for a bit longer 💖 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Only one thing that I would like to change: I would rather have a clean break in the mapper
library interface than the ambiguity of mapping.Quantiles
and mapping.SummaryOptions.Quantiles
both being maybe-present (same for buckets).
To achieve that clean break (while keeping the configuration compatible), I would
- rename the
Buckets
andQuantiles
fields in the Mapping type (LegacyX or something) to make compilation fail for library users that rely on them - copy from these legacy fields into SummaryOptions/HistogramOptions rather than out
- always return mappings with SummaryOptions/HistogramOptions non-nil (does it even have to be a pointer?) so that it's safe to use
mapping.SummaryOptions.Quantiles
That way, the interface is unambiguous (don't use the legacy fields, use the Options) and there is a clear migration path for library users (s/\.Quantiles/.SummaryOptions&/g;s/\.Buckets/.Histogramoptions&/g
). The configuration would still be compatible because the YAML field name doesn't need to change.
Currently the Buckets and Quantiles settings are top level settings per metric in the yaml. In a subsequent commit we're going to allow adding more such options for summaries, at which point having them all at the top level gets confusing. Split the options out into separate hierarchies to allow adding more options, without adding confusion. We preserve backwards compatibility by still accepting the old option, but warning when it is present. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
The Go client for prometheus aggregates summary metrics over 10 minutes by default, in 5 buckets. This is not always the behaviour we want. Allow tweaking those settings in `statsd_exporter`, so we can aggregate summary metrics over more or less time, with more or fewer buckets, and set the cap for the bucket as well. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
15096ed
to
dae5d78
Compare
Thanks for the review! I updated things according to your suggestions, so I think this is ready for another round of reviews. I also rebased the changes to make two nicer commits, and added my sign-off.
I do think it has to be a pointer, because that's how we detect it not being set to emit warnings, etc. For now it is only safe to use them when the timer type corresponds to the options type, but I can always make them non-nil if you prefer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you!
Thanks for the reviews ✨ |
Signed-off-by: Matthias Rampke <mr@soundcloud.com>
The Go client for prometheus aggregates summary metrics over 10
minutes by default, in 5 buckets. This is not always the behaviour we
want.
Allow tweaking those settings in
statsd_exporter
, so we canaggregate summary metrics over more or less time, with more or fewer
buckets, and set the cap for the bucket as well.
/cc @matthiasr