fix(parsers.prometheus): Histogram infinity bucket not being generated when using protobuf protocol #11486
The code looks good to me. Thanks for the fix @mmolnar!
If you could create an issue that describes the problem in more detail and link it here, that would be super awesome.
Oh, and maybe add a test case to make sure we don't fail in the future?! |
There is not much more to say about this because I did not find the root cause; this is just a workaround for the varying behaviour of the underlying library prometheus/common/expfmt.
I am not sure how I would do that, because the whole issue seems to be environment dependent. It might have something to do with the fact that the value of the "le" label is actually the +Infinity value of the Go type float64; sometimes it gets interpreted as NaN and dropped, and sometimes it gets converted to the string "+Inf". |
I have looked deeper into the issue and found the core of the problem. The issue is NOT environment dependent; it depends on the protocol being used. Some Prometheus clients are capable of responding with the protobuf protocol, which Telegraf requests by default. My test setups were different: on Ubuntu I simulated the response with nginx, which could only respond in the text exposition format, but on Debian I had a service responding in protobuf. It seems the Prometheus protobuf exposition format treats the Inf bucket as optional, while it is not optional in the text exposition format. My pull request will fix this. A test case could be written for this; here I have 2 questions:
|
@mmolnar I would just test if the Regarding the protobuf thingy, I'll bring up a discussion internally... Anyway, it would be nice to copy this (maybe even as-is) into an issue so people that search for it will easily find it and make their way to this fix... |
Hey @mmolnar, sorry for the late reply! We discussed how to handle this issue in an internal meeting, and we think deprecating the protobuf part should be done in 2.0, as this is a potentially breaking change. So the plan is to get this fix in now and deprecate the root cause later. Does that sound good? |
Looks good to me. Thanks a bunch @mmolnar for hunting this down and submitting this fix!
Yes, thank you. |
In some cases the infinity bucket is not being parsed or generated from text Prometheus input by the library github.com/prometheus/common/expfmt.
I have not found the specifics, but it must be related to the version of libc, because the same telegraf binary works on Ubuntu with libc6:amd64 2.27-3ubuntu1.6 but does not work on Debian with 2.28-10.
Because the infinity bucket has the same value as the histogram sample count, we can simply enforce creation of this bucket if it is missing. The underlying library used to parse metrics does basically the same thing when generating the text format from metrics.
resolves #11490