-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse statsd lines with multiple metric bits #354
Conversation
@MerlinDMC Is this documented in the StatsD spec anywhere? I didn't realize that this was supported by the standard StatsD implementation. You'll need to also add documentation for this to the Telegraf StatsD README |
|
||
// Test that counters work | ||
valid_lines := []string{ | ||
"valid.multiple:0|ms|@0.1:1|ms", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add more tests here? I'd like to see:
without sample rate:
"valid.multiple.nosample:0|ms|:1|ms:200|ms"
Duplicate measurements (also, document what would happen here):
"valid.multiple.duplicate:0|ms:1|ms:0|ms:1|ms"
Duplicate measurements and different types (documentation needed here too):
"valid.multiple.duplicate:1|c:1|c:2|c:1|c"
"valid.multiple.duplicate:1|h:1|h:2|h:1|h"
"valid.multiple.duplicate:1|s:1|s:2|s:1|s"
"valid.multiple.duplicate:1|g:1|g:2|g:1|g" <-- what happens here? gauge will be set to the final value?
Multiple measurement types w/ one bucket (is this supported?)
"valid.multiple.duplicate:1|c:1|ms:2|s:1|g"
c013397
to
e4ada04
Compare
@sparrc It doesn't seem to be documented in the spec. Also if "the spec" is https://github.com/b/statsd_spec - that seems to miss some other stuff as well - sample rate for timers for example which are in etsy/statsd since Feb 2013. The overall behaviour seems to be that metric name and data packets are separated by a colon and each data packet needs at least a value and a type separated by a pipe.
As soon as multiple data packets are on a single line the metric name is reused throughout them.
same as
To test/prove that behaviour you can run https://github.com/MerlinDMC/statsd-multiple-metrics-prove which will start two etsy/statsd instances and feed them with the same data and show their aggregated output on the console.
I've change the added test to feed two statsd instances and ensure the results are the same as well - so feeding single stats and multiple will compute to the same values in the end. I would want to squash this feature branch into a single commit once it's good for a merge if possible. |
@MerlinDMC I'm sorry for the late eyes on your latest changes, looks good to me, is it ready to merge? |
b4a91bf
to
e9fa7b4
Compare
@sparrc I did rebase and squash it on top of your current master branch. If the tests on circleci are good feel free to merge this. |
Thank you @MerlinDMC! I'll get this merged |
This resembles the parsing which is done in etsy/statsd to add multiple datapoints for a measurement in a single line.
https://github.com/etsy/statsd/blob/243a1f2a166c2d573f7582dc0c42f50f257e4150/stats.js#L225-L277
It seems as if some client applications use those possibilities to pack more data into each UDP packet and simply omit the repeating measurement name where possible.
An example for this is the Kamon.io statsd reporter which produces lines like these: