Original vs hedged request metrics #46
Exactly how I was planning to implement it 🙂
LGTM!
Thanks, then I will add tests to cover this case and release a new version soon.
Really appreciate it @cristaloleg 🙌
Tests are added, @dannykopping @joe-elliott. Not sure about the thread above; IMO having 2 counters is much simpler and costs nothing, WDYT? I will refactor the tests in the next PR, they look too wordy now.
🎉🎉🎉
I tried to integrate this, but long story short we have many hedged clients but only register the metrics once; this makes it difficult to use the stats because one client might have … I solved this by using the return value of the roundtrip:

	func (rt *limitedHedgingRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
		isHedged := hedgedhttp.IsHedgedRequest(req)
		if isHedged {
			// Hedged requests are rate limited; rejected ones are counted separately.
			if !rt.limiter.Allow() {
				totalRateLimitedHedgeRequests.Inc()
				return nil, ErrTooManyHedgeRequests
			}
			totalHedgeRequests.Inc()
		}
		resp, err := rt.next.RoundTrip(req)
		if err == nil {
			// Record which kind of request produced the successful response.
			if isHedged {
				requestsWon.WithLabelValues("hedged").Inc()
			} else {
				requestsWon.WithLabelValues("original").Inc()
			}
		}
		return resp, err
	}

Edit: here's the PR grafana/loki#10281
Thanks for the update! So, as I understand it, it's not the library's problem but just the case that you have many hedged clients in 1 app. Am I right? BTW, any numbers on how many wins? :D (such info is probably under NDA, but I decided to try :D )
That's correct!
I'll update you once I roll this out to production next week 🙂 I'm sure I can send a % of hedging effectiveness with no problems.
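For reference, one way such a "% of hedging effectiveness" could be derived from the two win counters discussed above is sketched here. This is only an illustration: `hedgeWinRate` and its parameters are hypothetical names, not part of hedgedhttp or the Loki PR, and the counts passed in `main` are made-up inputs rather than real measurements.

```go
package main

import "fmt"

// hedgeWinRate returns the percentage of successful requests that were won by
// a hedged attempt rather than by the original request.
// It returns 0 when no wins have been recorded yet.
// Hypothetical helper: the two arguments stand in for the values of the
// "original" and "hedged" win counters.
func hedgeWinRate(originalWins, hedgedWins uint64) float64 {
	total := originalWins + hedgedWins
	if total == 0 {
		return 0
	}
	return 100 * float64(hedgedWins) / float64(total)
}

func main() {
	// Illustrative inputs only, not production numbers.
	fmt.Printf("%.1f%% of requests were won by a hedge\n", hedgeWinRate(970, 30))
}
```

Keeping the two counters separate and dividing at read time means the ratio can also be computed per time window on the monitoring side, if that is preferred.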
Sorry, no real formulas, someone with a better probability theory background should comment on that. But intuition suggests that it should not be high (or even lower than that). Request hedging is only about tail latency, so basically 1-5% of the requests. I don't think there are exact numbers for everyone. The sleep between calls, the number of hedged calls and the resulting success rate depend heavily on the system and its behaviour. The only thing I can really suggest is to play with the numbers. Probably the target latency should be a const (so, an SLO), and after that the sleeps & number of hedged calls should be tweaked to minimise the number of calls while keeping latency at the desired level. My 2c. CC: @storozhukBM
Fixes #42