smartconnpool: Better handling for idle expiration #17757
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:

@@ Coverage Diff @@
##             main   #17757      +/-   ##
==========================================
+ Coverage   67.95%   68.02%   +0.06%
==========================================
  Files        1586     1587       +1
  Lines      255227   255579     +352
==========================================
+ Hits       173445   173848     +403
+ Misses      81782    81731      -51

☔ View full report in Codecov by Sentry.
Hello! 👋 This Pull Request is now handled by arewefastyet. The current HEAD and future commits will be benchmarked. You can find the performance comparison on the arewefastyet website.
Also marked it for backporting, since it was a performance regression in certain edge cases.
This is an awesome change! Improves idle connection handling significantly. 🚀
Description
We've had a report of a performance regression when migrating from an old Vitess version to a newer version that contains the new smart connection pool implementation. The system in question has inconsistent (spiky) load patterns and quite stringent idle timeouts for connections.
Our assumption here is that the regression is caused by a combination of two factors. The new connection pool provides connections in a LIFO pattern, where the connection most recently returned to the pool is the one handed back to the client on subsequent gets. This is as opposed to the old pool, which was FIFO, meaning that it would constantly cycle through all the available connections. In systems that are not at full capacity, the LIFO connection pool will continuously re-use a small subset of all the open connections. This improves performance in `mysqld` because of data locality, but it causes the connections idling in the pool to eventually hit their max idle timeout and be closed by the background worker.

In this situation, where the delta between requests per second during normal operation and peak operation is sufficiently large, we end up with a very expensive stall in the background worker that closes idle connections, because a large group of connections can expire at once. Even when the delta is not particularly big, the stringent idle timeout causes the background worker to trigger very frequently, and because of the way it is implemented, each of these triggers causes a micro-stall.
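To make the difference in hand-out order concrete, here is a minimal, hypothetical sketch of a FIFO pool versus a LIFO pool. The types and method names (`Conn`, `fifoPool`, `lifoPool`) are illustrative only and are not the actual smartconnpool data structures.

```go
package example

// Conn stands in for a pooled MySQL connection.
type Conn struct{}

// fifoPool models the old behavior: connections are handed out in the
// order they were returned, so the pool keeps touching every open
// connection and none of them idles long enough to expire.
type fifoPool struct{ conns []*Conn }

func (p *fifoPool) get() *Conn {
	c := p.conns[0] // oldest returned connection (empty-pool handling omitted)
	p.conns = p.conns[1:]
	return c
}

func (p *fifoPool) put(c *Conn) { p.conns = append(p.conns, c) }

// lifoPool models the new behavior: the most recently returned
// connection is handed out first, so under partial load a small hot
// subset is reused continuously (good for mysqld locality) while the
// rest of the stack sits idle until it hits the idle timeout.
type lifoPool struct{ conns []*Conn }

func (p *lifoPool) get() *Conn {
	c := p.conns[len(p.conns)-1] // most recently returned connection
	p.conns = p.conns[:len(p.conns)-1]
	return c
}

func (p *lifoPool) put(c *Conn) { p.conns = append(p.conns, c) }
```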
We believe that improving the way the background worker for idle connections works can get rid of all these stalls.
The issue with the current implementation is that it pops all the connections from each stack on every idle check. Once all the connections have been popped, the worker iterates over them and returns the ones that are not idle to the stack in reverse order, so the resulting stack has the same order it had before being popped. In retrospect, this is not a good implementation: periodically popping the whole connection stack will obviously lead to micro-stalls in busy systems with stringent timeouts, as incoming requests are very likely to collide with the idle checks running in the background, encounter an empty stack, and have to fall back to the slower path through the wait queue.
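For reference, this is roughly the shape of the drain-and-refill idle check described above, sketched with a mutex-based stack standing in for the real lock-free connection stack. Names and details are illustrative, not the actual smartconnpool code.

```go
package example

import (
	"sync"
	"time"
)

// Illustrative stand-ins for the pool's types.
type Conn struct {
	lastUsed time.Time
}

func (c *Conn) Close() {}

type connStack struct {
	mu    sync.Mutex
	conns []*Conn
}

func (s *connStack) pop() *Conn {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.conns) == 0 {
		return nil
	}
	c := s.conns[len(s.conns)-1]
	s.conns = s.conns[:len(s.conns)-1]
	return c
}

func (s *connStack) push(c *Conn) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.conns = append(s.conns, c)
}

// closeIdle drains the whole stack, closes the connections that have
// exceeded the idle timeout, and pushes the survivors back in reverse
// order so the stack keeps its original ordering. While the stack is
// drained, a concurrent Get finds it empty and falls back to the
// slower wait-queue path, which is the micro-stall described above.
func closeIdle(s *connStack, now time.Time, idleTimeout time.Duration) {
	var keep []*Conn
	for c := s.pop(); c != nil; c = s.pop() {
		if now.Sub(c.lastUsed) > idleTimeout {
			c.Close()
			continue
		}
		keep = append(keep, c)
	}
	for i := len(keep) - 1; i >= 0; i-- {
		s.push(keep[i])
	}
}
```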
The proposed implementation in this PR fixes the issue as follows: it introduces a new `timestamp` type that stores monotonic time points in just 8 bytes. This now replaces the `time.Time` timestamps (24 bytes) that we were using to track idle expiration dates in the pooled connections. Besides the significant memory savings, the key advantage of this timestamp is that it can be updated atomically. See the implementation in `timestamp.go`.
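As a rough illustration of the idea (not the actual contents of `timestamp.go`), an 8-byte monotonic timestamp can be built on a single atomic word anchored to a process-wide start time:

```go
package example

import (
	"sync/atomic"
	"time"
)

// monotonicStart anchors all timestamps so they fit in an int64
// (nanoseconds elapsed since the process started).
var monotonicStart = time.Now()

// monotonicNow returns the nanoseconds elapsed since monotonicStart.
// time.Since uses Go's monotonic clock reading, so the value is not
// affected by wall-clock adjustments.
func monotonicNow() int64 {
	return int64(time.Since(monotonicStart))
}

// timestamp is an 8-byte monotonic time point. Because it is a single
// atomic word, a pooled connection's last-used time can be refreshed
// and read without taking any locks.
type timestamp struct {
	nanos atomic.Int64
}

// update records the current monotonic time.
func (t *timestamp) update() {
	t.nanos.Store(monotonicNow())
}

// expired reports whether more than timeout has elapsed since the
// last update.
func (t *timestamp) expired(timeout time.Duration) bool {
	return monotonicNow()-t.nanos.Load() > int64(timeout)
}
```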
This removes all the micro and macro stalls caused by the background worker, as the idle expiration algorithm is now wait-free: the background worker does a best-effort expiry of connections without actually contending on the connection stack, and the foreground clients cooperate with the expiration by ignoring any expired connections that are popped from a stack. The expensive close operation always happens in the background. A fixed amount of overhead is added to certain `Get` cases, because popping a connection from a stack can now spuriously fail (when the popped connection is expired) and must be retried. Retrying the pop is an extremely cheap operation, so I believe this won't have an effect on the p99 of requests, and it will certainly be a no-op in connection pools without idle timeouts, where the popped connections can never be expired.

An improvement on top of this implementation which I've discarded is upgrading the background worker to also do a best-effort eviction of stale connections from the stacks. I'm opting not to pursue this because the semantics of removing nodes from the middle of the atomic stack are non-obvious, and they would cause contention even if we got them right. I think having the clients cooperate by ignoring expired connections is by far the most elegant and safest approach.
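For illustration, the cooperative foreground fast path described above can be sketched as follows. `popIdle`, `discard`, and the surrounding types are hypothetical names, not the actual smartconnpool API.

```go
package example

import (
	"sync/atomic"
	"time"
)

// Illustrative types: lastUsed holds monotonic nanoseconds, as in the
// timestamp sketch above.
type Conn struct {
	lastUsed atomic.Int64
}

func (c *Conn) expiredAt(now int64, timeout time.Duration) bool {
	return now-c.lastUsed.Load() > int64(timeout)
}

// stack is whatever lock-free stack the pool pops connections from.
type stack interface {
	pop() *Conn
}

// popIdle is the shape of the new foreground path: a pop can spuriously
// fail when the popped connection has already passed its idle deadline,
// in which case the client simply hands it off and retries the (very
// cheap) pop. The expensive close itself is left to the background
// worker, which expires connections on a best-effort basis without
// contending on the stack.
func popIdle(s stack, now int64, idleTimeout time.Duration, discard func(*Conn)) *Conn {
	for {
		c := s.pop()
		if c == nil {
			return nil // stack is empty: caller falls back to the wait queue
		}
		if idleTimeout > 0 && c.expiredAt(now, idleTimeout) {
			discard(c) // expired: ignore it and let the background close it
			continue
		}
		return c
	}
}
```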
cc @harshit-gangal @deepthi
Related Issue(s)
Checklist
Deployment Notes