[Access] Fix race condition in connection cache #5074

Closed
peterargue opened this issue Nov 28, 2023 · 0 comments · Fixed by #5334

peterargue commented Nov 28, 2023

Problem Definition

There are 2 race conditions in the connection cache:

  1. When a connection fails because the backend is unavailable, it is removed from the cache and torn down. However, the connection is looked up in the cache by the backend's address, so a different goroutine may already have removed and recreated the connection for that server. In that case the removal acts on the replacement connection rather than the one that failed.
  2. When a new connection is added to the cache, it is added as an empty struct with its lock held. This allows an atomic GetOrAdd operation with minimal overhead. After the client is added, fields such as closeRequested are set. In the Close() method, closeRequested was checked outside of the lock, so it could be read before it was set, resulting in a segfault.
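A minimal sketch of the pattern described in item 2, not the project's actual code. The type names (`cachedClient`, `clientCache`), the `init` helper, and the use of a `*atomic.Bool` for `closeRequested` are assumptions made for illustration. The entry is inserted empty with its lock held, and `Close()` takes that lock before reading `closeRequested`, so it cannot observe the unset field.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// cachedClient is a hypothetical cache entry. closeRequested is nil until
// init runs, which is what makes an unlocked read dangerous.
type cachedClient struct {
	mu             sync.Mutex
	address        string
	closeRequested *atomic.Bool
}

// clientCache is a hypothetical address-keyed cache.
type clientCache struct {
	mu      sync.Mutex
	entries map[string]*cachedClient
}

func newClientCache() *clientCache {
	return &clientCache{entries: make(map[string]*cachedClient)}
}

// getOrAdd returns the entry for address. If none exists, it inserts an
// empty entry with its lock already held; the caller must then call init,
// which fills in the fields and releases the lock.
func (c *clientCache) getOrAdd(address string) (client *cachedClient, added bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if existing, ok := c.entries[address]; ok {
		return existing, false
	}
	client = &cachedClient{}
	client.mu.Lock()
	c.entries[address] = client
	return client, true
}

// init sets the entry's fields and releases the lock taken in getOrAdd.
func (cl *cachedClient) init(address string) {
	cl.address = address
	cl.closeRequested = &atomic.Bool{}
	cl.mu.Unlock()
}

// Close acquires the entry lock before touching closeRequested, so it
// waits for init to finish. It reports whether this call requested the close.
func (cl *cachedClient) Close() bool {
	cl.mu.Lock()
	defer cl.mu.Unlock()
	return cl.closeRequested.CompareAndSwap(false, true)
}

func main() {
	cache := newClientCache()
	client, added := cache.getOrAdd("localhost:9000")

	done := make(chan bool)
	go func() {
		// Finds the existing entry; Close blocks until init has run.
		other, _ := cache.getOrAdd("localhost:9000")
		done <- other.Close()
	}()

	if added {
		client.init("localhost:9000")
	}
	fmt.Println("close requested:", <-done)
}
```

If `Close()` read `closeRequested` before taking `mu`, the goroutine above could dereference a nil pointer while `init` was still pending.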

On its own, race 1 would only cause temporary thrashing, as the connection is torn down multiple times until it stabilizes. Combined with race 2, it causes a crash each time that thrashing occurs.

Race 2 is fixed on the v0.32 branch by #5073.

Remaining work: fix race 1 and backport the fix for race 2 to master.
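One possible way to address race 1, sketched under assumptions: the issue does not say how the fix is implemented, and `conn`, `connCache`, and `removeIfSame` are hypothetical names. The idea is to remove an entry only if the cache still holds the exact instance that failed, so a goroutine holding a stale reference cannot tear down a replacement.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical cached connection.
type conn struct {
	address string
	closed  bool
}

// connCache is a hypothetical address-keyed connection cache.
type connCache struct {
	mu      sync.Mutex
	entries map[string]*conn
}

func newConnCache() *connCache {
	return &connCache{entries: make(map[string]*conn)}
}

// getOrAdd returns the cached connection for address, creating one if needed.
func (c *connCache) getOrAdd(address string) *conn {
	c.mu.Lock()
	defer c.mu.Unlock()
	if existing, ok := c.entries[address]; ok {
		return existing
	}
	created := &conn{address: address}
	c.entries[address] = created
	return created
}

// removeIfSame removes and closes failed only if the cache still maps its
// address to that same instance. It reports whether anything was removed.
func (c *connCache) removeIfSame(failed *conn) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	current, ok := c.entries[failed.address]
	if !ok || current != failed {
		// Another goroutine already replaced the entry; leave it alone.
		return false
	}
	delete(c.entries, failed.address)
	failed.closed = true
	return true
}

func main() {
	cache := newConnCache()
	stale := cache.getOrAdd("localhost:9000")

	// A first caller observes the failure, removes the entry, and reconnects.
	cache.removeIfSame(stale)
	replacement := cache.getOrAdd("localhost:9000")

	// A second caller still holds the stale reference; its removal is a no-op.
	removed := cache.removeIfSame(stale)
	fmt.Println("second removal applied:", removed)
	fmt.Println("replacement still cached:", cache.getOrAdd("localhost:9000") == replacement)
}
```

Comparing by pointer identity rather than by address alone is what keeps the second removal from touching the replacement connection.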
