Fix database performance of read/write worker locks #16061
Conversation
We were seeing serialization errors when taking out multiple read locks. The transactions were retried, so this wasn't causing any failures. Introduced in #15782.
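For context, here is a minimal standalone illustration (not Synapse code) of the kind of serialization error being described: two REPEATABLE READ transactions updating the same row, where the loser gets a `SerializationFailure` and has to be retried. The DSN and table are made up for the example.

```python
import psycopg2
import psycopg2.errors
from psycopg2.extensions import ISOLATION_LEVEL_REPEATABLE_READ

DSN = "dbname=test"  # assumed connection string

conn1 = psycopg2.connect(DSN)
conn2 = psycopg2.connect(DSN)
for conn in (conn1, conn2):
    conn.set_isolation_level(ISOLATION_LEVEL_REPEATABLE_READ)

# Set up a single row that both transactions will fight over.
with conn1.cursor() as cur:
    cur.execute("CREATE TABLE IF NOT EXISTS demo_locks (name TEXT PRIMARY KEY, ts BIGINT)")
    cur.execute("INSERT INTO demo_locks VALUES ('lock', 0) ON CONFLICT (name) DO NOTHING")
conn1.commit()

cur1 = conn1.cursor()
cur2 = conn2.cursor()

# Both transactions take their snapshots...
cur1.execute("SELECT ts FROM demo_locks WHERE name = 'lock'")
cur2.execute("SELECT ts FROM demo_locks WHERE name = 'lock'")

# ...then transaction 2 updates the row and commits first.
cur2.execute("UPDATE demo_locks SET ts = 2 WHERE name = 'lock'")
conn2.commit()

try:
    # Transaction 1's snapshot predates that commit, so this cannot serialize.
    cur1.execute("UPDATE demo_locks SET ts = 1 WHERE name = 'lock'")
    conn1.commit()
except psycopg2.errors.SerializationFailure:
    conn1.rollback()
    print("could not serialize access; the caller has to retry")
```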
self._clock.looping_call(
    self._reap_stale_read_write_locks, _LOCK_TIMEOUT_MS / 10.0
)
Is every worker going to be doing this reaping? I guess that's fine since the reaps are idempotent.
Yeah, good point but I don't think it matters in practice.
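For illustration, a rough sketch of an idempotent reap of the kind being discussed, using plain psycopg2 rather than Synapse's database layer; the `worker_read_write_locks` table, `last_renewed_ts` column, and timeout value are assumptions.

```python
import time

import psycopg2

LOCK_TIMEOUT_MS = 2 * 60 * 1000  # assumed lock timeout


def reap_stale_read_write_locks(conn: "psycopg2.extensions.connection") -> None:
    """Delete lock rows whose last renewal is older than the timeout.

    Running this twice (or on every worker) is harmless: a stale row can
    only be deleted once, so the reap is idempotent.
    """
    # Assumes no transaction is currently open on this connection; autocommit
    # keeps the row locks taken by the DELETE to the statement's duration.
    conn.autocommit = True
    cutoff = int(time.time() * 1000) - LOCK_TIMEOUT_MS
    with conn.cursor() as cur:
        cur.execute(
            "DELETE FROM worker_read_write_locks WHERE last_renewed_ts < %s",
            (cutoff,),
        )
```

Each worker would then schedule something like this on a timer, which is what the `looping_call` in the diff above does (firing every tenth of the lock timeout).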
@@ -216,6 +219,7 @@ async def try_acquire_read_write_lock(
     lock_name,
     lock_key,
     write,
+    db_autocommit=True,
Worth a comment? Why is it unhelpful to run this function in a transaction?
I think:
- we're still using isolation level REPEATABLE READ
- but we're not in a transaction, so if it fails we won't bother to retry
- which means we'll properly wait for the lock renewal period before trying to lock again, thrashing the DB less?
Broadly: there's not much point doing a transaction for a single query, as it has basically the same semantics, and by using autocommit we reduce the work since we don't have to start/end the transaction.
Oh: and it means the length of time we hold the lock on those rows is reduced to just the query execution time, which helps reduce the likelihood of clashes.
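To make the trade-off concrete, here is a sketch using plain psycopg2 rather than Synapse's database layer; the table, column, and values are made up for illustration.

```python
import psycopg2

conn = psycopg2.connect("dbname=test")  # assumed DSN
now_ms, token = 1_000_000, "some-token"  # placeholder values

# 1. Explicit transaction: BEGIN is issued implicitly, and the row lock taken
#    by the UPDATE is held until COMMIT at the end of the block. If the
#    transaction fails, the caller typically retries the whole thing.
with conn:
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE worker_read_write_locks SET last_renewed_ts = %s WHERE token = %s",
            (now_ms, token),
        )

# 2. Autocommit: no separate BEGIN/COMMIT round trips, and the statement
#    commits as soon as it finishes, so the row lock is held only for the
#    query's execution time -- which is the point made above.
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute(
        "UPDATE worker_read_write_locks SET last_renewed_ts = %s WHERE token = %s",
        (now_ms, token),
    )
```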
""" | ||
|
||
if isinstance(self.database_engine, PostgresEngine): | ||
# For Postgres we can send these queries at the same time. |
Is the point just that we now always send these as two separate queries (deletes in a background process)?
Yup, which means they're much less likely to conflict and cause retries.
SGTM (assuming I've understood correctly).
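For reference, a minimal sketch of the mechanism that code comment refers to: psycopg2 lets Postgres run several semicolon-separated statements in a single `execute()` call (one round trip), whereas SQLite's `cursor.execute()` accepts only one statement at a time. The queries and names below are illustrative assumptions, not the actual Synapse SQL.

```python
import psycopg2


def delete_then_insert_one_round_trip(conn, lock_name, lock_key, token, now_ms, cutoff_ms):
    # Two statements, one round trip: psycopg2 binds the parameters
    # client-side, so placeholders can span both statements.
    with conn.cursor() as cur:
        cur.execute(
            """
            DELETE FROM worker_read_write_locks
                WHERE lock_name = %s AND lock_key = %s AND last_renewed_ts < %s;
            INSERT INTO worker_read_write_locks (lock_name, lock_key, token, last_renewed_ts)
                VALUES (%s, %s, %s, %s)
            """,
            (lock_name, lock_key, cutoff_ms, lock_name, lock_key, token, now_ms),
        )
```

With the deletes moved into the background reaper, the common acquire path is a single statement, so concurrent acquisitions touch fewer rows and are less likely to hit serialization conflicts.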