-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-37249-v24: backport transaction-level-pooling compatibility to v24 #771
Conversation
Apparently "#region ... #endregion" is used to trigger some special linter behavior, and I don't want to fight the battle of learning to disable it.
This was probably a bug that we've never happened to trigger.
This addresses a long-standing TODO comment in Database.query (dating back to the introduction Session and our first real attempt to reduce connection contention). In some cases this might seem to be transforming lazy iteration over database results into aggressive fetching, via calls to fetchall(), but for at least PostgreSQL it seems we were already doing _some_ aggressive fetching before without realizing it, see e.g. https://docs.sqlalchemy.org/en/14/core/connections.html#using-server-side-cursors-a-k-a-stream-results So what we're really doing in the new fetchalls() here is invoking SQLAlchemy's client-side row processing more aggressively. I don't think that's a big change, and in exchange we can now guarantee that we never return iterators to user code that are responsible for closing a connection when the user is "done" with them - a problematic definition for any garbage-collected entity, but a particular problem for iterators that might not ever be exhausted. This also include some typing adjustments for SQLAlchemy objects. Those aren't actually checked yet, since typing support in SQLAlchemy won't be out until it's 2.0 release.
Temporary tables are now always created inside transactions, and via the temporary_table context-manager-returning method, which has been moved to Database from Session. Session has been removed. This should bring us into compatibility with pgbouncer transaction-level pooling - we won't care if a connection we hold actually gets multiplexed onto a different database connection, as long as it doesn't happen during transactions.
Each of the gazillion small queries we run a butler startup was previously grabbing a new connection and then returning it. That was probably wasn't too bad, since it should have just been getting them from the pool (for PostgreSQL) or opening a local file (for SQLite), but this should still be better in both cases. It should also remove a lot of rollbacks, since SQLAlchemy emits those whenever connections are returned to the pool. SQLAlchemy's docs say those shouldn't matter in terms of performance, but they're still noisy.
Now that we're careful to consistently use the same connection, we shouldn't ever need more.
Making SQLite temp tables inside transactions leads to very long exclusive locks during QG generation, causing timeouts in (at least) ci_cpp.
We now have public final session() and transaction() methods that return nothing and protected _session() and _transaction() methods that return connections, reducing the need to assert on self._session_connection being not None and avoiding returning things to users that they don't need.
2968815
to
0ab0fb2
Compare
Codecov ReportBase: 84.52% // Head: 84.56% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## v24.0.x #771 +/- ##
===========================================
+ Coverage 84.52% 84.56% +0.03%
===========================================
Files 243 243
Lines 31854 31903 +49
Branches 5428 5464 +36
===========================================
+ Hits 26926 26978 +52
+ Misses 3749 3748 -1
+ Partials 1179 1177 -2
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
42d67ed
to
c517410
Compare
Checklist