-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: deprecate usage of cursor.execute
statements in favor of the in class implementation of execute
.
#60748
base: main
Are you sure you want to change the base?
Conversation
04fee59
to
e9cbf63
Compare
Hello @WillAyd.
This should be no problem because we can always wrap that execution around a Would you mind taking a look and LMK what you think while in draft ? |
e9cbf63
to
ff41294
Compare
ff41294
to
9e0f436
Compare
…the in class implementation of `execute`.
9e0f436
to
804fb3e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One comment but generally this looks good. @mroeschke can you take a look as well?
pandas/io/sql.py
Outdated
for stmt in self.table: | ||
conn.execute(stmt) | ||
self.pd_sql.execute(stmt).close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the need for .close()
here? I am hoping we can avoid anything that implicitly changes the transaction state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't say there's a need for that, but it is usually a good practice to close any cursor that is opened. We can remove to maintain the status quo - no problem on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh OK I misread this as closing the transaction. Does whatever pd_sql.execute
return not follow the context manager protocol? We should be preferring with
statements whenever a context like that needs to be managed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, no worries. I'm super thankful for all the help and attentive comments.
Yeah, it does implement the context manager protocol. This was a personal choice and has no technical reason:
with self.pd_sql.run_transaction():
for stmt in self.table:
self.pd_sql.execute(stmt).close()
Reads better to me than
with self.pd_sql.run_transaction():
for stmt in self.table:
with self.pd_sql.execute(stmt):
pass
What you think ? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm always in favor of using the context manager over calling close manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just looking at this some more I'm still somewhat concerned about the object lifetime management. For example, the SqliteDatabase implementation of execute
looks like this:
def execute(self, sql: str | Select | TextClause, params=None):
if not isinstance(sql, str):
raise TypeError("Query must be a string unless using sqlalchemy.")
args = [] if params is None else [params]
cur = self.con.cursor()
try:
cur.execute(sql, *args)
return cur
except Exception as exc:
...
So self.con.cursor()
is tied to the lifetime of the SqliteDatabase
instance - are you sure we should be closing this here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Let me try to illustrate that with an example from the codebase.
We can say that the ADBCDatabase implementation is the same as of SQLiteDatabase
Lines 2110 to 2128 in c0c778b
def execute(self, sql: str | Select | TextClause, params=None): | |
if not isinstance(sql, str): | |
raise TypeError("Query must be a string unless using sqlalchemy.") | |
args = [] if params is None else [params] | |
cur = self.con.cursor() | |
try: | |
cur.execute(sql, *args) | |
return cur | |
except Exception as exc: | |
try: | |
self.con.rollback() | |
except Exception as inner_exc: # pragma: no cover | |
ex = DatabaseError( | |
f"Execution failed on sql: {sql}\n{exc}\nunable to rollback" | |
) | |
raise ex from inner_exc | |
ex = DatabaseError(f"Execution failed on sql '{sql}': {exc}") | |
raise ex from exc |
Alright, so now take this line as example:
sql_statement = f"DROP TABLE {table_name}"
self.execute(sql_statement)
If we leave it like that ☝️ then an exception is raised because because the cursor was left open:
I'm no specialist here but believe that despite all differences between drivers the DBAPI is respected and if that is the case then the SQLite cursor should also be closed. The SQLite cursor will anyways get closed when __del__
is called so we might be safer if we maintain the status quo in case we're uncertain about side-effects ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mroeschke , @WillAyd , just a heads up regarding self.pd_sql.execute(stmt).close()
vs with self.pd_sql.execute(stmt)
. The later is not feasible because the SQLite Cursor object does not implement the context manager protocol.
Options are calling .close()
or using closing
from contextlib.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, so now take this line as example:
sql_statement = f"DROP TABLE {table_name}" self.execute(sql_statement)If we leave it like that ☝️ then an exception is raised because because the cursor was left open:
Thanks for this example. So previously we were explicitly opening and closing a cursor but with the switch to using self.execute
we are re-using the cursor attached to the class instance and not controlling its lifecycle.
So how is that lifecycle being managed? Seems like there is just a gap / inconsistency that is making this all more complicated than it should be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @WillAyd ,
I don't think we have a lifecycle management problem but nonetheless the implementation has changed as complicated is not the goal here.
The implementation is back to its original version with a small change in the name of things. cur
(instead of conn
) is used to represent a cursor since it is what run_transaction
returns for SQLiteDatabase
's implementation.
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.