-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: fix meta/has_table #6627
SQL: fix meta/has_table #6627
Conversation
try: | ||
import pymysql | ||
self.driver = 'pymysql' | ||
def setUp(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe you want to have a test setup()
method to install modules and connect (then can define the setup_imports()
and setup_connection()
or something in the sub class of the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am no sure I fully understand what you mean.
Something like this? (just to clean up the code)
class _TestSQLAlchemy(PandasSQLTest):
def set_up(self):
setup_import()
setup_driver()
setup_connect()
def setup_import():
if not SQLALCHEMY_INSTALLED:
raise nose.SkipTest('SQLAlchemy not installed')
def setup_connect():
try:
self.conn = self.connect()
self.pandasSQL = sql.PandasSQLAlchemy(self.conn)
except sqlalchemy.exc.OperationalError:
raise nose.SkipTest("Can't connect to server")
class TestPostgreSQLAlchemy(_TestSQLAlchemy):
def setup_driver()
try:
import psycopg2
self.driver = 'psycopg2'
except ImportError:
raise nose.SkipTest('psycopg2 not installed')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep that's what I meant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, ik keep this in mind for a pr to clean up the tests
@jorisvandenbossche I'm not 100% sure it's a good idea to always clear and reflect when you need Meta, in cases where you have a lot of tables it can take a few seconds. On the other hand you are right that if someone starts mixing in raw SQL (which i guess is very likely) we have no way of know whether Meta needs updating or not :/ I guess we can try it as a property and dive into potential performance issues later |
can you invalidate the metadata |
@mangecoeur I agree it seems not the best idea to always clear and reflect the meta, but I do not see another way to prevent that the meta will not be up to date anymore if one has executed some raw SQL code. @jreback But the problem is that we don't know when the user does raw stuff. They could do it with Maybe another way to prevent the possible performance issues is to limit the cases that |
what happens if you only reflect the meta once I think this is users issue |
It is indeed maybe a user issue, but I think a common issue (as we already did it ourselves in the test suite (https://github.com/jorisvandenbossche/pandas/blob/sql2/pandas/io/tests/test_sql.py#L109), why one of the tests was not really working), so if we can easily prevent it, why not? But I can see an argument to leave this as is, as it is the same with sqlalchemy, and we would just copy sqlalchemy behaviour then. |
… has_table Because meta is not automatically updated, has_table will fail when a table is dropped with an sql query (not with drop_table function)
@mangecoeur @hayd @jreback Anybody a strong opinion on this? |
I think should be a passed in option when creating the engine defaulting to True is ok |
Pushed this to 0.14.1, as the OO api is also not yet finalized (and this only concerns that, not the functional api) |
On refelction I like the idea of invalidating meta if the user does something "manual" like using raw SQL, but leave scope open for adding more useful functions that take advantage of SQLs expression API (or these could be added by people/projects who want to subclass the SQLAlchemy api class). Like I mentioned in another issue, it's important to make it possible for a user to manually manage reflection of metadata/supply their own metadata object for advanced use cases. |
Perhaps raise a UserWarning if they have done something manual? Say that meta could be invalid / user should reflect ? |
@hayd I think that might make it too easy for people to shoot themselves in the foot. Perhaps the best is to always reflect but have a class "reflect" or "get meta" function. An advanced user can just create a subclass and override that function. |
@jorisvandenbossche keep this open? |
Closing for now. In any case not applicable as is, but I should have a closer look about how |
Further work for cleaning up the sql code (#6292).
meta
attribute is not updated automatically, with the consequence that eg when you delete a table from sql directly, thehas_table
function does not work anymore:meta
attribute to one which is always updated when called. @mangecoeur looks OK?raise nose.SkipTest
inside the setup function of the test class. -> moved and merged this in TST: skip sql tests if connection to server fails #6651