-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tests for Deprecate SparseArray for python 3.6 and 3.7 and fixes to other deprecation tests #30799
Conversation
Some notes on the deprecation logic, and how it appears to users. First, it appears that if you are running the python interpreter, you only get one deprecation warning per class, but if you use ipython, you get it each time. First, here are things with python. Note that a repeat usage does not show the deprecation warning. So I restart python each time to test different invocations.
Now, with ipython, you get the deprecation warning each time.
Now, the results with python 3.6. Note that I could not get
And now ipython 3.6:
So the behaviors (with this pull request) are what we want, and I just don't know how to handle the Also, given that My suggestion is that you merge in this pull request, and we could create an issue to track two things:
|
A little bit more on the warnings if someone tries to import a deprecated class:
python 3.6 doesn't warn when you import, but does warn when you use them:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
pandas/tests/api/test_api.py
Outdated
if compat.PY37: | ||
with warnings.catch_warnings(): | ||
warnings.simplefilter("ignore", FutureWarning) | ||
assert datetime(2015, 1, 2, 0, 0) == pd.datetime(2015, 1, 2, 0, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only test this for python 3.7? The equality should also hold for python 3.6 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my analysis above. I couldn't figure out how to make pd.datetime
have a deprecation message when doing pd.datetime(2015,1,2,0,0)
and also support pd.datetime.now()
with a deprecation message when using python 3.6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this test should also pass when there is no warning? (the test doesn't assert there is a warning, only that the functionality is still working)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with python 3.6 and using pd.datetime(2015,1,2,0,0)
is that the implementation in this PR throws an exception. Current master just succeeds without any warning at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a correct result is more important right now than having the deprecation warning in all cases.
pandas/__init__.py
Outdated
|
||
class SparseArray: | ||
pass | ||
class __SparseArray(pandas.core.arrays.sparse.SparseArray): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure we can do it this way for SparseArray, as that breaks isinstance checks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To confirm, you mean something like
isinstance(pd.array([1, 2, 3], dtype="Sparse"), pd.SparseArray)
will be false? Since it's an instance of the parent pd.arrays.SparseArray
, not the child pd.SparseArray
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
(in general, deprecating moving classes is annoying)
For datetime
and np
the same is true of course, but I think it is much less likely that someone will do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is again another issue with python 3.6 handling the deprecation of a module attribute. That's why they implemented PEP 562 (https://www.python.org/dev/peps/pep-0562/) for python 3.7. Making a reference using python 3.6 to pd.SparseArray
(without a constructor) produce a warning would be quite involved (see the Rationale in the PEP discussion), and I don't think it is worth introducing that level of complexity to pandas to handle deprecation messages of a class for users of an older version of python.
It's why I suggested above to merge in this pull request, and we can leave this as an open issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's an issue with python 3.6, but that we still support 3.6 now is a reality we have to live with.
So my suggestion would be to simply don't try to raise a warning in python 3.6, and just import the existing class in that case (by the time we are going to enforce this deprecation, we maybe won't support py3.6 anymore, so then that is less of an issue). Or simply don't deprecate pd.SparseArray at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or simply don't deprecate pd.SparseArray at all.
That's probably my preference until we support just 3.7+.
Though @Dr-Irv this PR is also fixing other issues with the other deprecations on master? Taking a closer look now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TomAugspurger Yes, there were other issues with deprecations on master that I addressed, especially in the testing code.
So IIUC, the only deprecation that's working correctly on master for py 3.6 is >>> pd.datetime
<class 'datetime.datetime'>
>>> pd.datetime.now()
datetime.datetime(2020, 1, 8, 9, 29, 32, 925293)
>>> pd.np
<pandas.__numpy object at 0x109f016d8>
>>> pd.np.ones(4)
__main__:1: FutureWarning: The pandas.np module is deprecated and will be removed from pandas in a future version. Import numpy directly instead
array([1., 1., 1., 1.]) essentially, the deprecation strategy we have for 3.6 doesn't work well for classes like SparseArray and |
@TomAugspurger @jorisvandenbossche I figured out how to make Here is what it looks like in ipython:
|
https://stackoverflow.com/questions/13135712/class-method-instancecheck-does-not-work basically implement |
@jorisvandenbossche @TomAugspurger I think I have it right now for datetime and SparseArray. This is in commit f1b89f4. Here's what things look like in python 3.6 using ipython: Python 3.6.7 (default, Dec 6 2019, 07:03:06) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.11.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import pandas as pd
In [2]: pd.datetime(2018, 10, 11, 2, 3)
C:\Anaconda3\envs\pandas36\Scripts\ipython:1: FutureWarning: The pandas.datetime class is deprecated and will be removed from pandas in a future version. Import from datetime instead.
Out[2]: datetime.datetime(2018, 10, 11, 2, 3)
In [3]: pd.datetime.now()
C:\Anaconda3\envs\pandas36\Scripts\ipython:1: FutureWarning: The pandas.datetime class is deprecated and will be removed from pandas in a future version. Import from datetime instead.
Out[3]: datetime.datetime(2020, 1, 8, 16, 16, 4, 650619)
In [4]: isinstance(pd.datetime(2018, 10, 11, 2, 3), pd.datetime)
C:\Anaconda3\envs\pandas36\Scripts\ipython:1: FutureWarning: The pandas.datetime class is deprecated and will be removed from pandas in a future version. Import from datetime instead.
Out[4]: True
In [5]: pd.SparseArray([1,2,3])
C:\Anaconda3\envs\pandas36\Scripts\ipython:1: FutureWarning: The pandas.SparseArray class is deprecated and will be removed from pandas in a future version. Use pandas.arrays.SparseArray instead.
Out[5]:
[1, 2, 3]
Fill: 0
IntIndex
Indices: array([0, 1, 2])
In [6]: isinstance(pd.array([1, 2, 3], dtype="Sparse"), pd.SparseArray)
Out[6]: True And here it is for python 3.7: Python 3.7.6 | packaged by conda-forge | (default, Dec 26 2019, 23:30:11) [MSC v.1916 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.10.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import pandas as pd
In [2]: pd.datetime(2018, 10, 11, 2, 3)
C:\Anaconda3\envs\pandas-dev\Scripts\ipython:1: FutureWarning: The pandas.datetime class is deprecated and will be removed from pandas in a future version. Import from datetime module instead.
Out[2]: datetime.datetime(2018, 10, 11, 2, 3)
In [3]: pd.datetime.now()
C:\Anaconda3\envs\pandas-dev\Scripts\ipython:1: FutureWarning: The pandas.datetime class is deprecated and will be removed from pandas in a future version. Import from datetime module instead.
Out[3]: datetime.datetime(2020, 1, 8, 16, 19, 39, 764998)
In [4]: isinstance(pd.datetime(2018, 10, 11, 2, 3), pd.datetime)
C:\Anaconda3\envs\pandas-dev\Scripts\ipython:1: FutureWarning: The pandas.datetime class is deprecated and will be removed from pandas in a future version. Import from datetime module instead.
Out[4]: True
In [5]: pd.SparseArray([1,2,3])
C:\Anaconda3\envs\pandas-dev\Scripts\ipython:1: FutureWarning: The pandas.SparseArray class is deprecated and will be removed from pandas in a future version. Use pandas.arrays.SparseArray instead.
Out[5]:
[1, 2, 3]
Fill: 0
IntIndex
Indices: array([0, 1, 2])
In [6]: isinstance(pd.array([1, 2, 3], dtype="Sparse"), pd.SparseArray)
C:\Anaconda3\envs\pandas-dev\Scripts\ipython:1: FutureWarning: The pandas.SparseArray class is deprecated and will be removed from pandas in a future version. Use pandas.arrays.SparseArray instead.
Out[6]: True The one difference is now in python 3.6, if you refer to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good! some stylistic things, ping on green.
@jreback all green |
lgtm. ping on green. |
@Dr-Irv looks like you do need those assignments to make mypy happy. |
thanks @Dr-Irv ok ping on green. |
@jreback all green |
thanks @Dr-Irv |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Turns out that the existing tests were not testing deprecation correctly for both python 3.7 and python 3.6, so had to change some of the code in
test_api.py
to make that work right.For
pd.SparseArray
in python 3.6, we can only issue a warning when the constructorpd.SparseArray
is used. But this is consistent withpd.datetime
andpd.np
with python 3.6, which will issue warnings when things likepd.datetime.now()
are called. However, for python 3.6, I could not figure out a way to issue a warning onpd.datetime(2015, 10, 11, 0, 0)
, so we may just have to live with that.