BUG: GH17464 MultiIndex now raises an error when levels aren't unique, tests changed #17971
Codecov Report
@@ Coverage Diff @@
## master #17971 +/- ##
==========================================
- Coverage 91.23% 91.22% -0.02%
==========================================
Files 163 163
Lines 50113 50115 +2
==========================================
- Hits 45723 45715 -8
- Misses 4390 4400 +10
Codecov Report
@@ Coverage Diff @@
## master #17971 +/- ##
==========================================
+ Coverage 91.44% 91.45% +<.01%
==========================================
Files 157 157
Lines 51378 51441 +63
==========================================
+ Hits 46985 47044 +59
- Misses 4393 4397 +4
doc/source/whatsnew/v0.21.0.txt
Outdated
- Bug in :func:`Series.rename` when called with a callable, incorrectly alters the name of the ``Series``, rather than the name of the ``Index``. (:issue:`17407`) | ||
- Bug in :func:`String.str_get` raises ``IndexError`` instead of inserting NaNs when using a negative index. (:issue:`17704`) | ||
- Bug in :func:`Series.rename` when called with a `callable`, incorrectly alters the name of the `Series`, rather than the name of the `Index`. (:issue:`17407`) | ||
- Bug in :func:`String.str_get` raises `index out of range` error instead of inserting NaNs when using a negative index. (:issue:`17704`) |
Is there a reason you changed these two entries? The previous versions look correct to me (double backticks instead of single, "callable" unticked, etc.).
doc/source/whatsnew/v0.21.0.txt
Outdated
- Bug in :func:`String.str_get` raises ``IndexError`` instead of inserting NaNs when using a negative index. (:issue:`17704`) | ||
- Bug in :func:`Series.rename` when called with a `callable`, incorrectly alters the name of the `Series`, rather than the name of the `Index`. (:issue:`17407`) | ||
- Bug in :func:`String.str_get` raises `index out of range` error instead of inserting NaNs when using a negative index. (:issue:`17704`) | ||
- When created with duplicate labels, ``MultiIndex`` now raises a `ValueError`. (:issue:`17464`) |
ValueError should have two backticks: ``ValueError``
@@ -1573,7 +1573,8 @@ def test_is_(self): | |||
# shouldn't change | |||
assert mi2.is_(mi) | |||
mi4 = mi3.view() |
Can you add an explicit test for the issue, i.e. one asserting that this raises? Use the example from the original issue.
pandas/tests/indexes/test_multi.py
Outdated
@@ -1573,7 +1573,8 @@ def test_is_(self): | |||
# shouldn't change | |||
assert mi2.is_(mi) | |||
mi4 = mi3.view() | |||
mi4.set_levels([[1 for _ in range(10)], lrange(10)], inplace=True) | |||
# GH 17464 - Remove duplicate MultiIndex levels |
You don't need these comments here; they belong on the new test instead (see above).
pandas/tests/indexes/test_multi.py
Outdated
ind.set_levels([['A', 'B', 'A', 'A', 'B'], [2, 1, 3, -2, 5]], | ||
inplace=True) | ||
|
||
# GH 17464 - Remove duplicate MultiIndex levels |
OK to change this, but remove the comment, and add the original as another example of where things should raise (in the test you are adding above).
doc/source/whatsnew/v0.21.0.txt
Outdated
@@ -978,6 +978,7 @@ Indexing | |||
- Bug in :meth:`DataFrame.first_valid_index` and :meth:`DataFrame.last_valid_index` when no valid entry (:issue:`17400`) | |||
- Bug in :func:`Series.rename` when called with a callable, incorrectly alters the name of the ``Series``, rather than the name of the ``Index``. (:issue:`17407`) | |||
- Bug in :func:`String.str_get` raises ``IndexError`` instead of inserting NaNs when using a negative index. (:issue:`17704`) | |||
- When created with duplicate labels, ``MultiIndex`` now raises a ``ValueError``. (:issue:`17464`) |
move to 0.22.0 breaking changes
Sure thing |
Looks like you need to rebase against master (it seems you have pulled in other commits), then force push.
My mistake, latest push should clear that up |
I rebased you. I still need to look at this for review. |
Can you run the asv benchmarks for multi and see whether this causes any perf changes? (Probably not, but just checking.)
asv shows nasty perf changes though I think it may be a problem with my setup. I'll see if I can work it out. Here's the output for good measure.
|
can you rebase |
Rebased |
Looks like some tests from test_groupby were duplicated in test_functional, without the modifications. I updated the duplicated function in test_functional to get rid of the error - let me know if you think it would be better to remove all the duped tests instead. |
doc/source/whatsnew/v0.22.0.txt
Outdated
@@ -81,6 +81,9 @@ Other API Changes | |||
- Inserting missing values into indexes will work for all types of indexes and automatically insert the correct type of missing value (``NaN``, ``NaT``, etc.) regardless of the type passed in (:issue:`18295`) | |||
- Restricted ``DateOffset`` keyword arguments. Previously, ``DateOffset`` subclasses allowed arbitrary keyword arguments which could lead to unexpected behavior. Now, only valid arguments will be accepted. (:issue:`17176`, :issue:`18226`). | |||
- :func:`DataFrame.from_items` provides a more informative error message when passed scalar values (:issue:`17312`) | |||
- :class:`Timestamp` will no longer silently ignore unused or invalid `tz` or `tzinfo` arguments (:issue:`17690`) |
looks like some duplication here
pandas/core/indexes/multi.py
Outdated
@@ -198,6 +198,10 @@ def _verify_integrity(self, labels=None, levels=None): | |||
" level (%d). NOTE: this index is in an" | |||
" inconsistent state" % (i, label.max(), | |||
len(level))) | |||
if not level.is_unique: | |||
raise ValueError("Level values must be unique: {0}" |
can you use kwargs rather than positional here
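For illustration, here is a minimal standard-library sketch of a uniqueness check whose error message uses keyword arguments in `str.format` rather than positional ones. The helper name `verify_levels_unique` and the exact message wording after the colon are assumptions made for this example; this is not the code that was merged into `_verify_integrity`.

```python
def verify_levels_unique(levels):
    """Raise ValueError if any level repeats a value.

    Hypothetical helper for illustration only; it assumes the level
    values are hashable so they can be placed in a set.
    """
    for i, level in enumerate(levels):
        values = list(level)
        if len(set(values)) != len(values):
            # Keyword placeholders keep the message readable if the
            # arguments are later reordered or extended.
            raise ValueError(
                "Level values must be unique: {values} on level {level}".format(
                    values=values, level=i))


# Unique levels pass silently.
verify_levels_unique([["A", "B"], [1, 2, 3]])

# A repeated value is rejected and the message names the offending level.
message = None
try:
    verify_levels_unique([["A", "B", "A"], [1, 2, 3]])
except ValueError as err:
    message = str(err)
```

With named placeholders, each value in the message is tied to a label instead of a position, which is the kind of change the comment above asks for.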
@@ -67,8 +67,10 @@ def test_frame_describe_multikey(self): | |||
'C': 1, 'D': 1}, axis=1) | |||
result = groupedT.describe() | |||
expected = self.tsframe.describe().T | |||
expected.index = pd.MultiIndex([[0, 0, 1, 1], expected.index], | |||
[range(4), range(len(expected.index))]) | |||
# GH 17464 - Remove duplicate MultiIndex levels |
you don't need this comment here
pandas/tests/groupby/test_groupby.py
Outdated
@@ -385,6 +385,83 @@ def test_attr_wrapper(self): | |||
# make sure raises error | |||
pytest.raises(AttributeError, getattr, grouped, 'foo') | |||
|
|||
def test_series_describe_multikey(self): |
These tests were moved; please revert this file.
pandas/tests/indexes/test_multi.py
Outdated
# Make sure that a MultiIndex with duplicate levels throws a ValueError | ||
with pytest.raises(ValueError): | ||
ind = pd.MultiIndex([['A'] * 10, range(10)], [[0] * 10, range(10)]) | ||
# And that using set_levels with duplicate levels fails |
blank line
@jreback Okay, addressed those issues |
thanks @cmazzullo |
No problem |
git diff upstream/master -u -- "*.py" | flake8 --diff