na_position doesn't work for sort_index() with MultiIndex #15845
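For context, the behaviour named in the title can be reproduced with a small example. This is a minimal sketch, not code from the PR: the data, the level names and the variable names are illustrative assumptions, and only the public `sort_index(na_position=...)` call is taken from the discussion.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not from the PR): a two-level
# MultiIndex whose first level contains one missing key.
mi = pd.MultiIndex.from_tuples(
    [(2.0, "c"), (np.nan, "b"), (1.0, "a")], names=["x", "y"]
)
s = pd.Series([30, 20, 10], index=mi)

# The reported bug was that na_position had no effect for a MultiIndex.
na_first = s.sort_index(na_position="first")
na_last = s.sort_index(na_position="last")

print(na_first)
print(na_last)
```

On a pandas release that includes this fix, the row with the missing first-level key should come first in `na_first` and last in `na_last`.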
Codecov Report
@@ Coverage Diff @@
## master #15845 +/- ##
==========================================
- Coverage 90.97% 90.96% -0.02%
==========================================
Files 143 143
Lines 49442 49446 +4
==========================================
- Hits 44982 44977 -5
- Misses 4460 4469 +9
Codecov Report
@@ Coverage Diff @@
## master #15845 +/- ##
==========================================
- Coverage 90.79% 90.79% -0.01%
==========================================
Files 156 156
Lines 50534 50537 +3
==========================================
+ Hits 45883 45885 +2
- Misses 4651 4652 +1
@linebp I'd prefer if you did something like I illustrated here: #14784 (comment) Put these tests (and on Series as well) in |
I am not sure what I am missing, I tried to do as suggested. These are my steps figuring out what to do:
Where and how did I go wrong? I'll move the tests to a more appropriate location! ... and test the Series sorting too. |
this needs testing for Series (which won't work as you have changed the code)
pandas/core/frame.py
Outdated
@@ -3392,7 +3392,13 @@ def sort_index(self, axis=0, level=None, ascending=True, inplace=False,
if not labels.is_lexsorted():
labels = MultiIndex.from_tuples(labels.values)

indexer = lexsort_indexer(labels.labels, orders=ascending,
as I said, you can simply create a method on MultiIndex, called _lexsort_indexer (let's make it private), that returns the correct indexer.
pandas/core/frame.py
Outdated
from pandas.core.sorting import lexsort_indexer

# make sure that the axis is lexsorted to start
# if not we need to reconstruct to get the correct indexer
so leave the is_lexsorted check in here; this is fixed in #15694. w/o this, subtle things will break.
pandas/core/frame.py
Outdated
indexer = lexsort_indexer(labels.labels, orders=ascending,
na_position=na_position)
indexer = labels._lexsort_indexer(orders=ascending,
just do
indexer = lexsort_indexer(labels._get_labels_for_sorting(), orders=ascending,
na_position=na_position)
so IOW rename pandas.indexes.multi._lexsort_indexer (your new routines) to _get_labels_for_sorting
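To make the role of `na_position` in that call concrete, here is a pure-Python sketch of a lexicographic sort indexer over integer codes. It is a simplified stand-in written for illustration, not pandas' `lexsort_indexer`: the function name, the single boolean `ascending` and the convention that `-1` marks a missing key are assumptions of this sketch.

```python
def lexsort_indexer_sketch(label_arrays, ascending=True, na_position="last"):
    """Return row positions that sort rows lexicographically by integer codes.

    label_arrays: one list of int codes per level, all of equal length.
    A code of -1 marks a missing key (assumption of this sketch).
    """
    if na_position not in ("first", "last"):
        raise ValueError("na_position must be 'first' or 'last'")
    n_rows = len(label_arrays[0]) if label_arrays else 0

    def row_key(i):
        key = []
        for codes in label_arrays:
            code = codes[i]
            is_na = code == -1
            # The NA rank is decided by na_position only, so missing keys
            # stay at the requested end whatever the sort direction is.
            if na_position == "first":
                na_rank = 0 if is_na else 1
            else:
                na_rank = 1 if is_na else 0
            value = 0 if is_na else (code if ascending else -code)
            key.append((na_rank, value))
        return tuple(key)

    # sorted() is stable, so ties keep their original relative order.
    return sorted(range(n_rows), key=row_key)


codes = [[1, -1, 0], [0, 1, 2]]
first = lexsort_indexer_sketch(codes, na_position="first")
last = lexsort_indexer_sketch(codes, na_position="last")
desc_last = lexsort_indexer_sketch(codes, ascending=False, na_position="last")
```

Keeping the NA rank separate from the value makes the placement of missing keys independent of `ascending`; that is a design choice of this sketch.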
pandas/core/series.py
Outdated
@@ -1763,8 +1763,8 @@ def sort_index(self, axis=0, level=None, ascending=True, inplace=False,
new_index, indexer = index.sortlevel(level, ascending=ascending,
sort_remaining=sort_remaining)
elif isinstance(index, MultiIndex):
from pandas.core.sorting import lexsort_indexer
make the same change as above
pandas/indexes/multi.py
Outdated
# else:
# mi = self

keys = []
rename to _get_labels_for_sorting()
add a doc-string.
rename keys -> labels
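One way to picture what a helper like `_get_labels_for_sorting` has to return: per-level integer codes whose numeric order matches the order of the level values, with missing keys kept as `-1`. The sketch below is hypothetical and written in plain Python; the function name and its `levels`/`codes` arguments are assumptions for illustration, not the method added in this PR.

```python
def labels_for_sorting_sketch(levels, codes):
    """Remap per-level codes so that their numeric order follows the
    sorted order of the level values. -1 (missing) is left untouched.

    levels: one list of distinct level values per level.
    codes:  one list of int codes per level, indexing into that level.
    """
    remapped = []
    for level_values, level_codes in zip(levels, codes):
        # Positions of the level values in sorted order.
        order = sorted(range(len(level_values)), key=level_values.__getitem__)
        # Map each old code to its rank among the sorted level values.
        rank = {old: new for new, old in enumerate(order)}
        remapped.append([-1 if c == -1 else rank[c] for c in level_codes])
    return remapped


# The level values are stored unsorted here ("b" before "a"), so the raw
# codes would sort in the wrong order without the remapping.
result = labels_for_sorting_sketch(levels=[["b", "a"]], codes=[[0, 1, -1]])
```

This also illustrates why the lexsorted check matters in the discussion above: raw codes only sort correctly when the level values themselves are in sorted order.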
so, add the Panels seems to have a function similar to If you use the |
@linebp this needs to be very targeted. don't touch Panel and don't worry about level. instead of a it is very easy to break things w/o very specific patches. |
@linebp maybe I wasn't clear. These ALL the existing code, except for
where so you are adding 1 function to MultiIndex, and changing 1 line in series & 1 line in frame.py. that's it. |
I am still a bit confused about what to do, but this adds 1 function and only changes 1 line in series and frame each. No comments on the tests, so I let those stay as is. |
ok, pls rebase this as I pushed in a bunch of fixes for sorting (generally) |
I made an effort to rebase, how is it looking now? |
@linebp code looks good. pls add a whatsnew (bug fix). make sure you rebase on origin (there are still conflicts showing)
then push with |
It fails when
|
@linebp I pushed a fix to your branch. can you add the above as an additional test? ping on green. |
closes pandas-dev#14784 Author: Line Pedersen <linebp@users.noreply.github.com> Closes pandas-dev#15845 from linebp/json_normalize_seperator and squashes the following commits: 66f809e [Line Pedersen] BUG GH14784 na_position doesn't work for sort_index() with MultiIndex
git diff upstream/master --name-only -- '*.py' | flake8 --diff