BUG: Regression in .loc accepting a boolean Index as an indexer #17738
Codecov Report
@@ Coverage Diff @@
## master #17738 +/- ##
==========================================
- Coverage 91.22% 91.2% -0.02%
==========================================
Files 163 163
Lines 49813 49813
==========================================
- Hits 45442 45433 -9
- Misses 4371 4380 +9
@jreback This is not a regression bug fix, but a backwards-incompatible API change. I'm not saying that we shouldn't do this, but we have to take a bit more care in thinking about it (and possibly deprecate, although given that the previous behaviour of using it as labels was broken in the latest release, that may not be needed), and we should certainly note it in the whatsnew as such.
Actually, it was only broken in master, not in the latest release. So to clarify what I mean, before this change the boolean values were interpreted as labels:
which is not useful, so I agree that changing this is a good idea.
@jorisvandenbossche oh, I thought this was broken in 0.20.3; it's a regression nonetheless. Your example is unambiguous. An Index of booleans is de facto the same as a boolean ndarray; it has no other interpretation than as a boolean indexer (a boolean I'll remove the whatsnew note then.
I think some refactoring broke this (and it was not explicitly tested).
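For readers following along, here is a minimal sketch of the equivalence being argued for, assuming a pandas version that includes this fix. The frame, its labels, and the `"alpha" in x` predicate are made up for illustration; the point is only that a boolean `Index` (for example, one produced by `Index.map`) and the equivalent boolean ndarray should select the same rows through `.loc`.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row labels are strings, so the booleans below
# cannot be confused with labels of this frame.
df = pd.DataFrame(
    {"x": [1, 2, 3, 4]},
    index=["alpha_0", "alpha_1", "beta_0", "beta_1"],
)

# Index.map returns an Index of booleans here, not an ndarray.
mask_index = df.index.map(lambda label: "alpha" in label)

# The same mask as a plain boolean ndarray.
mask_array = np.asarray(mask_index, dtype=bool)

by_index = df.loc[mask_index]  # boolean Index used as a mask
by_array = df.loc[mask_array]  # boolean ndarray used as a mask
```

With the fix in place, `by_index` and `by_array` should both contain the two `alpha_*` rows.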
@jorisvandenbossche actually, looking at your example [53] again, I think that is completely broken and not expected behavior at all. A boolean indexer is by definition not label based. I guess I could add a note about that.
@jorisvandenbossche actually, that is even moar broken [53]. It's interpreting the bools as labels, which again is completely wrong.
As I said, I agree the behaviour was kind of 'broken' (or at least strange) and good to fix, but I still wouldn't call this a regression fix (it has been like that for a long time). I can agree with calling it a 'bug fix' for the integer case, though. But for the boolean index case it really is a change in behaviour, because there the original behaviour made some sense:
@jorisvandenbossche ok, I'll add a note. We don't have first-class support for a boolean index anyhow (especially when it's duplicated, it's really weird).
But we could also keep the behaviour (so not use boolean indexing when the (inferred) type of the index is bool)
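To make the ambiguity concrete, here is a small sketch (not taken from the thread) of an object whose own index holds booleans. The series and its values are hypothetical, and the behaviour described is what I would expect from a pandas version that treats every boolean array-like key as a mask.

```python
import numpy as np
import pandas as pd

# Hypothetical series labelled by booleans.
s = pd.Series([1, 2], index=[True, False])

key = np.array([False, True])

# Mask interpretation: keep position 1 only, i.e. the row labelled False.
as_mask = s.loc[key]

# Label interpretation, emulated explicitly by position lookup: the rows
# labelled False and True, in that order. This is what "booleans as
# labels" would mean for the same key.
as_labels = s.iloc[s.index.get_indexer(key)]
```

The two readings give different results for the same key (`[2]` versus `[2, 1]`), which is why the choice has to be made one way or the other for a boolean-labelled index.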
* 'master' of github.com:pandas-dev/pandas: (188 commits)
  Separate out _convert_datetime_to_tsobject (pandas-dev#17715)
  DOC: remove whatsnew note for xref pandas-dev#17131
  BUG: Regression in .loc accepting a boolean Index as an indexer (pandas-dev#17738)
  DEPR: Deprecate cdate_range and merge into bdate_range (pandas-dev#17691)
  CLN: replace %s syntax with .format in pandas.core: categorical, common, config, config_init (pandas-dev#17735)
  Fixed the memory usage explanation of categorical in gotchas from O(nm) to O(n+m) (pandas-dev#17736)
  TST: add backward compat for offset testing for pickles (pandas-dev#17733)
  remove unused time conversion funcs (pandas-dev#17711)
  DEPR: Deprecate convert parameter in take (pandas-dev#17352)
  BUG:Time Grouper bug fix when applied for list groupers (pandas-dev#17587)
  BUG: Fix some PeriodIndex resampling issues (pandas-dev#16153)
  BUG: Fix unexpected sort in groupby (pandas-dev#17621)
  DOC: Fixed typo in documentation for 'pandas.DataFrame.replace' (pandas-dev#17731)
  BUG: Fix series rename called with str altering name rather index (GH17407) (pandas-dev#17654)
  DOC: Add examples for MultiIndex.get_locs + cleanups (pandas-dev#17675)
  Doc improvements for IntervalIndex and Interval (pandas-dev#17714)
  BUG: DataFrame sort_values and multiple "by" columns fails to order NaT correctly
  Last of the timezones funcs (pandas-dev#17669)
  Add missing file to _pyxfiles, delete commented-out (pandas-dev#17712)
  update imports of DateParseError, remove unused imports from tslib (pandas-dev#17713)
  ...
This is very awkward: an Index is virtually an ndarray, except that it works with all types. For it not to work 'as expected' in indexing would be odd (and would instead enable a special case: selecting out of a non-supported boolean Index with True/False values).
closes #17131