Can't slice float indices, but can slice integer indices? #7501

mrjbq7 · 2014-06-18T20:26:40Z

Using pandas 0.14, the slicing changed strangely. I can slice rows using start/end with an integer index, but not a float index:

In [18]: df = pandas.DataFrame(np.random.randn(10, 5), index=np.arange(10, 20))

In [19]: df
Out[19]: 
           0         1         2         3         4
10 -1.878123 -0.581537  0.189536  0.173014  0.132059
11  1.229246 -0.988689  0.632404  0.939126 -0.186367
12  0.376735  0.329723  1.480293 -0.209164  0.080897
13  0.461558  0.303541 -0.669196 -1.032077  1.634512
14 -0.972455  0.657357 -0.566609  0.154165 -0.561543
15 -2.502244  1.022540 -1.019376 -0.934582  1.751852
16 -1.875567  0.504288 -0.524922  0.048277 -1.587904
17  0.636652  0.441224 -1.391552  0.650876  0.374673
18 -1.503102  0.822411  1.776667 -0.879583  1.035291
19 -0.620467  0.319855 -0.779280 -0.168827  0.502470

In [20]: df[3:5]
Out[20]: 
           0         1         2         3         4
13  0.461558  0.303541 -0.669196 -1.032077  1.634512
14 -0.972455  0.657357 -0.566609  0.154165 -0.561543

In [21]: df.index = [float(x) for x in df.index]

In [22]: df
Out[22]: 
           0         1         2         3         4
10 -1.878123 -0.581537  0.189536  0.173014  0.132059
11  1.229246 -0.988689  0.632404  0.939126 -0.186367
12  0.376735  0.329723  1.480293 -0.209164  0.080897
13  0.461558  0.303541 -0.669196 -1.032077  1.634512
14 -0.972455  0.657357 -0.566609  0.154165 -0.561543
15 -2.502244  1.022540 -1.019376 -0.934582  1.751852
16 -1.875567  0.504288 -0.524922  0.048277 -1.587904
17  0.636652  0.441224 -1.391552  0.650876  0.374673
18 -1.503102  0.822411  1.776667 -0.879583  1.035291
19 -0.620467  0.319855 -0.779280 -0.168827  0.502470

In [23]: df[3:5]
Out[23]: 
Empty DataFrame
Columns: [0, 1, 2, 3, 4]
Index: []

In [24]: df.index
Out[24]: Float64Index([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0], dtype='float64')

In [25]: pandas.__version__
Out[25]: '0.14.0'

Was this an intentional change, or a bug?

hayd · 2014-06-18T20:40:56Z

Is this supposed to act like loc or ix, or something else?

Behaviour of In [20]: df[3:5] seems so wrong/unstable!

cpcloud · 2014-06-18T20:41:14Z

[] is supposed to act like ix IIRC

cpcloud · 2014-06-18T20:42:12Z

In which case, this is a bug.

mrjbq7 · 2014-06-18T20:42:47Z

If you're suggesting slicing by rows is undesirable, what is the new way to do that?

cpcloud · 2014-06-18T20:43:14Z

You should use loc for label-based indexing and use iloc for position based indexing

jreback · 2014-06-18T20:43:56Z

@mrjbq7 You are slicing the columns

df.loc[10:12] works just fine, or df.iloc[3:5] as well

or df.loc[10.0:12.0]
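
A minimal sketch of the two explicit spellings suggested here, assuming a frame shaped like the one in the report but built with a float index from the start. The names `by_label` and `by_position` are illustrative only, and the data is deterministic rather than random so the result is easy to check.

```python
import numpy as np
import pandas as pd

# Same shape as the frame in the report, with float labels 10.0 .. 19.0.
df = pd.DataFrame(
    np.arange(50).reshape(10, 5),
    index=np.arange(10, 20, dtype="float64"),
)

# Label-based: both endpoints are labels, and both are included.
by_label = df.loc[10.0:12.0]

# Position-based: rows at positions 3 and 4; the end position is excluded.
by_position = df.iloc[3:5]

print(by_label.index.tolist())     # [10.0, 11.0, 12.0]
print(by_position.index.tolist())  # [13.0, 14.0]
```

Neither call depends on how plain `df[3:5]` is interpreted, so the result should not change with the index dtype.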

mrjbq7 · 2014-06-18T20:44:36Z

@jreback the In[20] shows the results of slicing rows, however.

cpcloud · 2014-06-18T20:44:42Z

@jreback he's slicing the rows

mrjbq7 · 2014-06-18T20:45:11Z

Okay, I can use iloc, however the behavior still seems buggy and inconsistent! :)

cpcloud · 2014-06-18T20:45:56Z

it's definitely a bug, probably introduced by the small (ish) refactoring of Float64Index to use the float64 dtype instead of object.

jreback · 2014-06-18T20:46:02Z

In[20] has always been that way, it's a fallback indexing

jreback · 2014-06-18T20:46:19Z

though there is a bug there somewhere, @cpcloud ?

cpcloud · 2014-06-18T20:46:36Z

on it 😄

mrjbq7 · 2014-06-18T20:52:20Z

P.S. you guys are awesomely responsive. Thanks.

jorisvandenbossche · 2014-06-18T20:58:11Z

So slicing in [] as df[x:y] is always slicing by location, even if the integer labels overlap with the positions? (In other words, df[x:y] is actually equivalent to df.iloc[x:y]?)
Is this mentioned somewhere in the docs? I don't see it in this section: http://pandas.pydata.org/pandas-docs/stable/indexing.html#slicing-ranges, and it is indeed a bit counterintuitive, since df.ix[x:y] tries label-based first (so in the above example it gives an empty frame).

cpcloud · 2014-06-18T20:58:53Z

no, if the labels are there, then it will use loc; otherwise it will try using iloc

jorisvandenbossche · 2014-06-18T20:59:50Z

That does not seem to be the case:

In [7]: df = pandas.DataFrame(np.random.randn(10, 5), index=np.arange(5, 15))

In [8]: df
Out[8]:
           0         1         2         3         4
5   1.027523 -0.481625  0.525546 -0.604405 -1.644525
6  -0.568643  0.385232 -0.661878  0.373214  2.326299
7  -1.163296 -0.118817 -1.528926 -1.937901  0.659142
8   1.138747  0.480652 -1.105340 -0.151181 -0.100053
9  -0.065683 -0.755676  0.578010 -0.350439  0.446478
10  0.035460  1.164672  0.489051  0.289033  0.309896
11  1.250149  0.032059  1.687558 -1.313212  0.645179
12 -1.393927 -0.903836 -2.174578 -0.206523 -1.483739
13  1.313273  1.569998 -0.326552  0.955845  0.138290
14 -0.629166  0.861509 -0.057021  1.336045  0.207536

In [9]: df[5:7]
Out[9]:
           0         1         2         3         4
10  0.035460  1.164672  0.489051  0.289033  0.309896
11  1.250149  0.032059  1.687558 -1.313212  0.645179
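
For comparison, a small sketch of the same integer-indexed case written with explicit indexers. It assumes deterministic data in place of the random frame above; `by_label` and `by_position` are illustrative names.

```python
import numpy as np
import pandas as pd

# Integer labels 5 .. 14, so the labels overlap with valid positions.
df = pd.DataFrame(np.arange(50).reshape(10, 5), index=np.arange(5, 15))

by_label = df.loc[5:7]      # labels 5, 6 and 7 (end label included)
by_position = df.iloc[5:7]  # positions 5 and 6 (end position excluded)

print(by_label.index.tolist())     # [5, 6, 7]
print(by_position.index.tolist())  # [10, 11]
```

The positional result matches Out[9] above, which is why df[5:7] there looks like iloc rather than loc.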

cpcloud · 2014-06-18T21:00:31Z

huh. guess I'm wrong then. imho this style of indexing should be banned for life

mrjbq7 · 2014-06-18T21:00:40Z

It's old syntax, which has worked in my application for several pandas versions and I only noticed just now that it was strangely not working.

cpcloud · 2014-06-18T21:01:06Z

this was a change in 0.14 ... i refactored some of the "guessing" code and didn't catch this case

jreback · 2014-06-18T21:01:18Z

[x:y] is a convenience slice on the rows (it's basically .ix[x:y]), but pretty odd if you ask me
but since it has been there a long time we left it

jorisvandenbossche · 2014-06-18T21:04:07Z

So the bug is that the FloatIndex is not doing this location based slicing but tries label-based? (which is more logical, but inconsistent with the other indexers)

@jreback it's not like .ix[x:y] as ix will first try label based and fall back to integer location, while df[x:y] only tries integer location:

In [14]: df[3:5]
Out[14]:
           0         1         2         3         4
13  0.098103 -0.290480  0.716710 -0.533959 -0.890271
14 -0.738622  0.325792  1.106741  0.442422 -1.087715

In [15]: df.ix[3:5]
Out[15]:
Empty DataFrame
Columns: [0, 1, 2, 3, 4]
Index: []

cpcloud · 2014-06-18T21:04:52Z

@jorisvandenbossche i'm glad you found this, except that now we have a 4th style of indexing to support .... :)

jreback · 2014-06-18T21:06:10Z

hmm, by definition on Float64Index this cannot do integer based, and MUST be label based. This whole 4th type is very odd if you ask me.

cpcloud · 2014-06-18T21:07:38Z

so we have

ix default to labels
mystery_meat i.e., [] default to pos
loc labels only
iloc pos only

there's a strangely beautiful symmetry to this mess

jreback · 2016-04-19T12:20:25Z

I believe most of the oddness was fixed in 0.18.0 by the float slicing fixup, so I'm going to revisit this.

jorisvandenbossche · 2016-04-19T12:36:08Z

@jreback What do you mean by 'revisit'? I don't think anything has changed regarding the original issue here ([] being label based for FloatIndex, while location based for IntIndex)

jreback · 2016-04-19T12:37:55Z

no, some of the 'other' issues are fixed (IOW, the slicing is all now consistent), e.g.

In [23]: s = Series(np.arange(5), index=np.arange(5) * 2.5, dtype=np.int64)

In [24]: s
Out[24]: 
0.0     0
2.5     1
5.0     2
7.5     3
10.0    4
dtype: int64

In [25]: s[2:5]
Out[25]: 
2.5    1
5.0    2
dtype: int64

In [26]: s[2.0:5.0]
Out[26]: 
2.5    1
5.0    2
dtype: int64
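
As a point of comparison, a short sketch of the same Series sliced with explicit indexers, which makes the label/position choice visible. `by_label` and `by_position` are illustrative names, not part of the session above.

```python
import numpy as np
import pandas as pd

# Float labels 0.0, 2.5, 5.0, 7.5, 10.0 with values 0 .. 4.
s = pd.Series(np.arange(5), index=np.arange(5) * 2.5)

by_label = s.loc[2.0:5.0]  # labels that fall within [2.0, 5.0]
by_position = s.iloc[2:5]  # positions 2, 3 and 4

print(by_label.index.tolist())     # [2.5, 5.0]
print(by_position.index.tolist())  # [5.0, 7.5, 10.0]
```

The label-based result is the one shown in Out[25] and Out[26].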

jreback · 2017-03-23T13:50:52Z

we should prob close this issue and open a new clarifying issue about what the problems are with FloatIndex slicing.

jreback · 2017-09-23T21:12:03Z

can someone parse this issue and see if we should open an issue w.r.t. float slicing?

jorisvandenbossche · 2017-09-25T19:21:41Z

The original reported 'issue' is still present (or better: 'debatable behaviour')

To summarize in a specific way (disregarding the .ix part of the above discussion, as that is deprecated anyway): for all index types, using integers in [] (__getitem__) is positional (like iloc), except for Float64Index, making this a special case.
So also for an integer index, [] is positional and not label based.

I think it would be nice to make int index and float index consistent for []. Making Int64Index label based would be a change with a lot of impact, changing Float64Index to do positional (when using integers, when using floats it stays label-based) would be easier I think.
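
The rule proposed here can be sketched as a small function. This is only an illustration of the proposal, not how pandas implements `[]`; `proposed_row_slice` is a hypothetical helper name.

```python
import numbers

import numpy as np
import pandas as pd


def proposed_row_slice(obj, start, stop):
    """Hypothetical helper sketching the proposed rule; not a pandas API."""
    bounds = [b for b in (start, stop) if b is not None]
    if all(isinstance(b, numbers.Integral) for b in bounds):
        # Integer (or open-ended) bounds: positional, like iloc.
        return obj.iloc[start:stop]
    # At least one non-integer bound: label-based, like loc.
    return obj.loc[start:stop]


s = pd.Series(np.arange(5), index=np.arange(5) * 2.5)

print(proposed_row_slice(s, 2, 5).index.tolist())      # [5.0, 7.5, 10.0]
print(proposed_row_slice(s, 2.0, 5.0).index.tolist())  # [2.5, 5.0]
```

Under this rule the type of the slice bounds, rather than the dtype of the index, decides between positional and label-based slicing.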

TomAugspurger · 2019-12-30T14:01:18Z

Pushing this off 1.0.

MarcoGorelli · 2020-01-27T15:58:42Z

> can someone parse this issue and see if we should open an issue w.r.t. float slicing?

Looks like it's been opened here #31344

jbrockmendel · 2020-04-15T22:12:59Z

Reading over this thread I was briefly optimistic "with ix gone this might be easier!" but nope

> I think it would be nice to make int index and float index consistent for []. Making Int64Index label based would be a change with a lot of impact, changing Float64Index to do positional (when using integers, when using floats it stays label-based) would be easier I think.

I agree with @jorisvandenbossche here.

I'd also be on board with a "nuke it from space" option to deprecate the ambiguous behaviors

cpcloud added Bug labels Jun 18, 2014

cpcloud added this to the 0.14.1 milestone Jun 18, 2014

cpcloud self-assigned this Jun 18, 2014

jreback removed this from the 0.16.0 milestone Mar 6, 2015

jreback modified the milestones: 0.18.2, Next Major Release Apr 19, 2016

jreback unassigned cpcloud Apr 19, 2016

jorisvandenbossche modified the milestones: 0.20.0, 0.19.0 Aug 15, 2016

jreback modified the milestones: 0.20.0, 0.21.0 Mar 28, 2017

jreback modified the milestones: 0.21.0, 1.0 Oct 2, 2017

TomAugspurger modified the milestones: 1.0, Contributions Welcome Dec 30, 2019

MarcoGorelli mentioned this issue Jan 27, 2020

Why [] operation on float index doesn't work the same way as int index? #31344

Closed

jbrockmendel mentioned this issue May 15, 2020

DEPR: deprecate Index.__getitem__ with float key #34191

Closed

This was referenced Jan 8, 2022

API: series_with_int64index[i:j] should be label-based #45162

Closed

DEPR: inconsistent series[i:j] slicing with Int64Index GH#45162 #45324

Merged

jreback modified the milestones: Contributions Welcome, 1.5 Jan 13, 2022

jreback closed this as completed in #45324 Jan 16, 2022
