DOC: Wrong generalization about slice indexing #57277

Closed
1 task done
memeplex opened this issue Feb 6, 2024 · 2 comments · Fixed by #58017
Labels: Docs, Indexing (Related to indexing on series/frames, not to indexes themselves)

Comments

memeplex commented Feb 6, 2024

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/pandas-docs/version/2.0.2/user_guide/indexing.html#slicing-ranges

Documentation problem

The docs state: "With Series, the syntax works exactly as with an ndarray, returning a slice of the values and the corresponding labels."

They then show examples of slice indexing df[i:j] where i and j are integers, all of which behave like df.iloc[i:j] (that is, "exactly as with an ndarray").

But there is an exception (which I can't make sense of and fortunately is deprecated):

> pd.Series([1, 2, 3], [3., 2., 1.])[:3]

3.0    1
dtype: int64

So, when the index is float, integer slices behave loc-like. Please document this exception in a callout because it's dangerous to assume that the iloc-like behavior is general.
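
To make the difference concrete, here is a minimal sketch that sidesteps the bare s[i:j] form and uses the explicit accessors, whose meaning does not depend on the index dtype. The variable names are mine and only for illustration; the Series is the one from the example above.

```python
import pandas as pd

# Same Series as in the report: float labels in decreasing order.
s = pd.Series([1, 2, 3], index=[3.0, 2.0, 1.0])

# Explicitly positional: the first three rows, whatever their labels are.
positional = s.iloc[:3]

# Explicitly label-based: everything up to and including the label 3.0,
# which here is only the first row.
by_label = s.loc[:3]

print(positional.tolist())  # [1, 2, 3]
print(by_label.tolist())    # [1]
```

With a float index, the bare s[:3] from the report matches the .loc result rather than the .iloc one, which is exactly the surprise described here.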

BTW, could someone give a rationale for this behavior? Is there a general rule that ends up in this weird situation, like "if index and slice are int -> iloc, if not then if slice is of the type of the index -> loc else if slice is int -> iloc else fail", and then an int slice is taken to be of the type of a float index?

Suggested fix for documentation

Document this exception in a callout because it's dangerous to assume that the iloc-like behavior is general.

memeplex added the Docs and Needs Triage labels Feb 6, 2024

memeplex commented Feb 6, 2024

"BTW, could someone give a rationale for this behavior?"

I guess the convoluted history told in #49612 explains it:

  1. Initially the intention was to make all int-slice indexing label-based, but the deprecations weren't sufficient, and by that point doing so would have broken many things.
  2. So it was decided that it was OK for int-slice indexing to be consistent in itself (always position-based), despite being somewhat inconsistent with other types of indexing (s[i:j] iloc-like vs. s[i] or s[[i,j]] loc-like).
  3. But float indexes were/are an exception for some historical reason (out of curiosity, which one?), so this case was deprecated.
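
The inconsistency mentioned in point 2 can be seen with a plain integer index. A small sketch, where the Series and variable names are made up for illustration:

```python
import pandas as pd

# Integer labels deliberately out of step with the positions.
s = pd.Series([10, 20, 30], index=[2, 1, 0])

by_slice = s[0:2]     # positional: the first two rows (labels 2 and 1)
by_scalar = s[0]      # label-based: the row labelled 0
by_list = s[[0, 1]]   # label-based: the rows labelled 0 and 1

print(by_slice.tolist())  # [10, 20]
print(by_scalar)          # 30
print(by_list.tolist())   # [30, 20]
```

So the same integers pick different rows depending on whether they appear in a slice, as a scalar, or in a list.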

That said, I believe that the documentation should be explicit about this exception until the deprecation is actually enforced.

rhshadrach added the Indexing label and removed the Needs Triage label Feb 7, 2024
@shriyakalakata
Contributor

take

3 participants