BUG: Use total_tie_count to normalize dense ranking #18297
As reported in #18296, for a `Series` with repeated values, `Series.rank(pct=True, method='dense').max()` may not be `<=1` as expected. This is due to the division of the ranks by the total number of elements in the `Series`, instead of the maximum rank assigned. Here we update the calculation.
I've confirmed that the updated code gives the expected result from the example in #18296.
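To make the two normalizations described above concrete, here is a minimal sketch of the arithmetic. It is not the Cython code touched by this PR; the sample values and the names `by_len` and `by_max` are made up for illustration. It computes plain dense ranks and then divides them once by the number of elements and once by the maximum rank assigned.

```python
import pandas as pd

# Hypothetical sample data with repeated values (not taken from #18296).
s = pd.Series([1, 1, 2, 2, 3])

# Dense ranks: ties share a rank and ranks increase by 1 between groups.
dense = s.rank(method="dense")  # 1, 1, 2, 2, 3

# Normalizing by the total number of elements.
by_len = dense / len(s)

# Normalizing by the maximum rank assigned (the number of distinct values).
by_max = dense / dense.max()

print(by_len.max())  # 0.6
print(by_max.max())  # 1.0
```

With this data, dividing by the element count leaves the largest percentile at 0.6, while dividing by the maximum rank puts it at exactly 1.0. What `s.rank(pct=True, method='dense')` itself returns depends on the installed pandas version.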
Codecov Report
@@ Coverage Diff @@
## master #18297 +/- ##
==========================================
- Coverage 91.4% 91.38% -0.02%
==========================================
Files 164 164
Lines 49880 49880
==========================================
- Hits 45592 45583 -9
- Misses 4288 4297 +9
how about some tests
see #15630 (and linked PR soln but has comments).
@@ -117,7 +117,7 @@ Reshaping
Numeric
^^^^^^^

-
- Fixed incorrect maximum :func:`Series.rank` percentile when using the `dense` method with repeated values (:issue:`18296`)
can you move to 0.23
closing as stale. ping if you want to update.
git diff upstream/master -u -- "*.py" | flake8 --diff