-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG fixes tuple agg issue 18079 #18354
Conversation
Codecov Report
@@ Coverage Diff @@
## master #18354 +/- ##
=======================================
Coverage 91.33% 91.33%
=======================================
Files 163 163
Lines 49801 49801
=======================================
Hits 45487 45487
Misses 4314 4314
Continue to review full report at Codecov.
|
doc/source/whatsnew/v0.21.1.txt
Outdated
@@ -62,6 +62,7 @@ Bug Fixes | |||
- Bug in ``pd.Series.rolling.skew()`` and ``rolling.kurt()`` with all equal values has floating issue (:issue:`18044`) | |||
- Bug in ``pd.DataFrameGroupBy.count()`` when counting over a datetimelike column (:issue:`13393`) | |||
- Bug in ``pd.concat`` when empty and non-empty DataFrames or Series are concatenated (:issue:`18178` :issue:`18187`) | |||
- Bug in :class:`NDFrameGroupBy` fixes ValueError: no results error when grouping by a single column and aggregating with a tuple (:issue:`18079`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just say something like "Bug when grouping by a single column and aggregating with a class like list
or tuple
" . I don't recall if NDFRameGroupBy is in the public API.
pandas/tests/groupby/test_groupby.py
Outdated
assert_frame_equal(result, expected) | ||
|
||
result = df.groupby('A')['C'].aggregate(tuple) | ||
expected = pd.Series([(1, 1, 1), (3, 4, 4)], index=[1, 3]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be able to pass name='C'
here.
@TomAugspurger's comment on whats new prompted me to add a test for This works This doesn't. Should I submit a separate issue for this or investigate and fix this new error as part of this PR? Click to see full traceback---------------------------------------------------------------------------Traceback (most recent call last): During handling of the above exception, another exception occurred: Traceback (most recent call last): |
pandas/tests/groupby/test_groupby.py
Outdated
# Issue #18079 | ||
df = pd.DataFrame({'A': [1, 1, 1, 3, 3, 3], | ||
'B': [1, 1, 1, 4, 4, 4], 'C': [1, 1, 1, 3, 4, 4]}) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to test_aggregate
can you parametrize with list-likes here (e.g. tuple/list,np.array,Series)
some of these might have different output (Series) but let's see.
you may have to have a tests with the parameterized ones and another one with 'other' things, e.g.
def f(x):
return tuple(x)
which work now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback, @jreback! I'm hoping to tackle the list bug and these additional tests later this week.
dc18825
to
5fa71e1
Compare
Line 2303 in 4fce784
I was a little leery doing this, but all tests passed. Note: Both np.array and pd.Series still error out with both scenarios (grouping by one column vs multiple columns), but with different error descriptions. That said, is it possible to proceed with this PR since both tuple and list now behave the same way with single and multiple column groupbys?
This might close #4293 as well |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small changes, otherwise lgtm. ping on green.
doc/source/whatsnew/v0.21.1.txt
Outdated
@@ -103,7 +103,7 @@ Groupby/Resample/Rolling | |||
- Bug in ``DataFrame.resample(...).apply(...)`` when there is a callable that returns different columns (:issue:`15169`) | |||
- Bug in ``DataFrame.resample(...)`` when there is a time change (DST) and resampling frequecy is 12h or higher (:issue:`15549`) | |||
- Bug in ``pd.DataFrameGroupBy.count()`` when counting over a datetimelike column (:issue:`13393`) | |||
- | |||
- Bug when grouping by a single column and aggregating with a class like`list` or `tuple` (:issue:`18079`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use double-backticks around list
and tuple
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this to 0.22 as well. (its not that I don't regard this as a bug fix, but this is something touching an important piece of code and want this to live in master for a bit).
Thanks @jreback! |
one more rebase :> |
2be5881
to
6639d48
Compare
Hello @bobhaffner! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on November 26, 2017 at 16:18 Hours UTC |
Hi @jreback I had conflicts (whatsnew) with the rebase, but I believe Its corrected now. That said, I'm puzzled by the pep8speaks comment regarding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like some extra files got added during the rebase. pls remove them. and push again.
doc/source/whatsnew/v0.21.1.txt.orig
Outdated
|
||
.. _whatsnew_0211.bug_fixes: | ||
|
||
Bug Fixes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this
grp.patch
Outdated
if isinstance(func_or_funcs, compat.string_types): | ||
return getattr(self, func_or_funcs)(*args, **kwargs) | ||
|
||
- if hasattr(func_or_funcs, '__iter__'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this
test_agg.py
Outdated
return list(x) | ||
|
||
#df = pd.DataFrame({'A' : [1, 1, 3], 'B' : [1, 2, 4]}) | ||
#result = df.groupby('A').aggregate(f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this
Yikes, my mistake. thanks Jeff. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple of more items to remove
grp_test.patch
Outdated
@@ -2725,3 +2725,12 @@ def _check_groupby(df, result, keys, field, f=lambda x: x.sum()): | ||
expected = f(df.groupby(tups)[field]) | ||
for k, v in compat.iteritems(expected): | ||
assert (result[k] == v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
patch needs removal
doc/source/whatsnew/v0.21.1.txt
Outdated
@@ -147,4 +147,4 @@ Other | |||
^^^^^ | |||
|
|||
- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you revert this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HI Jeff, I'm afraid my lack of git experience is showing as this file has been a bit of a thorn in my side. I'm concerned that my attempts to fix my mess will just result in wasting more of your time. Would you mind sharing the syntax necessary to revert (or reset?) this file to the proper commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure ill fix this upl
should be good to go. ping on green. |
Thanks Jeff! You're a lifesaver! |
thanks @bobhaffner |
Many thanks for the help and the patience, @jreback . Same to you @TomAugspurger |
tuple
as an argument #18079git diff upstream/master -u -- "*.py" | flake8 --diff