Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: groupby resample different results with .agg() vs .mean() #37905

Merged
merged 15 commits into from
Dec 22, 2020

Conversation

jalmaguer
Copy link
Contributor

@jalmaguer jalmaguer commented Nov 17, 2020

@pep8speaks
Copy link

pep8speaks commented Nov 17, 2020

Hello @jalmaguer! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-12-21 05:42:45 UTC

@jreback jreback changed the title BUG: fix GH33548 BUG: groupby resample different results with .agg() vs .mean() Nov 17, 2020
doc/source/whatsnew/v1.2.0.rst Outdated Show resolved Hide resolved
pandas/core/groupby/grouper.py Outdated Show resolved Hide resolved
pandas/core/groupby/grouper.py Outdated Show resolved Hide resolved
@jreback jreback added Bug Groupby Resample resample method labels Nov 17, 2020
@jreback
Copy link
Contributor

jreback commented Nov 17, 2020

some CI Checks errors

@jalmaguer jalmaguer requested a review from jreback November 17, 2020 03:52
pandas/core/groupby/grouper.py Show resolved Hide resolved
pandas/core/groupby/grouper.py Outdated Show resolved Hide resolved
@jalmaguer jalmaguer requested a review from jreback November 17, 2020 15:00
@jalmaguer
Copy link
Contributor Author

@jreback I made some commits a few days ago since your last comments so I just wanted to bump this pull request so that it doesn't get lost and forgotten about.

pandas/core/groupby/grouper.py Outdated Show resolved Hide resolved
@jalmaguer jalmaguer requested a review from jreback November 24, 2020 18:36
pandas/core/groupby/grouper.py Outdated Show resolved Hide resolved
# pandas\core\groupby\grouper.py:348: error: Item "None" of
# "Optional[Any]" has no attribute "take" [union-attr]
ax = self._grouper.take(obj.index) # type: ignore[union-attr]
# Sometimes self._grouper will have been resorted while
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you track down exactly how this happens, this is a real problem, somthing is keeping state and now we are undoing it. Your fix is pretty clean, but this makes reasoning very about this very hard.

Copy link
Contributor Author

@jalmaguer jalmaguer Nov 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s very complicated and you have to dig through a stack trace that is 18 method calls deep to understand it but I believe it’s because self._grouper which is meant to be the TimeGrouper though in this case is actually just a sorted DatetimeIndex is always sorted and obj if you go up the stack trace you’ll see that it is coming from a SeriesGroupby._selected_obj which I’m guessing just doesn’t get sorted when it’s created. The case where self._grouper has not been resorted I believe only happens when the dates were already in order and don’t need to be resorted.

@jalmaguer jalmaguer requested a review from jreback November 29, 2020 23:47
@jreback
Copy link
Contributor

jreback commented Dec 4, 2020

can you merge master (been lots of changes in groupby recently)

@jalmaguer
Copy link
Contributor Author

can you merge master (been lots of changes in groupby recently)

@jreback Sure no problem. Just merged it. Looks like it should be good to go.

@jalmaguer
Copy link
Contributor Author

@jreback Just bumping this so it doesn't get forgotten

@jreback
Copy link
Contributor

jreback commented Dec 21, 2020

ok we had some patches which might have fixed this

can u merge master (and also test if your added tests pass w/o the patch)

@jalmaguer
Copy link
Contributor Author

@jreback Just merged master. Looks like my added tests do not pass without the patch.

@jreback jreback added this to the 1.2 milestone Dec 22, 2020
@jreback jreback merged commit fc1df2e into pandas-dev:master Dec 22, 2020
@jreback
Copy link
Contributor

jreback commented Dec 22, 2020

thanks @jalmaguer

@jreback
Copy link
Contributor

jreback commented Dec 22, 2020

@meeseeksdev backport 1.2.x

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Dec 22, 2020
jreback pushed a commit that referenced this pull request Dec 22, 2020
…() vs .mean() (#38633)

Co-authored-by: Jose <jalmague@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: groupby resample different results with .agg() vs .mean()
3 participants