Groupby with matching column and index name emits spurious warning #17383

Closed
TomAugspurger opened this issue Aug 30, 2017 · 4 comments · Fixed by #17843

Comments

@TomAugspurger
Contributor

xref #14432

Which of these should raise a FutureWarning? I think the idea was to use pd.Grouper(name) to disambiguate. In that case, In [21] should not raise a warning.

In [17]: df = pd.DataFrame({"A": [1] * 5 + [2] * 5, "B": ['a', 'b'] * 5, 'C': range(10)}, index=pd.Index(range(10), name='A'))

In [19]: _ = df.groupby('A').mean()
/Users/taugspurger/.virtualenvs/pandas-dev/bin/ipython:1: FutureWarning: 'A' is both a column name and an index level.
Defaulting to column but this will raise an ambiguity error in a future version
  #!/Users/taugspurger/Envs/pandas-dev/bin/python3.6

In [20]: _ = df.groupby(['A']).mean()
/Users/taugspurger/.virtualenvs/pandas-dev/bin/ipython:1: FutureWarning: 'A' is both a column name and an index level.
Defaulting to column but this will raise an ambiguity error in a future version
  #!/Users/taugspurger/Envs/pandas-dev/bin/python3.6

In [21]: _ = df.groupby(pd.Grouper('A')).mean()
/Users/taugspurger/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/groupby.py:1699: FutureWarning: 'A' is both a column name and an index level.
Defaulting to column but this will raise an ambiguity error in a future version
  return klass(obj, by, **kwds)

In [22]: _ = df.groupby([pd.Grouper('A')]).mean()
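For reference, here is a minimal sketch of the explicit disambiguation discussed above. It assumes that `pd.Grouper` accepts the `key=` and `level=` keyword arguments and that passing one of them is enough to select the column or the index level respectively. Whether a warning or error is emitted for the plain `df.groupby('A')` form depends on the pandas version, so that form is left out here. The sketch aggregates only column `C` so that the non-numeric column `B` does not get in the way.

```python
import pandas as pd

# Same frame as in the report: "A" is both a column and the index name.
df = pd.DataFrame(
    {"A": [1] * 5 + [2] * 5, "B": ["a", "b"] * 5, "C": range(10)},
    index=pd.Index(range(10), name="A"),
)

# Explicitly group by the column "A" (two distinct values: 1 and 2).
by_column = df.groupby(pd.Grouper(key="A"))["C"].sum()

# Explicitly group by the index level "A" (ten distinct values: 0..9).
by_level = df.groupby(pd.Grouper(level="A"))["C"].sum()

print(by_column)
print(by_level)
```

The two results differ in length (two groups versus ten), which makes it easy to check which of the two "A"s was actually used.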

cc @jonmmease

@jonmmease
Contributor

Yeah, I think In [21] shouldn't raise a warning.

@jonmmease
Contributor

I'll try to take a look at this soon. Any reason why the df.groupby(grp_expr) operation shouldn't just always go down the df.groupby([grp_expr]) path?

I'd like to see if we could just always turn scalar by args into single element lists and get rid of the non-list logic. Thoughts?
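To make the idea concrete, a rough sketch of that normalization could look like the following. `as_key_list` is a hypothetical helper written for illustration only; it is not pandas' internal code and it ignores cases such as tuples, callables, and array-like keys.

```python
def as_key_list(by):
    """Hypothetical helper: wrap a scalar grouping key in a one-element list.

    Lists are returned unchanged, so df.groupby(x) and df.groupby([x])
    would both reach the same list-based code path.
    """
    if isinstance(by, list):
        return by
    return [by]


print(as_key_list("A"))         # scalar key becomes a one-element list
print(as_key_list(["A", "B"]))  # list keys pass through unchanged
```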

@jreback
Contributor

jreback commented Sep 12, 2017

A scalar and a list with a single arg should behave the same.
A separate PR for that would be fine.

@jonmmease
Contributor

@jreback @TomAugspurger See #17843
FYI, I didn't end up needing to make the change discussed above (treating all groupers as lists)

jreback modified the milestones: Next Major Release, 0.21.0 on Oct 14, 2017