API/ENH: relabel method #15104
how would this work with groupby? (just thinking out loud)
Setup:
df = pd.DataFrame({'key': [1, 1, 2], 'value': [1, 2, 3]})
gb = df.groupby('key')
I can't think of a meaningful answer for what
In [17]: gb.sum()
Out[17]:
     value
key
1        3
2        3
In [18]: gb.sum().relabel(columns=['newvalue'])
Out[18]:
     newvalue
key
1           3
2           3
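For comparison, the chained result in the session above can be approximated with a method that exists in pandas today. This is only a sketch: `relabel` is the method proposed in this PR, while `set_axis` is an existing method, and the code assumes pandas 1.0 or later, where `set_axis` returns a new object by default. The variable names `summed` and `relabelled` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"key": [1, 1, 2], "value": [1, 2, 3]})

# Aggregate first; the result is an ordinary DataFrame indexed by "key".
summed = df.groupby("key").sum()

# Positional relabelling of the columns with an existing method.
# Assumption: pandas >= 1.0, where set_axis returns a new object by default.
relabelled = summed.set_axis(["newvalue"], axis="columns")

print(relabelled)
```

The relabelling is applied to the aggregated frame rather than to the groupby object, which matches the chaining shown in the comment.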
    Labels to construct new axis from - number of labels
    must match the length of the existing axis
copy : boolean, default True
    Also copy underlying data
I know we have this flag (copy) on .rename as well. I think these are confusing, though I'm not sure what a better API is for these (and copy=True, inplace=True is pretty much meaningless).
Yeah, not sure - I agree it's confusing. I'd be tempted to deprecate copy altogether, though if you know what you're doing I suppose it's useful.
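The length rule in the docstring under review can be illustrated with a small stand-alone helper. This is a hypothetical sketch, not the implementation from this PR and not a pandas API: the function name `relabel` and its signature are assumptions, and it always copies, which sidesteps the `copy`/`inplace` question raised above.

```python
import pandas as pd


def relabel(frame, index=None, columns=None):
    """Hypothetical stand-in for the proposed method; not a pandas API."""
    result = frame.copy()  # always copy, so the caller's frame is untouched
    for axis_name, labels in (("index", index), ("columns", columns)):
        if labels is None:
            continue
        labels = list(labels)
        expected = len(getattr(result, axis_name))
        # The number of labels must match the length of the existing axis.
        if len(labels) != expected:
            raise ValueError(
                f"{axis_name}: expected {expected} labels, got {len(labels)}"
            )
        setattr(result, axis_name, labels)
    return result


df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
out = relabel(df, columns=["x", "y"])
print(out)
```

Passing a list of the wrong length, for example `relabel(df, columns=["x"])`, raises `ValueError` in this sketch.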
See Also
--------
pandas.NDFrame.rename
Probably best to add pandas.Series.rename, pandas.DataFrame.rename, and pandas.Panel.rename here instead.
Current coverage is 84.75% (diff: 93.75%)
@@             master   #15104   diff @@
==========================================
  Files           145      145
  Lines         51220    51263    +43
  Methods           0        0
  Messages          0        0
  Branches          0        0
==========================================
+ Hits          43415    43448    +33
- Misses         7805     7815    +10
  Partials          0        0
@@ -1586,6 +1586,23 @@ If you create an index yourself, you can just assign it to the ``index`` field:

    data.index = index

.. versionadded:: 0.20.0

Alternatively, the :meth:`~DataFrame.relabel` can be used to assign new
are these underlined? or is that the diff?
It's the diff - no clue why it looks like that though
What do SQL, Spark, and R call this rename/relabel API? Not that we need to follow them, but consistency is not a bad thing.
dplyr and SQL would both do this through the
In base R it also looks like you can do label assignment via the
I suppose #12392 also needs consideration. From the discussion there, there was some preference towards moving towards a
Yes, we also need to take #12392 into consideration. For now, you only implemented the new feature (renaming all columns using a list-like) in
me as well, but I think there was a compromise in #12392 to have both possibilities at the same time. But let's leave that discussion over there. #12392 is something I would like to see happen for 0.20/1.0, so I can try to take a new look at that this weekend.
I view this differently than @jorisvandenbossche. This is like
Regarding the
ibis also calls this relabel: http://docs.ibis-project.org/generated/ibis.expr.api.TableExpr.relabel.html#ibis.expr.api.TableExpr.relabel
cc @pandas-dev/pandas-core
That is certainly also a possibility, but then I would keep
My thinking was that
I'm in favor of relabel gaining the dict and function behaviors of rename. I think relabel is the better name for changing the row or column labels. That leaves rename for changing the index and column names.
…_____________________________
From: chris-b1 <notifications@github.com>
Sent: Thursday, January 12, 2017 17:44
Subject: Re: [pandas-dev/pandas] API/ENH: relabel method (#15104)
To: pandas-dev/pandas <pandas@noreply.github.com>
Cc: Tom Augspurger <thomas-augspurger@uiowa.edu>, Manual <manual@noreply.github.com>
Or were there other reasons to introduce a relabel method?
My thinking was that rename is already too overloaded and can run into corner cases with a list-like #14829 (comment), but can't really be changed for back-compat.
set_labels wouldn't be bad either, although somewhat confusingly we have a set_axis that does the same thing, but inplace only.
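To make the distinction in this exchange concrete, the sketch below contrasts the existing pandas calls being discussed: `rename` with a mapping or a function, `set_axis` with a list-like, and `rename_axis` for the name of an axis rather than its labels. All four are existing pandas methods. The code assumes pandas 1.0 or later, where `set_axis` returns a new object by default instead of operating in place as described above, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Mapping: only the labels listed in the dict change.
by_dict = df.rename(columns={"a": "x"})

# Function: applied to every label on the axis.
by_func = df.rename(columns=str.upper)

# List-like: replaces all labels positionally.
# Assumption: pandas >= 1.0, where set_axis returns a new object.
by_list = df.set_axis(["x", "y"], axis="columns")

# Axis name: names the index itself and leaves the labels untouched.
named = df.rename_axis("row_id")

print(by_dict.columns.tolist(), by_func.columns.tolist(), by_list.columns.tolist())
```

Seen side by side, the list-like case is the one the proposed `relabel` was meant to cover.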
The name "labels" is nicely unambiguous about referring to BUT I am more inclined toward the solution #14829, which enables It's also worth noting that we do have an existing
CCing @MaximilianR in case he has ideas.
#14636 here is a proposal to change
I'm uncertain about the API here, and this specific change isn't solving a significant problem, so closing for now. Worth re-evaluating after, or as part of, #12392.
Alt to #15029
git diff upstream/master | flake8 --diff
small example