ENH: New short indexer for operating on values #14976
Comments
Why do you think this should work this way? The point is to align operations on the index by default.
This is basically the same as @shoyer's point in #10000 (comment), right? IIRC the current behavior of I think expecting df[['a', 'b']] * df['c'] to return
In [20]: df[['a', 'b']].mul(df['c'], axis=0)
Out[20]:
          a         b
0  1.726545  0.391649
1 -2.189975 -1.825123
2 -0.098067  0.015623
is perfectly reasonable. That said, this would be a big API change, with no clear way of deprecation.
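For illustration, a minimal sketch of the difference being discussed, assuming a small hypothetical frame with columns a, b and c (the data is invented, not taken from the thread). The plain `*` aligns the Series index against the frame's columns, so nothing matches, while `mul(..., axis=0)` aligns on the row index:

```python
import pandas as pd

# Hypothetical example data, chosen only to make the results easy to check.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, 5.0, 6.0],
    "c": [10.0, 20.0, 30.0],
})

# The row labels of df["c"] (0, 1, 2) are matched against the column
# labels ("a", "b"); no label is shared, so every entry is NaN.
naive = df[["a", "b"]] * df["c"]

# axis=0 matches df["c"] against the row index, giving the row-wise product.
rowwise = df[["a", "b"]].mul(df["c"], axis=0)

print(naive)
print(rowwise)
```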
In my experience, the best way to write such arithmetic currently is something like
I think this would be reasonable behavior to change for pandas 2.0 but probably not before. I'm not excited about the proposal here, which feels like a work-around for fundamentally broken broadcasting behavior rather than a fix of the root cause.
@shoyer if you want to create an issue for pandas 2, that would be great. Closing this one as no-action in pandas 1.0.
See wesm/pandas2#30
Thanks for the discussion here.
My proposal was, afaik, a minor fix for a common problem that people like me have now. But I've learned that even this addition would mean a lot of trouble and confusion for others. So I agree with @shoyer and @jreback that this issue is reasonable, but also too profound.
First of all, if I missed a point, please feel free to comment.
Using arithmetic operations on pd.DataFrames is sometimes a mouthful. Take the following example, where columns a and b should be multiplied by the column c:
Apparently this doesn't work as expected. Instead one has to use either pd.DataFrame.mul(), which hurts legibility, or pd.DataFrame.values, which yields long lines and therefore also results in poor legibility:
Surely, the last call in this example returns a numpy array, but in my case that's the only thing I'm interested in, since I'm rewrapping my data at a later stage.
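A minimal sketch of the `.values` workaround mentioned above, assuming a hypothetical frame with columns a, b and c (invented data). Selecting c with a list keeps it two-dimensional, so NumPy broadcasts the (3, 1) column across the (3, 2) block; the result is a plain array that has to be rewrapped by hand:

```python
import numpy as np
import pandas as pd

# Hypothetical example data, not taken from the original report.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, 5.0, 6.0],
    "c": [10.0, 20.0, 30.0],
})

# df[["c"]] (list selection) stays 2-D with shape (3, 1), which broadcasts
# against the (3, 2) array; df["c"].values has shape (3,) and would not.
values = df[["a", "b"]].values * df[["c"]].values

# The index and column labels are lost, so rewrap them explicitly.
rewrapped = pd.DataFrame(values, index=df.index, columns=["a", "b"])
print(rewrapped)
```

Bypassing alignment this way is only safe when both operands are known to share the same row order.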
I'm proposing a new short indexer for operating on values, something like:
Or even more sophisticated:
Btw the same goes for all other arithmetic operators.