New behaviour regarding inplace values setting with iloc #47381
Comments
Hi @glemaitre, thanks for the report.
This should show the warning, correct? I can reproduce it neither on main nor with our nightly build. Could you share a link to one of your builds?
Yes, in pandas 1.5.0.dev0 it raises the warning, as in this build (you can check the last warning in the stack). What I don't get from the example above is that the behaviour does not seem to change between 1.4.2 and 1.5.0.dev0, apart from the additional warning.
I think this warning is shown because of the dtype mismatch in your test (e.g. float16 and float32). But I will have to look more closely later.
OK, I will check the other occurrence of the warning as well. So the documentation example should trigger a change of data type to illustrate the change of behaviour, then? I mean, it should look like:
values = np.arange(4).reshape(2, 2)
df = pd.DataFrame(values)
ser = df[0]
df.iloc[:, 0] = np.array([10, 11]).astype(np.int16)
then the output of
The warning is about a change in the future. We want to operate inplace there, hence we would update ser too. If you want to avoid that, you should use the regular setitem. Assignments with matching dtypes already operate inplace, so you do not get the warning for them. The deprecation was added in #45333.
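To make the difference concrete, here is a minimal sketch that reuses the int16 example from this thread. The helper name make_frame and the variable names are made up for illustration. Whether the Series taken beforehand sees the new values after the .iloc assignment depends on the installed pandas version, so the script prints that result instead of assuming it.

```python
import warnings

import numpy as np
import pandas as pd


def make_frame():
    # Hypothetical helper: build the frame and grab column 0 *before* assigning.
    df = pd.DataFrame(np.arange(4).reshape(2, 2))
    return df, df[0]


new_vals = np.array([10, 11], dtype=np.int16)

# Positional assignment. Whether ser_iloc reflects the new values (and which
# dtype the column ends up with) is version dependent, so it is only printed.
df_iloc, ser_iloc = make_frame()
with warnings.catch_warnings():
    # Silence the FutureWarning discussed in this issue, if it is emitted.
    warnings.simplefilter("ignore", FutureWarning)
    df_iloc.iloc[:, 0] = new_vals
print("iloc:    ser =", ser_iloc.tolist(), "column dtype =", df_iloc[0].dtype)

# Regular setitem. The column is swapped for a new one, so the Series taken
# beforehand keeps its old values and the column takes the dtype of new_vals.
df_item, ser_item = make_frame()
df_item[df_item.columns[0]] = new_vals
print("setitem: ser =", ser_item.tolist(), "column dtype =", df_item[0].dtype)
```

Running this under both 1.4.2 and a nightly build should show directly whether anything other than the warning differs for the .iloc case.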
It seems like when using .iloc on an empty dataframe, you get a similar warning, which could maybe be avoided?
import numpy as np
import pandas as pd
arr = np.arange(6).reshape(3, 2).astype(np.float64)
df_orig = pd.DataFrame(arr, columns=['a', 'b'])
df_new = df_orig.iloc[[], :].copy()
df_new.iloc[:, 0] = np.array([1, 2, 4], dtype=np.float64)
I created #47433 to improve the whats_new entry.
Indeed, in this case there is an "enlargement" of the dataframe when setting the values. That can never be done inplace, so we should avoid the warning in that case.
Sounds great!
Indeed.
In scikit-learn, when testing the pandas nightly build, we got a FutureWarning related to the following deprecation: https://pandas.pydata.org/docs/dev/whatsnew/v1.5.0.html#try-operating-inplace-when-setting-values-with-loc-and-iloc
We have 2 related questions regarding this deprecation. First, it seems that we cannot reproduce the "Old Behaviour" with the latest available release:
Is there a reason for not spotting the behaviour shown in the documentation?
The second question (actually it is more a comment to open a discussion) concerns the proposed fix.
It is proposed to use df[df.columns[i]] = newvals instead of df.iloc[:, i] = newvals. I personally find this a bit counterintuitive, since the SettingWithCopyWarning proposes to change to df.loc[rows, cols] instead of df[cols][rows] to get the inplace behaviour.
If we consider that both approaches intend an inplace change, the patterns used for "by position" (i.e. .iloc) and "by label" (i.e. .loc) are really different.
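For reference, a minimal sketch of what the proposed df[df.columns[i]] = newvals spelling does when wrapped so that it still reads as "by position". The helper name replace_column_by_position is hypothetical and not a pandas API. Because df.columns[i] turns the position into a label, the sketch assumes the column labels are unique.

```python
import numpy as np
import pandas as pd


def replace_column_by_position(df, i, newvals):
    # Hypothetical helper, not part of pandas: translate position i into its
    # label, then use plain setitem, which swaps in a new column.
    # Assumes the column labels are unique.
    df[df.columns[i]] = newvals


df = pd.DataFrame({"a": [0.0, 1.0], "b": [2.0, 3.0]})
replace_column_by_position(df, 1, np.array([10, 11], dtype=np.int16))

# The replaced column takes the dtype of newvals; the other column is untouched.
print(df.dtypes)
```

The position-to-label round trip is the asymmetry described above: the "by label" advice stays within .loc, while the "by position" advice leaves .iloc entirely.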