New behaviour regarding inplace values setting with iloc #47381

glemaitre · 2022-06-16T07:57:28Z

In scikit-learn, when testing the pandas nightly build, we got a FutureWarning related to the following deprecation:

https://pandas.pydata.org/docs/dev/whatsnew/v1.5.0.html#try-operating-inplace-when-setting-values-with-loc-and-iloc

We have 2 related questions regarding this deprecation. First, it seems that we cannot reproduce the "Old Behaviour" with the latest available release:

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: pd.__version__
Out[3]: '1.4.2'

In [4]: values = np.arange(4).reshape(2, 2)
   ...: 
   ...: df = pd.DataFrame(values)
   ...: 
   ...: ser = df[0]

In [5]: df.iloc[:, 0] = np.array([10, 11])

In [6]: ser
Out[6]: 
0    10
1    11
Name: 0, dtype: int64

Is there a reason why we do not see the "Old Behaviour" shown in the documentation?

The second question (actually it is more a comment to open a discussion) concerns the proposed fix.

It is proposed to use df[df.columns[i]] = newvals instead of df.iloc[:, i] = newvals.
I personally find this way a bit counterintuitive since the SettingWithCopyWarning proposes to change to df.loc[rows, cols] instead of df[cols][rows] to get the inplace behaviour.

If both approaches are meant to change values inplace, the patterns recommended for "by position" (i.e. .iloc) and "by label" (i.e. .loc) access end up being really different.
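For reference, the two spellings discussed above can be put side by side. This is only a sketch: the frame, the position `i`, and `newvals` are illustrative, and it compares the resulting frame contents only. Whether a previously extracted Series is updated depends on the pandas version and is not checked here.

```python
import numpy as np
import pandas as pd

i = 0  # illustrative column position
newvals = np.array([10, 11])

# Positional spelling: may write into the existing column in place.
df_iloc = pd.DataFrame(np.arange(4).reshape(2, 2))
df_iloc.iloc[:, i] = newvals

# Spelling proposed in the whatsnew entry: assigns the column via setitem.
df_setitem = pd.DataFrame(np.arange(4).reshape(2, 2))
df_setitem[df_setitem.columns[i]] = newvals

# Both frames should end up holding the same values.
same_values = bool((df_iloc.to_numpy() == df_setitem.to_numpy()).all())
print(same_values)
```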


phofl · 2022-06-16T11:53:13Z

Hi @glemaitre,

thanks for the report.

values = np.arange(4).reshape(2, 2)
df = pd.DataFrame(values)
ser = df[0]
df.iloc[:, 0] = np.array([10, 11])

This should show the warning, correct? I cannot reproduce it on main or with our nightly build. Could you share a link to one of your builds?
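One way to check whether the snippet above emits a FutureWarning on a given installation is to record warnings around the assignment. This is a sketch, not something taken from the pandas test suite; the helper name `future_warnings_from_iloc_set` is made up for illustration, and the result depends on the installed pandas version.

```python
import warnings

import numpy as np
import pandas as pd


def future_warnings_from_iloc_set(newvals):
    """Return the messages of any FutureWarning raised by the iloc assignment."""
    values = np.arange(4).reshape(2, 2)
    df = pd.DataFrame(values)
    ser = df[0]  # kept alive to mirror the snippet above
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let filters hide the warning
        df.iloc[:, 0] = newvals
    return [str(w.message) for w in caught if issubclass(w.category, FutureWarning)]


messages = future_warnings_from_iloc_set(np.array([10, 11]))
print(pd.__version__, messages)
```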

glemaitre · 2022-06-16T12:20:50Z

Yes, with pandas 1.5.0.dev0 the warning is raised, as in this build (it is the last warning in the stack).

https://dev.azure.com/scikit-learn/scikit-learn/_build/results?buildId=43315&view=logs&j=dfe99b15-50db-5d7b-b1e9-4105c42527cf&t=ef785ae2-496b-5b02-9f0e-07a6c3ab3081&l=309511

What I don't get from the example above is that the behaviour does not seem to change between 1.4.2 and 1.5.0.dev0 apart from the additional warning.

phofl · 2022-06-16T12:30:37Z

I think this warning is shown because of the dtype mismatch in your test (e.g. float16 and float32). But I will have to look more closely later.

glemaitre · 2022-06-16T12:41:47Z

I think this warning is shown because of the dtype mismatch in your test (e.g. float16 and float32). But I will have to look more closely later.

OK, I will check the other occurrences of the warning as well.

So the documentation example should trigger a change of data type to illustrate the change of behaviour then? I mean, it should look like:

values = np.arange(4).reshape(2, 2)
df = pd.DataFrame(values)
ser = df[0]
df.iloc[:, 0] = np.array([10, 11]).astype(np.int16)

Then the output of ser is indeed the original series [0, 2].
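To compare the matching-dtype and mismatching-dtype cases on a given installation, a small report helper can be used. This is a sketch under assumptions: `iloc_set_report` is a hypothetical name, and the reported column dtype and `ser` values are expected to differ between pandas versions, so nothing here should be read as the definitive behaviour.

```python
import numpy as np
import pandas as pd


def iloc_set_report(new_dtype):
    """Set column 0 through iloc and report the state of df and ser afterwards."""
    values = np.arange(4).reshape(2, 2)
    df = pd.DataFrame(values)
    ser = df[0]  # taken before the assignment
    df.iloc[:, 0] = np.array([10, 11]).astype(new_dtype)
    return {
        "pandas": pd.__version__,
        "column_values": df.iloc[:, 0].tolist(),
        "column_dtype": str(df.dtypes.iloc[0]),
        "ser_values": ser.tolist(),
    }


same_dtype = iloc_set_report(np.arange(1).dtype)  # matches the frame's dtype
other_dtype = iloc_set_report(np.int16)           # differs from the frame's dtype
print(same_dtype)
print(other_dtype)
```

Comparing `ser_values` and `column_dtype` between the two reports shows whether the assignment was applied to the existing data or replaced the column on the installed version.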

phofl · 2022-06-16T18:53:14Z

The warning is about a change in the future. We want to operate inplace there, hence we would update ser too. If you want to avoid that, you should use the regular setitem.

When the dtypes match, the assignment already operates inplace, so you do not get the warning.

The deprecation was added in #45333
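A minimal sketch of the "regular setitem" route mentioned above, assuming a recent pandas release (older releases may behave differently): assigning through `df[col] = ...` installs a new column, so a Series extracted beforehand keeps its old values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(4).reshape(2, 2))
ser = df[0]  # extracted before the assignment

# Regular setitem: the column is replaced rather than written into.
df[0] = np.array([10, 11])

print(df[0].tolist())  # values after the assignment
print(ser.tolist())    # values of the Series taken before the assignment
```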

lesteve · 2022-06-20T14:48:10Z

It seems like when using .iloc on an empty dataframe, you get a similar warning which could maybe be avoided?

import numpy as np
import pandas as pd

arr = np.arange(6).reshape(3, 2).astype(np.float64)
df_orig = pd.DataFrame(arr, columns=['a', 'b'])
df_new = df_orig.iloc[[], :].copy()
df_new.iloc[:, 0] = np.array([1, 2, 4], dtype=np.float64)

I created #47433 to improve the whats_new entry.

jorisvandenbossche · 2022-06-22T12:10:27Z

It seems like when using .iloc on an empty dataframe, you get a similar warning which could maybe be avoided?

Indeed, in this case, there is an "enlargement" of the dataframe when setting the values, and that can never be done inplace, and so we should avoid the warning in that case.

lesteve · 2022-06-22T15:48:57Z

Indeed, in this case, there is an "enlargement" of the dataframe when setting the values, and that can never be done inplace, and so we should avoid the warning in that case.

Sounds great!

lesteve · 2022-07-18T13:49:14Z

I think this one can be closed since #47433 has made the what's new entry clearer and #47621 has removed the warnings in the unwanted edge cases.

glemaitre · 2022-07-20T12:02:47Z

Indeed.

phofl added the Copy / view semantics and Indexing labels Jun 16, 2022

lesteve mentioned this issue Jun 20, 2022

DOC clarify inplace operation section in 1.5 whats_new #47433

Merged

simonjayhawkins added this to the 1.5 milestone Jun 25, 2022

phofl mentioned this issue Jul 7, 2022

WARN: Don't show FutureWarning when enlarging df with iloc #47621

Merged


glemaitre closed this as completed Jul 20, 2022

hagenw mentioned this issue Oct 18, 2022

Avoid pandas warning audeering/audformat#319

Closed

glemaitre mentioned this issue Nov 25, 2022

ENH Extend PDP for nominal categorical features scikit-learn/scikit-learn#18298

Merged
