REGR: ufunc with DataFrame input not passing all kwargs #40878

mzeitlin11 · 2021-04-11T13:36:47Z

closes BUG: numpy functions (eg, np.add) on DataFrames with 'out' parameter no longer work properly #40662
tests added / passed
Ensure all linting tests pass, see here for how to run them
whatsnew entry

Credit to @attack68 for the fix; just figured we should try to get this in for 1.2.4 if possible, since we know the fix.
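For context, the regression reported in #40662 can be sketched roughly as below. The frame contents and the name `out_buf` are illustrative and not taken from the issue. On affected versions the buffer would be left untouched because `out` never reached the ufunc; with the fix it receives the result. The type of the returned object has differed between pandas versions, so this sketch only inspects the buffer.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2], [3, 4]])
df = pd.DataFrame(arr)

# Illustrative output buffer with the same shape and dtype as the data.
out_buf = np.zeros_like(arr)

# With the fix, ``out`` is forwarded and the buffer is filled in place.
np.add(df, 1, out=out_buf)
print(out_buf)
```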

phofl

minor comment, otherwise lgtm

pandas/tests/frame/test_ufunc.py

simonjayhawkins

Thanks @mzeitlin11 for the PR. We want to make sure that the return type is unchanged.

simonjayhawkins · 2021-04-11T14:22:15Z

pandas/tests/frame/test_ufunc.py

+    func(df, 1, out=result)
+
+    expected = np.array(expected).reshape(2, 2)
+    tm.assert_numpy_array_equal(result, expected)


Can you also test the return type/values of the operation itself?

simonjayhawkins

Thanks @mzeitlin11 lgtm

simonjayhawkins · 2021-04-11T14:47:27Z

doc/source/whatsnew/v1.2.4.rst

@@ -21,6 +21,7 @@ Fixed regressions
 - Fixed regression in :meth:`DataFrame.where` not returning a copy in the case of an all True condition (:issue:`39595`)
 - Fixed regression in :meth:`DataFrame.replace` raising ``IndexError`` when ``regex`` was a multi-key dictionary (:issue:`39338`)
 - Fixed regression in repr of floats in an ``object`` column not respecting ``float_format`` when printed in the console or outputted through :meth:`DataFrame.to_string`, :meth:`DataFrame.to_html`, and :meth:`DataFrame.to_latex` (:issue:`40024`)
+- Fixed regression in ``numpy`` ufuncs such as ``np.add`` not passing through all arguments for 2-dimensional input (:issue:`40662`)


Suggested change

- Fixed regression in ``numpy`` ufuncs such as ``np.add`` not passing through all arguments for 2-dimensional input (:issue:`40662`)

- Fixed regression in NumPy ufuncs such as ``np.add`` not passing through all arguments for 2-dimensional input (:issue:`40662`)

and maybe replace "2-dimensional input" with :class:`DataFrame`?

Yep will do

Co-authored-by: Simon Hawkins <simonjayhawkins@gmail.com>

phofl · 2021-04-11T17:54:07Z

lgtm. The CI failure is because of a numpy dev issue.

simonjayhawkins · 2021-04-11T18:29:19Z

pandas/core/arraylike.py

@@ -357,7 +357,7 @@ def reconstruct(result):
        # * len(inputs) > 1 is doable when we know that we have
        #   aligned blocks / dtypes.
        inputs = tuple(np.asarray(x) for x in inputs)
-        result = getattr(ufunc, method)(*inputs)
+        result = getattr(ufunc, method)(*inputs, **kwargs)


the else clause is reached for a DataFrame when not (len(inputs) > 1 or ufunc.nout > 1).

The result = mgr.apply(getattr(ufunc, method)) there for __call__ also fails to pass along **kwargs

Is it straightforward to add a case to the parametrised test here to exercise that path?

Yep, I think so...will let you know if I run into any issues
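As a rough illustration of the one-line change in the hunk above, the sketch below contrasts a call that accepts keyword arguments but drops them with one that forwards them. The two helper names are hypothetical and exist only for this comparison; they are not pandas functions.

```python
import numpy as np


def call_dropping_kwargs(ufunc, method, *inputs, **kwargs):
    # Mirrors the pre-fix line: kwargs are accepted but never forwarded.
    inputs = tuple(np.asarray(x) for x in inputs)
    return getattr(ufunc, method)(*inputs)


def call_forwarding_kwargs(ufunc, method, *inputs, **kwargs):
    # Mirrors the post-fix line: every keyword reaches the ufunc.
    inputs = tuple(np.asarray(x) for x in inputs)
    return getattr(ufunc, method)(*inputs, **kwargs)


arr = np.array([[1, 2], [3, 4]])

ignored = np.zeros_like(arr)
call_dropping_kwargs(np.add, "__call__", arr, 1, out=ignored)

filled = np.zeros_like(arr)
call_forwarding_kwargs(np.add, "__call__", arr, 1, out=filled)

print(ignored)  # buffer never written to
print(filled)   # buffer holds arr + 1
```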

Added a test, but had to xfail because out is not written to correctly. The written result becomes transposed with the block-wise apply, I'm guessing because of the following relationship:

arr = np.array([[1, 2], [3, 4]])
df = pd.DataFrame(arr)
print(df.values)
print(df._mgr.blocks[0].values)

gives

[[1 2]
 [3 4]]
[[1 3]
 [2 4]]
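The effect on `out` can be sketched with NumPy alone, under the assumption (suggested by the output above) that a block stores the transpose of `df.values`. Calling the ufunc on that layout while writing into a caller-supplied buffer leaves the buffer transposed relative to what the caller expects.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

# Assumption: the block layout is the transpose of the frame's values.
block_values = arr.T

out = np.zeros_like(arr)
# What a block-wise call that forwards ``out`` would effectively do.
np.add(block_values, 1, out=out)

expected = arr + 1
print(out)       # transposed relative to ``expected``
print(expected)
```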

Yeah, basically when out is specified, I think we should simply not call mgr.apply, see #39275, #39260 for a recent similar case where certain additional keyword arguments cannot be handled on a block-by-block basis.

jreback · 2021-04-12T12:01:52Z

pandas/tests/frame/test_ufunc.py

+        result = func(df, arg, out=result_inplace)
+
+    expected = np.array(expected).reshape(2, 2)
+    tm.assert_numpy_array_equal(result_inplace, expected)


do we have any documentation that out is actually a ndarray? this is a very strange result. At the very least document this, and let's open an issue. This should work with out=DataFrame, or simply raise (preferred)

yep. I thought this was strange. #40662 (comment)

should we defer to 1.2.5 to allow more discussion? or put 1.2.4 release on hold for a day or two?

yeah let's defer this. I don't think this is the correct behavior and we should actually fix it (which may mean that we simply do this for 1.3)

moving to 1.2.5 (if we do it)

This should work with out=DataFrame, or simply raise (preferred)

This does raise when passing a DataFrame to out (as numpy expects an array as out argument), but what is being tested here is the return value of the ufunc in case of passing an array to out. In that case, out is also being returned, and since out is an ndarray, the return value is also an ndarray.
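The NumPy side of this can be checked in isolation. The sketch below uses a plain ndarray as input, so it says nothing about pandas dispatch; it only shows that a ufunc called with `out` hands back the very object that was passed in.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
out = np.zeros_like(arr)

returned = np.add(arr, 1, out=out)

# The return value is the buffer itself, hence an ndarray.
same_object = returned is out
print(same_object, type(returned).__name__)
```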

(to me this seems correct behaviour, and thus I don't think this needs to hold up merging this for the release)

This "terrible" behaviour is long-standing (and IMO correct) behaviour on which users rely.

how is this correct in any way? this was a bug from the original impl.

That was not a bug, that's how numpy functions work: they coerce array-like input to arrays. So whether one of the arguments was a DataFrame vs an ndarray, did not have any effect on allowing an out argument or not
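A minimal sketch of the coercion being described, using a hypothetical `ArrayLike` class as a stand-in for any object NumPy can convert through `__array__`. This is a simplification: a DataFrame also implements `__array_ufunc__`, so its real dispatch path is more involved than what is shown here.

```python
import numpy as np


class ArrayLike:
    """Hypothetical stand-in for an object NumPy can coerce to an array."""

    def __init__(self, data):
        self._data = data

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this hook to obtain an ndarray from the object.
        return np.array(self._data, dtype=dtype)


wrapped = ArrayLike([[1, 2], [3, 4]])
out = np.zeros((2, 2), dtype=np.int64)

# The input is coerced to an ndarray; ``out`` is accepted as usual.
result = np.add(wrapped, 1, out=out)
print(type(result).__name__)
print(out)
```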

ok fine. how about a follow-up to make this really clear (in docs) on master. I would however also deprecate / remove this as it is not intuitive at all (e.g. we do not have an out argument anywhere else)

e.g. we do not have an out argument anywhere else

This keyword is not in a pandas function or method, but in a NumPy function (which has out in many places)

I know, but this is very confusing if someone is doing this (and doesn't realize it).

jorisvandenbossche · 2021-04-12T12:18:00Z

pandas/core/arraylike.py

@@ -367,7 +367,7 @@ def reconstruct(result):
        if method == "__call__":


Suggested change

if method == "__call__":

if method == "__call__" and kwargs.get("out", None) is None:

(untested, but if out is specified, we should never call mgr.apply)

Or maybe more simply check for any kwargs, so only do this if kwargs is empty (so and not kwargs)
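The guard discussed here can be sketched as follows, using the `not kwargs` variant. `apply_ufunc` is a hypothetical dispatcher written for illustration, not the pandas implementation, and the "block" is modelled as the transpose of the 2D values, which is an assumption carried over from the earlier discussion.

```python
import numpy as np


def apply_ufunc(values, ufunc, method, **kwargs):
    """Hypothetical dispatcher: block-wise only when no kwargs are given."""
    if method == "__call__" and not kwargs:
        # Fast path: operate on the transposed "block", then transpose back.
        return ufunc(values.T).T
    # Fallback: operate on the whole 2D array and forward every keyword.
    return getattr(ufunc, method)(values, **kwargs)


values = np.array([[1.0, 4.0], [9.0, 16.0]])

plain = apply_ufunc(values, np.sqrt, "__call__")

out = np.zeros_like(values)
with_out = apply_ufunc(values, np.sqrt, "__call__", out=out)

print(plain)
print(out)
```

Because the block-wise path is skipped whenever any keyword is present, `out` is written in the caller's orientation rather than the block's.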

I quickly pushed this edit myself, as it fixes the xfail case in the tests you added

Thanks for adding this!

jorisvandenbossche · 2021-04-12T14:27:38Z

Thanks @mzeitlin11 !

simonjayhawkins · 2021-04-12T14:29:28Z

@meeseeksdev backport 1.2.x

…0878) Co-authored-by: Simon Hawkins <simonjayhawkins@gmail.com> Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>

mzeitlin11 added 3 commits April 11, 2021 09:21

REGR: ufunc args not being passed

138d9d2

Add whatsnew

3b492e4

Fix typo

625ce36

phofl reviewed Apr 11, 2021

simonjayhawkins added this to the 1.2.4 milestone Apr 11, 2021

simonjayhawkins added the Compat pandas objects compatability with Numpy or Python functions label Apr 11, 2021

Change shape

6bf0da0

simonjayhawkins mentioned this pull request Apr 11, 2021

DOC: 1.2.4 release date #40880

Merged

mzeitlin11 added 2 commits April 11, 2021 10:18

Parameterize test

7c9a63f

Remove unused

cfb0bcd

simonjayhawkins requested changes Apr 11, 2021

Also test return value

58f7399

simonjayhawkins approved these changes Apr 11, 2021

Update doc/source/whatsnew/v1.2.4.rst

4590926

Co-authored-by: Simon Hawkins <simonjayhawkins@gmail.com>

mzeitlin11 commented Apr 11, 2021

mzeitlin11 and others added 3 commits April 11, 2021 10:59

Update doc/source/whatsnew/v1.2.4.rst

d58677e

Add subtract also

9c0b96c

Merge branch 'regr/ufunc' of github.com:/mzeitlin11/pandas into regr/…

e711789

…ufunc

phofl approved these changes Apr 11, 2021

simonjayhawkins reviewed Apr 11, 2021

Add test for other path

a2da4fc

simonjayhawkins mentioned this pull request Apr 12, 2021

RLS: 1.2.4 #40168

Closed

jreback requested changes Apr 12, 2021

jreback modified the milestones: 1.2.4, 1.2.5 Apr 12, 2021

jorisvandenbossche reviewed Apr 12, 2021

check kwargs before calling mgr.apply

5c57727

jorisvandenbossche approved these changes Apr 12, 2021

jreback approved these changes Apr 12, 2021

jreback modified the milestones: 1.2.5, 1.2.4 Apr 12, 2021

jorisvandenbossche changed the title ~~REGR: ufunc not passing all args~~ REGR: ufunc with DataFrame input not passing all kwargs Apr 12, 2021

jorisvandenbossche merged commit aa225b2 into pandas-dev:master Apr 12, 2021

meeseeksmachine mentioned this pull request Apr 12, 2021

Backport PR #40878 on branch 1.2.x (REGR: ufunc with DataFrame input not passing all kwargs) #40895

Merged

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Apr 12, 2021

Backport PR pandas-dev#40878: REGR: ufunc with DataFrame input not pa…

576f73b

…ssing all kwargs

mzeitlin11 deleted the regr/ufunc branch April 12, 2021 15:07

simonjayhawkins pushed a commit that referenced this pull request Apr 12, 2021

Backport PR #40878: REGR: ufunc with DataFrame input not passing all …

0428542

…kwargs (#40895) Co-authored-by: Matthew Zeitlin <37011898+mzeitlin11@users.noreply.github.com>
