
MAINT: normalize NDArray to tensors, add a special-case for out= NDArrays #108

Merged: ev-br merged 1 commit into main from ndarray_annot on Apr 6, 2023

Conversation

@ev-br (Collaborator) commented Apr 6, 2023

Arguments annotated as NDArray get normalized to their Tensors for further processing by implementer functions.
out= arguments, however, are special (the original array needs to be preserved) and are never seen by implementers, so they stay ndarrays.

So the annotations and their corresponding normalizations are (see the sketch after this list):

  • ArrayLike : a generic numpy array_like --- convert to Tensor, pass the Tensor to implementers
  • NDArray : ndarray exactly --- check that the type is ndarray, extract its Tensor, pass the Tensor to implementers
  • OutArray : ndarray exactly --- check that the type is ndarray, return the ndarray, handle it at the normalizer level
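
A minimal sketch of what the three normalizers do (hypothetical helper names and a stand-in ndarray wrapper; the real machinery lives in torch_np/_normalizations.py):

import torch

class ndarray:
    # stand-in for torch_np's wrapper class; the real one holds a Tensor
    def __init__(self, tensor):
        self.tensor = tensor

def normalize_array_like(arg):
    # ArrayLike: accept anything array_like, hand implementers a Tensor
    return torch.as_tensor(arg)

def normalize_ndarray(arg):
    # NDArray: must already be an ndarray; implementers still only see its Tensor
    if not isinstance(arg, ndarray):
        raise TypeError("expected an ndarray")
    return arg.tensor

def normalize_outarray(arg):
    # OutArray: must be an ndarray and stays one; the normalizer layer
    # copies the result into arg.tensor and returns the original object
    if not isinstance(arg, ndarray):
        raise TypeError("'out' must be an array")
    return arg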

@ev-br ev-br requested a review from lezcano April 6, 2023 08:27
@lezcano (Collaborator) left a comment

Why did you change NDArray to OutArray?

Note that the only use of out is through out.tensor:

out.tensor.copy_(result)
return out

My best guess is that the Optional[NDArray] annotation we used before was not correct when we had tuples or lists as the return.

For reasons like this, I proposed to move the out= implementation out of the function into the decorator itself, as this input is different from all the others. Now, to avoid the extra packing-unpacking code, we need to make an exception and leak ndarrays into _funcs_impl.py, but well...

Resolved review thread on torch_np/_normalizations.py (shown below).
@ev-br (Collaborator, Author) commented Apr 6, 2023

Before this PR:

  • The out: Optional[NDArray] annotation was special. The NDArray annotation only checks that the argument is an ndarray, isinstance(arg, ndarray), and returns the array. This worked just fine because out arrays never made it to implementers, and maybe_copy_out worked at the normalizer level, exactly via out.tensor.copy_. All was well.

  • A single exception, copyto(dst: NDArray, src: ArrayLike), leaked an ndarray to implementers.

Now, if we make the NDArray annotation return a tensor, we clean up this single exception, but get a serious problem across the board with the out= semantics. Once we normalize the out array into a Tensor, we lose the original array object. Then maybe_copy_out copies data into a tensor --- and wraps this tensor into a new array. So assert result is out is broken across the board.
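
A minimal sketch of the failure mode (hypothetical names; ndarray is a stand-in for the torch_np wrapper):

import torch

class ndarray:
    def __init__(self, tensor):
        self.tensor = tensor

def maybe_copy_out_tensor_only(out_tensor, result):
    # if out were normalized to its Tensor, we could only re-wrap it afterwards
    out_tensor.copy_(result)
    return ndarray(out_tensor)   # a brand-new wrapper object

def maybe_copy_out(out, result):
    # with the original ndarray preserved, object identity survives
    out.tensor.copy_(result)
    return out

out = ndarray(torch.zeros(3))
print(maybe_copy_out(out, torch.ones(3)) is out)                     # True
print(maybe_copy_out_tensor_only(out.tensor, torch.ones(3)) is out)  # False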

The OutArray annotation fixes this. At the end of the day, an out array is different from a generic ndarray argument. NumPy almost never requires strict ndarrays --- this is literally the only exception.

For reasons like this, I proposed to move the out= implementation out of the function into the decorator itself, as this input is different from all the others.

We discussed this a couple of times, actually. This breaks using out as a positional argument, which numpy allows.

So my inclination would be to just accept copyto being special, add a relevant comment to its source and be done with it. But I see that my practicality/purity meter is consistently off.

@ev-br (Collaborator, Author) commented Apr 6, 2023

My best guess is that the Optional[NDArray] annotation we used before was not correct when we had tuples or lists as the return

Yes. This is separate, though. I'm fixing this using divmod as a guinea pig now. The current stumbling block is how to check that tuple[Optional[NDArray], Optional[NDArray]] is a typing.Sequence. Googling around suggests checking the argument itself, not the annotation, but we cannot do that (e.g. a list is an ArrayLike but also a Sequence).

@lezcano (Collaborator) commented Apr 6, 2023

We discussed this a couple of times, actually. This breaks using out as a positional argument, which numpy allows.

As mentioned in those discussions, it can be done. Here's a simplified example:

import inspect

def add_out(f):
    def wrapper(*args, **kwargs):
        out = None  # so the print below cannot hit an unbound name when no out is given
        if "out" in kwargs:
            # out passed as a keyword argument
            out = kwargs.pop("out")
        elif len(args) > len(inspect.getfullargspec(f).args):
            # more positionals than f declares: treat the trailing one as out
            args, out = args[:-1], args[-1]
        print(out)
        result = f(*args, **kwargs)
        return result
    return wrapper


@add_out
def f(x, *, y=3):
    pass

f(2, 7)      # out picked up positionally
f(2, out=8)  # out picked up as a keyword

This could go under a @normalize(has_out=True) flag.

@ev-br (Collaborator, Author) commented Apr 6, 2023

This is still brittle because there is no guarantee that out is the last positional arg. It really can be anywhere in the argument list.

@lezcano (Collaborator) commented Apr 6, 2023

Fine. Let's just have it like this then.

As a separate point, the recursive step in maybe_copy_to calls a non-existent function called copy_to.

As for the typing issue, why don't you check that it's exactly a tuple, rather than trying to generalize to a Sequence?

@lezcano (Collaborator) commented Apr 6, 2023

As for how to do that last bit, there's no nice way of doing it in the typing module for all I know.
This is how I did it in PrimTorch:
https://github.com/pytorch/pytorch/blob/b8cf01013975dfabe40966b8dc4555a2e069336c/torch/_decomp/__init__.py#L81-L86

@ev-br (Collaborator, Author) commented Apr 6, 2023

Fine. Let's just have it like this then.

This PR, with a special OutArray or an exception in copyto?

@lezcano (Collaborator) commented Apr 6, 2023

This PR as is looks alright. Let's add a comment explaining the rationale behind this, though (out not being kwarg-only and being allowed at arbitrary positions).

@ev-br (Collaborator, Author) commented Apr 6, 2023

Re copy_to and sequences --- yes, I've noticed and am fixing it. As a side note, we need to set up coverage to smoke out these unusual code paths.

PrimTorch's way is slick indeed. I was hoping there was a way to avoid looking up __origin__, but hey. My current issue is to cook up an incantation that covers both Sequence[ArrayLike] and tuple[Optional[NDArray], Optional[NDArray]] (the latter being the out= annotation for nout=2 ufuncs like divmod).

@lezcano (Collaborator) commented Apr 6, 2023

Do the primtorch trick plus:

>>> import typing
>>> issubclass(tuple, typing.Sequence)
True
>>> issubclass(list, typing.Sequence)
True
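
Putting the two together, one possible shape of the check (a sketch with illustrative names, not the actual torch_np code):

import typing
from collections.abc import Sequence

def annotation_is_sequence(annotation):
    # typing.get_origin maps a parameterized generic to its runtime class:
    # tuple[X, Y] -> tuple, typing.Sequence[X] -> collections.abc.Sequence,
    # typing.Optional[X] -> typing.Union (not a class, hence the guard).
    # Note str/bytes would also pass this check if they ever showed up here.
    origin = typing.get_origin(annotation)
    return isinstance(origin, type) and issubclass(origin, Sequence)

print(annotation_is_sequence(typing.Sequence[int]))              # True
print(annotation_is_sequence(tuple[typing.Optional[int], ...]))  # True
print(annotation_is_sequence(typing.Optional[int]))              # False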

Base automatically changed from dtypes_in_ufuncs to main April 6, 2023 10:54
@ev-br ev-br merged commit 3887e26 into main Apr 6, 2023
@ev-br ev-br deleted the ndarray_annot branch April 6, 2023 11:08
The review thread is on this snippet in torch_np/_normalizations.py:

from ._ndarray import ndarray

if not isinstance(arg, ndarray):
    raise TypeError("'out' must be an array")
A Collaborator left a review comment:

We should improve the error message using the name. Just patch this into the next PR.
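
For instance, something along these lines (a sketch; it assumes the normalizer knows the parameter name, here called name):

if not isinstance(arg, ndarray):
    raise TypeError(f"'{name}' must be an ndarray, got {type(arg).__name__}")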


@ev-br (Collaborator, Author) commented Apr 6, 2023

OK, gh-109 does a minimal fix for out= being tuple-only, and avoids doing one more refactor. Let's keep the primtorch trick for if/when we hit the limitations of the current infra.
