[v3] First step to generalizes ndarray and bytes #1826
This PR is ready for a new round of reviews. As @normanrz suggested, I have removed the inheritance between the two buffer classes.
src/zarr/buffer.py (outdated)

        """
        return cls.from_ndarray_like(np.frombuffer(bytes_like, dtype="b"))

    def as_ndarray_like(self) -> NDArrayLike:
Would it be possible to have an `as_memoryview` method that could be used for passing to Blosc, Gzip etc.? Would that also work without copying?
An `as_memoryview` would work like `as_numpy_array`. It might involve copying data, depending on which underlying ndarray-like object the buffer represents. E.g., if `self._data` is a cupy array, we have to copy from device to host memory.
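A minimal sketch of the trade-off described in this comment, assuming a hypothetical `HostBuffer` class (not the class from this PR) that wraps a 1-D NumPy byte array in host memory. In that case `memoryview` can expose the same bytes without copying. A device-backed array such as cupy would first need a device-to-host copy, which is not shown here.

```python
import numpy as np


class HostBuffer:
    """Hypothetical stand-in for the PR's Buffer; for illustration only."""

    def __init__(self, data: np.ndarray) -> None:
        # Assumption: the data lives in host memory. Reinterpret it as a
        # flat array of signed bytes; no copy is made if it is already
        # C-contiguous.
        self._data = np.ascontiguousarray(data).reshape(-1).view("b")

    @classmethod
    def from_bytes(cls, bytes_like) -> "HostBuffer":
        # np.frombuffer shares memory with the bytes-like object.
        return cls(np.frombuffer(bytes_like, dtype="b"))

    def as_numpy_array(self) -> np.ndarray:
        # For a NumPy-backed buffer this is a no-op.
        return np.asanyarray(self._data)

    def as_memoryview(self) -> memoryview:
        # NumPy arrays implement the buffer protocol, so this is zero-copy.
        return memoryview(self.as_numpy_array())


raw = bytearray(b"abcd")
buf = HostBuffer.from_bytes(raw)
mv = buf.as_memoryview()
```

Here `mv` can be handed to any codec that accepts objects supporting the buffer protocol, and writes through `mv` are visible in `raw` because no copy was made along the way.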
I just found the method name `as_ndarray_like` confusing because it actually returns a wrapped 1d byte array.
Good point, `Buffer` is now backed by `ArrayLike`: 3854bec
For now, both `ArrayLike` and `NDArrayLike` are aliases of `np.ndarray`.
Sweet!
Co-authored-by: Norman Rzepka <code@normanrz.com>
This PR introduces two new classes, `NDBuffer` and `Buffer`, to represent the data passed between components.

Currently, we use `numpy.ndarray` and `bytes` to pass data around between components. As discussed in #1751, it would be good to generalize the data containers to enable other memory types.

As a first step, `NDBuffer` and `Buffer` are backed by numpy arrays. In follow-up PRs, we can discuss/implement alternative backends such as cupy arrays or pytorch tensors. The important point here is to agree on an abstraction that can facilitate a broad range of array/data types.

cc. @akshaysubr, @jakirkham
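To make the proposed split concrete, here is a rough sketch of two independent containers backed by numpy arrays: one for flat bytes and one for n-dimensional data. The class names `BufferSketch` and `NDBufferSketch`, and all of their methods, are assumptions made for illustration. They are not the API introduced by this PR.

```python
import numpy as np


class BufferSketch:
    """Hypothetical flat byte container (stands in for `Buffer`)."""

    def __init__(self, data: np.ndarray) -> None:
        if data.ndim != 1 or data.dtype.itemsize != 1:
            raise ValueError("expected a 1-D array of single-byte items")
        self._data = data

    @classmethod
    def from_bytes(cls, bytes_like) -> "BufferSketch":
        return cls(np.frombuffer(bytes_like, dtype="b"))

    def to_bytes(self) -> bytes:
        # Always copies, because `bytes` objects are immutable.
        return self._data.tobytes()

    def __len__(self) -> int:
        return self._data.size


class NDBufferSketch:
    """Hypothetical n-dimensional container (stands in for `NDBuffer`)."""

    def __init__(self, data: np.ndarray) -> None:
        self._data = data

    @property
    def shape(self) -> tuple:
        return self._data.shape

    @property
    def dtype(self) -> np.dtype:
        return self._data.dtype

    def as_numpy_array(self) -> np.ndarray:
        return self._data

    def to_buffer(self) -> BufferSketch:
        # Reinterpret the (contiguous) array as a flat run of bytes.
        flat = np.ascontiguousarray(self._data).reshape(-1).view("b")
        return BufferSketch(flat)


chunk = NDBufferSketch(np.arange(6, dtype="uint16").reshape(2, 3))
raw = chunk.to_buffer()
```

The two classes deliberately share no base class, mirroring the removal of inheritance mentioned earlier in the conversation. Swapping numpy for another backend would, under this sketch, only change what `_data` holds and how the conversions are carried out.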