Slicing with lists in multiple axes #433
Comments
Am I correct in assuming that in this case you'd like the shape of the output to be |
And if so, is it important to you that this be the default array slicing syntax, or are you OK with some other custom method?
Correct. I'm looking for the default numpy slicing behavior, how we get there is up for discussion. My preference would be to use the numpy slicing syntax, but could use a |
They're both the same level of difficulty to accomplish. I mostly want to avoid locking dask into one behavior or the other for as long as possible.
NumPy has been discussing adding
I suggest that dask should follow NumPy's lead here, and consider implementing both attributes -- even if it will only support a limited subset of vectorized indexing. Outer indexing is easier to reason about and optimize (dask already has all the necessary functionality), and it will be useful to have explicit syntax both to support better fusing of getitem calls and to make things simpler for downstream libraries like xray (which uses outer indexing in
Dask certainly should not try to exactly replicate NumPy's current indexing behavior, which sometimes but not always reorders axes with array indices, e.g.,
This is equivalent to numpy slicing with multiple input lists. We could use a better name.

cc @shoyer @jhamman

Example
-------
>>> x = np.arange(56).reshape((7, 8))
>>> x
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29, 30, 31],
       [32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47],
       [48, 49, 50, 51, 52, 53, 54, 55]])
>>> d = from_array(x, chunks=(3, 4))
>>> result = isel(d, [0, 1, 6, 0], [0, 1, 0, 7])
>>> result.compute()
array([ 0,  9, 48,  7])

Fixes dask#433
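For readers following along, the difference between the point-wise behavior requested here and outer indexing can be sketched in plain NumPy. This is only an illustration on an in-memory array, not dask code; the variable names `rows`, `cols`, `pointwise`, and `outer` are made up for the example.

```python
import numpy as np

x = np.arange(56).reshape((7, 8))
rows = [0, 1, 6, 0]
cols = [0, 1, 0, 7]

# Point-wise (vectorized) indexing: rows[i] is paired with cols[i],
# so the result has one element per pair.
pointwise = x[rows, cols]

# Outer (orthogonal) indexing: every entry of `rows` is crossed with
# every entry of `cols`, so the result is a 2-D block.
outer = x[np.ix_(rows, cols)]

print(pointwise.shape)  # (4,)
print(outer.shape)      # (4, 4)
```

The point-wise result is the diagonal of the outer result, which is why the two are easy to confuse when the index lists have equal length.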
OK, I've implemented this (I think) in #439. It could use a better name. Happy to use
Co-authored-by: crusaderky <crusaderky@gmail.com>
From the dask docs:
Here's that issue.
My use case is for point-wise indexing in xray: pydata/xarray#475
A simple use case using dask arrays:
currently raises this error:
cc @shoyer