-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Linear interp with NaNs in nd indexer #4233
Conversation
When interpolating with an nd indexer that contains NaN's, the code previously threw a KeyError from the missing._localize function. This commit fixes this by swapping `np.min` and `np.max` with `np.nanmin` and `np.nanmax`, ignoring any NaN values.
Thanks, @jenssss for sending a PR. |
xarray/core/missing.py
Outdated
imin = index.get_loc(np.min(new_x.values), method="nearest") | ||
imax = index.get_loc(np.max(new_x.values), method="nearest") | ||
imin = index.get_loc(np.nanmin(new_x.values), method="nearest") | ||
imax = index.get_loc(np.nanmax(new_x.values), method="nearest") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks that np.nanmin (and nanmax) supports np.datetime-dtype only with numpy>=1.18.
We can copy np.nanmin to our core/npcompat.py and call this function here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they added some low level loops to fix this (numpy/numpy#14841) so guess we cannot copy nanmin over and have to use if LooseVersion(np. __version__) < LooseVersion("1.18")
(or similar).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some LooseVersion
checks for numpy>=1.18 in the d044142 commit to the missing._localize
function and to the test. Would this do?
@jenssss, there's something wrong with the commit history (which makes reviewing pretty hard). Did you by change do |
@keewis, sorry I'm still kinda new to Github. |
not your fault, the contributing guide did tell you to do that, and I only updated it a few days ago (not in |
I want to try to fix it. |
…terp This merges an updated upstream into this topic branch
d044142
to
2152fb8
Compare
Okay, I think that should do it. How does it look now? |
perfect, thanks for the fix. |
Looks good - if you want to go a step further you can probably do something along the lines of (untested): if new_x.dtype in "mM" and LooseVersion(np.__version__) < LooseVersion("1.18") and new_x.isnull().any():
raise ValueError("numpy 1.18 or newer required to use interp with datetime/ timedelta array containing missing values")
else:
imin = ... np.nanmin(...)
imax = ... |
This means the PR now also works for numpy < 1.18, as long as index is not with datetime
I misremembered the behavior of the old Sorry for sending you down the wrong path. I can have another look tomorrow. |
It seems that np.min/max works in place of nanmin/nanmax for datetime types for numpy < 1.18, see https://github.com/pydata/xarray/pull/3924/files
896f699
to
8313c3e
Compare
Okay, I removed the I thought it better to use |
Thanks! I have two more suggestions: Can you add an additional line with (["2000-01-01T12:00", "2000-01-02T12:00", "NaT"], [0.5, 1.5]), here: xarray/xarray/tests/test_interp.py Line 575 in 8313c3e
to have a test with a missing datetime. Do you also want to add a note to the docstring, e.g. adding xarray/xarray/core/dataarray.py Line 1425 in a36d0a1
here: xarray/xarray/core/dataarray.py Line 1495 in a36d0a1
Does it also work for DataSets? Then maybe add to the docstring here: Line 2603 in a36d0a1
and here: Line 2731 in a36d0a1
|
Also added a test for `Dataset` to `test_interpolate_nd_with_nan`, and "Missing values are skipped." to the dosctring of `interp` and `interp_like` methods of `DataArray` and `Dataset`.
LGTM. I'll merge in a day or two unless someone else has a comment. (#3924) |
thanks @jenssss I see this is your first PR - welcome to xarray! |
When doing linear interpolatiion with an nd indexer that contains NaN's,
xarray
previously threw aKeyError
from themissing._localize
function. This PR fixes this by swappingnp.min
andnp.max
withnp.nanmin
andnp.nanmax
in that function, ignoring any NaN values.