-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: fix raise of TypeError when subtracting timedelta array #22054
Conversation
pandas/_libs/tslibs/timedeltas.pyx
Outdated
# raise rathering than letting numpy return wrong answer | ||
return NotImplemented | ||
return op(self.to_timedelta64(), other) | ||
try: | ||
converted_other = other.astype('datetime64[ns]') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converting is wrong here, since it could be mixed, eg [Timestamp, Timedelta].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mixed test I added passes but I'm guessing this case is why I'm getting problems in other tests? Is there another approach I should take?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah there was a bug in my test. Still looking around but definitely curious on another approach. I was thinking I could iterate over the entire array and do piecemeal conversions but that also seems wrong.
pandas/_libs/tslibs/timedeltas.pyx
Outdated
@@ -929,7 +933,7 @@ cdef class _Timedelta(timedelta): | |||
def nanoseconds(self): | |||
""" | |||
Return the number of nanoseconds (n), where 0 <= n < 1 microsecond. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Want to add a line to lint.sh that checks for trailing white space in cython files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
Actually I'm a little confused now @jbrockmendel, checking on one of the failing tests that I have it is the one for |
The thing is that the line |
Yea adding in the test for |
I'm not sure what lambda you're referring to, but I imagine this will end up looking something like:
|
Oh dang nice! I was just working through something similar in a REPL. I'll push up a revision in the next few minutes. Thanks for the help! EDIT: The lambda I was referring to was op function that gets passed in. |
Ok so I'm not sure why my build wouldn't output anything for the tests that did fail on TravisCI - @jbrockmendel is there a way I can retrigger travis? Is that known to be flakey? Or should I look into something else? I appreciate the help. |
Travis error looks unrelated. When this happens I usually find some typo somewhere to fix and make a dummy commit to force it to re-run. |
ci/lint.sh
Outdated
@@ -49,6 +49,11 @@ if [ "$LINT" ]; then | |||
if [ $? -ne "0" ]; then | |||
RET=1 | |||
fi | |||
|
|||
if [[ -n $(find **/*.pyx -type f -exec egrep -l " +$" {} \;) ]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this print the list of offending files? Can you also include .pxd, .pxi, and .pxi.in? Thanks for stepping up on this, linting these files is really tough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea no worries! It won't display them as output I think just from this test but I can include the same find / grep dance under it or save it as a variable and echo it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I can include the same find / grep dance under it or save it as a variable and echo it.
Great. If something breaks, we definitely want to know where to look for it.
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -441,7 +441,7 @@ Datetimelike | |||
Timedelta | |||
^^^^^^^^^ | |||
|
|||
- | |||
- Fixed bug where array of timestamp and deltas raised a TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'Timedelta' (:issue:`21980`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take a look at how quotation marks and backticks are used elsewhere In this file (iOS keyboard doesn’t make it easy to give an example directly)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take a look!
@@ -1012,3 +1012,22 @@ def test_get_level_values_box(self): | |||
index = MultiIndex(levels=levels, labels=labels) | |||
|
|||
assert isinstance(index.get_level_values(0)[0], Timestamp) | |||
|
|||
def test_diff_sub_timedelta(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests go in tests/scalar/timedelta/test_arithmetic.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will move!
res = arr - pd.Timedelta('1D') | ||
tm.assert_numpy_array_equal(res, exp) | ||
|
||
def test_diff_sub_timedelta_mixed(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR probably also fixes addition right? If so, pls include a test. Possibly also for reversed ops?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should and I was thinking about reversed this morning actually so I'll include both today.
@jbrockmendel hmm upon further investigation this AM it seems that when doing the reverse subtract ie arr = np.array([Timestamp('20130101 9:01'),
Timestamp('20121230 9:02')])
exp = np.array([Timestamp('20121231 9:01'),
Timestamp('20121229 9:02')])
res = pd.Timedelta('1D') - arr
tm.assert_numpy_array_equal(res, exp) I get |
The operation being run is |
Hmm actually, it appears that rsub doesn't work originally with
|
Ok so in that case I'll just check that these raise appropriately then. Thanks for all the help! |
Hello @illegalnumbers! Thanks for updating the PR.
Comment last updated on September 06, 2018 at 04:04 Hours UTC |
Codecov Report
@@ Coverage Diff @@
## master #22054 +/- ##
==========================================
+ Coverage 92.04% 92.05% +<.01%
==========================================
Files 169 169
Lines 50782 50782
==========================================
+ Hits 46744 46745 +1
+ Misses 4038 4037 -1
Continue to review full report at Codecov.
|
0208713
to
bbf1a06
Compare
@jbrockmendel lemme know if there's anything else I need to do to get this guy merged! :) |
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -441,7 +441,7 @@ Datetimelike | |||
Timedelta | |||
^^^^^^^^^ | |||
|
|||
- | |||
- Fixed bug where array of timestamp and deltas raised a TypeError: unsupported operand type(s) for -: ``numpy.ndarray`` and ``Timedelta`` (:issue:`21980`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timestamp --> :class:Timestamp
double back ticks around TypeError
don't need the full error message
Can you clarify what "array of timestamp and deltas" means? e.g.
Bug where subtracting :class:`Timedelta` from an object-dtyped array would raise ``TypeError``
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
ci/lint.sh
Outdated
then | ||
RET=1 | ||
echo $trailing_space_pxd | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, thanks. Any other ideas you have for linting cython files will be a big hit with the maintainers (separate PR(s))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds great! I am not sure if I'll have much time in the next little bit after this gets merged but I'll do my best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gonna move this out, see other comments.
@@ -616,3 +690,35 @@ def test_rdivmod_invalid(self): | |||
|
|||
with pytest.raises(TypeError): | |||
divmod(np.array([22, 24]), td) | |||
|
|||
def test_td_div_timedelta_timedeltalike_array(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a valid case here be object-dtyped array containing all Timedelta objects (or some mix of timedelta, np.timedelta64)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a good call to do a mix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. (see below this test)
with pytest.raises(TypeError): | ||
pd.Timedelta('1D') * arr | ||
|
||
def test_td_rmult_timedelta_mixed_timedeltalike_array(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of these you can probably de-duplicate using pytest.mark.parametrize
. Not a deal-breaker.
If there's a graceful way to work "object_dtype" into the test names that'd be ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be a lot more effor to get this de-duped than I initially thought, would it be ok to submit as-is? I added in object_dtype to the method titles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls parameterize in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@illegalnumbers the idea here is to notice that all four of these tests are special cases of a single test function. In particular, you can write arr * pd.Timedelta('1D')
as operator.mul(arr, pd.Timedelta('1D')
and similarly for the others with operator.truediv
, ops.rdiv
, ops.rmul
. Then these can be parametrized as:
@pytest.mark.parametrize('op', [operator.mul, ops.rmul, operator.truediv, ops.rdiv])
def test_td_mul_object_array(self, op):
arr = np.array([pd.Timestamp.now(), pd.Timedelta('1D')])
with pytest.raises(TypeError):
op(arr, pd.Timedelta('1D'))
Gonna see if I can finish this up today! |
9a1bd6f
to
db48fd7
Compare
Took a little while extra but it should be ready for review again. Barring any lint failures of course. |
I'm not sure if these are related to my ps? Kind of hard to read the build output. EDIT: Seems like most are in the excel writer? |
Yes. I think this is fixed now. If you rebase/push it should go through alright. I'll go over this one more time today, after which jreback will be called in for the final OK. |
db48fd7
to
479b064
Compare
Done! Like I said on the parameterize, I did the best I could without getting stuck. There were a few tests that seemed really similar but I kept getting exceptions when I tried to parameterize them so it seemed like more work than it was worth considering my experience. |
479b064
to
20c93d2
Compare
@jbrockmendel I think this should be good again! |
if other.dtype.kind in ['m', 'M']: | ||
return op(self.to_timedelta64(), other) | ||
elif other.dtype.kind == 'O': | ||
return np.array([op(self, x) for x in other]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel I am a bit confused here why simply returning NotImplemented is not sufficient ?
(I tested it, and that doesn't seem to work. Although with a datetime.timedelta
it does, and that one does return NotImplemented ..)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pd.Timedelta
version fails because both arr.__add__(td)
and td.__radd__(arr)
return NotImplemented
. arr.__add__(td.to_pytimedelta())
returns OK, so presumably it is something on the numpy implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was playing a bit with it, and I think Timedelta behaves differently as datetime.timedelta because of the __array_priority__
we add to the _Timedelta
class (by commenting it out, the simple example works), which is needed to get other behaviors working I suppose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorisvandenbossche ok otherwise this PR looks ok, are you suggesting that we remove that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also confused.
The tests could be parametrized a bit further, but at some point that just becomes equivalent to re-writing the method this PR is implementing. @jreback LGTM. |
I'm super happy to have contributed! This was fun. Sorry it took so long. |
20c93d2
to
d125b12
Compare
thanks! |
The old pandas versions available for Py34 cannot subtract timedeltas from ndarrays. Subtracting them individually works and was used in the fix for later pandas versions: pandas-dev/pandas#22054
closes #21980
git diff upstream/master -u -- "*.py" | flake8 --diff