Fix arithmetic errors with timedelta64 dtypes #22390
Codecov Report
@@ Coverage Diff @@
## master #22390 +/- ##
==========================================
- Coverage 92.04% 92.04% -0.01%
==========================================
Files 169 169
Lines 50740 50759 +19
==========================================
+ Hits 46705 46722 +17
- Misses 4035 4037 +2
Continue to review full report at Codecov.
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -576,7 +576,10 @@ Datetimelike
- Bug in :class:`DataFrame` with mixed dtypes including ``datetime64[ns]`` incorrectly raising ``TypeError`` on equality comparisons (:issue:`13128`,:issue:`22163`)
- Bug in :meth:`DataFrame.eq` comparison against ``NaT`` incorrectly returning ``True`` or ``NaN`` (:issue:`15697`,:issue:`22163`)
- Bug in :class:`DataFrame` with ``timedelta64[ns]`` dtype division by ``Timedelta``-like scalar incorrectly returning ``timedelta64[ns]`` dtype instead of ``float64`` dtype (:issue:`20088`,:issue:`22163`)
can you split out to a timedelta bug section (as this is getting kind of long)?
can be in a future PR
Sounds good
I think we now have a separate section?
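The third changelog entry in the hunk above describes division of a ``timedelta64[ns]`` DataFrame by a ``Timedelta``-like scalar. A minimal sketch of the corrected behaviour, assuming a pandas version that includes the fix (the column name `elapsed` is purely illustrative):

```python
import pandas as pd

# A DataFrame whose only column has a timedelta64 dtype.
df = pd.DataFrame({"elapsed": pd.to_timedelta([1, 2, 3], unit="D")})

# Dividing a duration by a duration is a unitless ratio, so the
# result should be float64 rather than timedelta64.
ratio = df / pd.Timedelta(days=1)
print(ratio.dtypes)
```

The result holds plain floats (number of days), which is what the changelog entry says the operation should return.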
@@ -596,6 +597,9 @@ def _evaluate_numeric_binop(self, other):
# GH#19333 is_integer evaluated True on timedelta64,
# so we need to catch these explicitly
return op(self._int64index, other)
elif is_timedelta64_dtype(other):
doesn't the fall thru raise TypeError?
I don’t understand the question
your comment is at odds with the checking i think. pls revise comments and/or checks.
The comment below this line, `# must be an np.ndarray; GH#22390`, along with the check above, `elif is_timedelta64_dtype(other)`, is just saying this is an `ndarray["timedelta64['ns']"]`. I don't think these are at odds.
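For readers following the exchange, here is a small sketch of what the dtype check reports for NumPy arrays. It uses the public `pandas.api.types.is_timedelta64_dtype` helper; the variable names are illustrative and this is not the code from the PR:

```python
import numpy as np
from pandas.api.types import is_timedelta64_dtype

td_array = np.array([1, 2], dtype="timedelta64[ns]")
int_array = np.array([1, 2], dtype="int64")

# The check inspects the dtype, so only the timedelta64 array passes.
td_is_timedelta = is_timedelta64_dtype(td_array)
int_is_timedelta = is_timedelta64_dtype(int_array)

# Combining it with an isinstance check states the "must be an
# np.ndarray" assumption explicitly instead of leaving it to a comment.
is_td_ndarray = isinstance(td_array, np.ndarray) and td_is_timedelta
print(td_is_timedelta, int_is_timedelta, is_td_ndarray)
```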
pandas/core/ops.py
Outdated
# timedelta64 dtypes because numpy casts integer dtypes to
# timedelta64 when operating with timedelta64
if isinstance(right, np.ndarray):
# upcast to TimedeltaIndex before dispatching
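The code comment in this hunk refers to NumPy's casting rule for mixed integer and timedelta64 operands. A minimal sketch of that behaviour in plain NumPy, with illustrative variable names:

```python
import numpy as np

ints = np.array([1, 2, 3], dtype="int64")
tds = np.array([10, 20, 30], dtype="timedelta64[ns]")

# NumPy treats the integers as timedelta64 values in the same unit,
# so the sum silently comes back with a timedelta64 dtype.
result = ints + tds
print(result.dtype)
```

Because NumPy does not raise here, code that wants to reject or specially handle integer-with-timedelta arithmetic has to intercept the operands before handing them to NumPy.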
uhh this seems like a lot of special casing, any way to make this simpler
The only simplification I considered but didn't use was something like:

    def maybe_upcast_for_op(obj):
        if type(obj) is timedelta:
            return pd.Timedelta(obj)
        if isinstance(obj, np.ndarray) and is_timedelta64_dtype(obj):
            return pd.TimedeltaIndex(obj)
        return obj

which would go right after `get_op_result_name`.
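A self-contained version of the helper proposed in this comment, with the imports it needs and a few sample calls. It follows the snippet as written in the thread and may differ from what was eventually merged:

```python
from datetime import timedelta

import numpy as np
import pandas as pd
from pandas.api.types import is_timedelta64_dtype


def maybe_upcast_for_op(obj):
    """Wrap raw timedelta-like operands in their pandas equivalents."""
    # Exact type check: pd.Timedelta subclasses datetime.timedelta and
    # is already in the desired form, so it falls through unchanged.
    if type(obj) is timedelta:
        return pd.Timedelta(obj)
    # A raw timedelta64 ndarray becomes a TimedeltaIndex.
    if isinstance(obj, np.ndarray) and is_timedelta64_dtype(obj):
        return pd.TimedeltaIndex(obj)
    # Anything else is returned untouched.
    return obj


scalar = maybe_upcast_for_op(timedelta(hours=1))
array = maybe_upcast_for_op(np.array([1, 2], dtype="timedelta64[ns]"))
other = maybe_upcast_for_op(5)
print(type(scalar).__name__, type(array).__name__, other)
```

Normalising the operand once up front lets the later dispatch code deal only with pandas types, which is the simplification being discussed.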
maybe re-route the scalars to a separate path. This is getting way special casey here.
I'll implement the maybe_upcast idea and see if that is prettier
This will probably need rebasing after #22350 goes through.
pandas/core/ops.py
Outdated
index=left.index, name=res_name,
dtype=result.dtype)

elif type(right) is datetime.timedelta:
this case you can include above (before what you added), no?
    if isinstance(right, (datetime.timedelta, Timedelta)):
        right = Timedelta(right)
        ....
The code above casts to `TimedeltaIndex`, not `Timedelta`.
Implemented
minor doc-comment. rebase and ping on green.
ping
thanks! nice cleaning
Kind of surprising, but I don't see any Issues for these bugs. Might be because many of them were found after parametrizing arithmetic tests, so instead of Issues we had xfails.
xref #20088
xref #22163
git diff upstream/master -u -- "*.py" | flake8 --diff