BUG:: Series of timedelta + TimedeltaIndex gets casted to int64 #17250

azjps · 2017-08-14T17:56:12Z

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd, numpy as np

In [2]: df = pd.DataFrame({'td': [np.timedelta64(0, 'ns')]})

In [3]: tds = pd.TimedeltaIndex(['0 days'], dtype='timedelta64[ns]')

In [4]: df.td + tds  # unexpected, returns dtype=int64!
Out[4]: 
0    0
Name: td, dtype: int64

In [5]: df.td + tds._data  # expected, returns dtype=timedelta64
Out[5]: 
0   0 days
Name: td, dtype: timedelta64[ns]

In [6]: df.td + tds.values
Out[6]: 
0   0 days
Name: td, dtype: timedelta64[ns]

In [7]: df["td"].dt.values + tds
Out[7]: TimedeltaIndex(['0 days'], dtype='timedelta64[ns]', freq=None)

Problem description

When adding a TimedeltaIndex to a Series with dtype=timedelta64, one would expect the output to have dtype=timedelta64, but instead it is cast to dtype=int64. This is easy to work around by calling .astype("timedelta64[ns]") or by using the underlying NumPy array TimedeltaIndex.values, so it is pretty minor, but I figured I'd report it. I believe this regression was introduced around 0.18; the above code works as expected in 0.15.2.
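For reference, the two workarounds mentioned in the description could look roughly like the sketch below. The variable names are made up for illustration, and on pandas versions where the addition already returns timedelta64 the `astype` call should leave the values unchanged.

```python
import pandas as pd

# A timedelta64 Series and a TimedeltaIndex of the same length.
ser = pd.Series(pd.to_timedelta(["0 days"]), name="td")
tdi = pd.to_timedelta(["0 days"])

# Workaround 1: add the underlying NumPy array instead of the index itself.
via_values = ser + tdi.values

# Workaround 2: cast the result of the addition back to timedelta64[ns].
via_astype = (ser + tdi).astype("timedelta64[ns]")

print(via_values.dtype, via_astype.dtype)
```

Either form sidesteps the Series + TimedeltaIndex code path that is reported here as returning int64.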

Output of `pd.show_versions()`

In [14]: pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.75-el6.x86_64.lime.1
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: None.None

pandas: 0.20.3
pytest: None
pip: 7.1.0
setuptools: 19.4
Cython: None
numpy: 1.13.1
scipy: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_gbq: None
pandas_datareader: None


jreback · 2017-08-15T10:00:21Z

Hmm, this works when the TimedeltaIndex is wrapped in a Series, so we must not be converting it somewhere. You're welcome to have a look. Note the current version of pandas is 0.20.3.

jreback · 2017-08-15T10:02:12Z

simple repro

In [9]: pd.Series(pd.to_timedelta(['1 day'])) + pd.to_timedelta(['1 day'])
Out[9]: 
0    172800000000000
dtype: int64

In [10]: pd.Series(pd.to_timedelta(['1 day'])) + pd.Series(pd.to_timedelta(['1 day']))
Out[10]: 
0   2 days
dtype: timedelta64[ns]

sinhrks · 2017-09-10T06:07:36Z

It is caused by the Series constructor, which ignores the dtype keyword if the input is an Index. Is it OK to always prioritize the dtype keyword?

# NG, should be timedelta64
pd.Series(pd.Int64Index([1, 2, 3]), dtype=np.timedelta64)
# 0    1
# 1    2
# 2    3
# dtype: int64

# OK
pd.Series(np.array([1, 2, 3], dtype=np.int64), dtype=np.timedelta64)
# 0   00:00:00.000000
# 1   00:00:00.000000
# 2   00:00:00.000000
# dtype: timedelta64[ns]
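Independent of how the constructor treats the dtype keyword, one way to avoid the ambiguity is to convert the integer Index explicitly before building the Series. This is only a sketch of that alternative, not a description of the eventual fix; it assumes the integers are meant as nanoseconds, and it uses `pd.Index` with an explicit int64 dtype in place of `pd.Int64Index`.

```python
import pandas as pd

# Integer index whose values are assumed to be nanosecond counts.
idx = pd.Index([1, 2, 3], dtype="int64")

# Convert explicitly, stating the unit, then wrap the result in a Series.
tdi = pd.to_timedelta(idx, unit="ns")
ser = pd.Series(tdi)

print(ser.dtype)
```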

jreback · 2017-09-10T06:15:59Z

Actually, both of these examples should raise,
as np.timedelta64 is not specific enough (IIRC we raise for np.datetime64).

Of course, with ns precision specified, both of these should work.
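To illustrate what "not specific enough" means at the NumPy level, the sketch below inspects the unit carried by each dtype with `np.datetime_data`. It shows only the NumPy side and makes no claim about how pandas validates the dtype internally.

```python
import numpy as np

# The bare type carries no unit; the string form names one explicitly.
generic = np.dtype(np.timedelta64)
nanos = np.dtype("timedelta64[ns]")

# np.datetime_data returns a (unit, count) pair for each dtype.
print(np.datetime_data(generic))
print(np.datetime_data(nanos))
```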

jreback · 2017-09-10T14:10:09Z

So the first one here is wrong as well (though it shows a deprecation warning, which is good); we should do the same for timedelta64.

In [15]: pd.Series(pd.Int64Index([1, 2, 3]), dtype=np.datetime64)
    ...: 
    ...: 
/Users/jreback/miniconda3/envs/pandas/bin/ipython:1: FutureWarning: Passing in 'datetime64' dtype with no frequency is deprecated and will raise in a future version. Please pass in 'datetime64[ns]' instead.
  #!/Users/jreback/miniconda3/envs/pandas/bin/python
Out[15]: 
0    1
1    2
2    3
dtype: int64

In [16]: pd.Series(np.array([1, 2, 3], dtype=np.int64), dtype=np.datetime64)
/Users/jreback/miniconda3/envs/pandas/bin/ipython:1: FutureWarning: Passing in 'datetime64' dtype with no frequency is deprecated and will raise in a future version. Please pass in 'datetime64[ns]' instead.
  #!/Users/jreback/miniconda3/envs/pandas/bin/python
Out[16]: 
0   1970-01-01 00:00:00.000000001
1   1970-01-01 00:00:00.000000002
2   1970-01-01 00:00:00.000000003
dtype: datetime64[ns]

jbrockmendel · 2017-12-29T05:09:20Z

All the examples listed here now work on master.

jreback · 2017-12-29T05:29:00Z

can u do a PR which adds this to the appropriate whatsnew note

and add this example as a test (or point to a similar test)
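A test along these lines might look like the sketch below, based on the simple repro earlier in the thread. The function name is hypothetical and this is not the test that was eventually added to pandas; it only checks that Series + TimedeltaIndex matches Series + Series and keeps a timedelta dtype.

```python
import pandas as pd


def test_series_plus_timedeltaindex_keeps_timedelta_dtype():
    # Hypothetical test name; mirrors the repro from this issue.
    ser = pd.Series(pd.to_timedelta(["1 day"]))
    tdi = pd.to_timedelta(["1 day"])

    result = ser + tdi
    expected = ser + pd.Series(tdi)

    # The result should stay a timedelta rather than fall back to int64.
    assert result.dtype.kind == "m"
    pd.testing.assert_series_equal(result, expected)
```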

jbrockmendel · 2017-12-29T06:24:22Z

> can u do a PR which adds this to the appropriate whatsnew note

Moratorium on new PRs for a little while. This is listed in #18824, so it'll happen eventually.

gfyoung added the Timedelta Timedelta data type label Aug 14, 2017

jreback added Bug Difficulty Intermediate labels Aug 15, 2017

jreback added this to the Next Major Release milestone Aug 15, 2017

jreback changed the title ~~Minor: TimedeltaIndex + timedelta series gets casted to int64~~ BUG:: Series of timedelta + TimedeltaIndex gets casted to int64 Aug 15, 2017

jreback added the Datetime Datetime data dtype label Sep 10, 2017

jbrockmendel mentioned this issue Dec 18, 2017

DataFrame vs Series vs Index arithmetic Roundup #18824

Closed

59 tasks

jbrockmendel mentioned this issue Jan 3, 2018

Tests for TDI issues already fixed #19044

Merged

3 tasks

jreback closed this as completed in #19044 Jan 5, 2018
