BUG: .dt accessor not available after loc-assignment #53172

Closed
3 tasks done
mj0nez opened this issue May 10, 2023 · 2 comments
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

Comments

@mj0nez

mj0nez commented May 10, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

records = [
    {"ts": "2022-01-01 00:00:00"},
    {"ts": "2022-01-01 00:15:00"},
    {"ts": "2022-01-01 00:30:00"},
]

df = pd.DataFrame(records)

pd.to_datetime(df.loc[:,"ts"]).dt  # accessor available

df.loc[:,"ts"] = pd.to_datetime(df.loc[:,"ts"])

df.loc[:,"ts"].dt  # raises an AttributeError

Issue Description

After upgrading pandas from 1.5.3, I ran into the traceback below.
When a Series of Timestamps is assigned to a DataFrame column via loc, the dt accessor is no longer available on that column.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[18], line 1
----> 1 df.loc[:,"ts"].dt

File c:\Users\mj0nez\repositories\project\.venv\Lib\site-packages\pandas\core\generic.py:5989, in NDFrame.__getattr__(self, name)
   5982 if (
   5983     name not in self._internal_names_set
   5984     and name not in self._metadata
   5985     and name not in self._accessors
   5986     and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5987 ):
   5988     return self[name]
-> 5989 return object.__getattribute__(self, name)

File c:\Users\mj0nez\repositories\project\.venv\Lib\site-packages\pandas\core\accessor.py:224, in CachedAccessor.__get__(self, obj, cls)
    221 if obj is None:
    222     # we're accessing the attribute of the class, i.e., Dataset.geo
    223     return self._accessor
--> 224 accessor_obj = self._accessor(obj)
    225 # Replace the property with the accessor object. Inspired by:
    226 # https://www.pydanny.com/cached-property.html
    227 # We need to use object.__setattr__ because we overwrite __setattr__ on
    228 # NDFrame
    229 object.__setattr__(obj, self._name, accessor_obj)

File c:\Users\mj0nez\repositories\project\.venv\Lib\site-packages\pandas\core\indexes\accessors.py:580, in CombinedDatetimelikeProperties.__new__(cls, data)
    577 elif is_period_dtype(data.dtype):
    578     return PeriodProperties(data, orig)
--> 580 raise AttributeError("Can only use .dt accessor with datetimelike values")

AttributeError: Can only use .dt accessor with datetimelike values

In contrast to the example above, the accessor is available after the following assignment:

df["ts"] = pd.to_datetime(df["ts"])

Is this a user error or a bug?

Expected Behavior

Because the values in this column are Timestamps, the dt accessor should be available.

Installed Versions

INSTALLED VERSIONS

commit : a90fbc8
python : 3.11.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19045
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : de_DE.cp1252

pandas : 2.1.0.dev0+748.ga90fbc867c
numpy : 1.25.0.dev0+1357.ga2d21d8ac
pytz : 2023.3
dateutil : 2.8.2
setuptools : 65.5.0
pip : 23.1.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

@mj0nez mj0nez added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels May 10, 2023
@asishm-wk

asishm-wk commented May 11, 2023

I think there are other similar open issues.

The core issue/behavior change here is that df.loc[indexer, col] = something tries to avoid changing the dtype of the original series (even when the indexer selects every row). I believe df['ts'].dtype would still show it as object.
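
A minimal sketch for checking this on your own installation, reusing the records from the report. The names df_loc, df_col and minutes are only for illustration. What the loc assignment prints depends on the pandas version (the report describes a change relative to 1.5.3), so the sketch prints that dtype instead of assuming it. The bracket assignment is the one the reporter confirmed keeps the accessor available.

```python
import pandas as pd

records = [
    {"ts": "2022-01-01 00:00:00"},
    {"ts": "2022-01-01 00:15:00"},
    {"ts": "2022-01-01 00:30:00"},
]

# Full-column loc assignment: pandas may write the new values into the
# existing column, in which case the original dtype is kept.
# The result is version dependent, so print it rather than assume it.
df_loc = pd.DataFrame(records)
df_loc.loc[:, "ts"] = pd.to_datetime(df_loc.loc[:, "ts"])
print("after loc assignment:", df_loc["ts"].dtype)

# Bracket assignment replaces the column with the new Series, so the
# column takes the datetime dtype and the .dt accessor is available.
df_col = pd.DataFrame(records)
df_col["ts"] = pd.to_datetime(df_col["ts"])
print("after [] assignment:", df_col["ts"].dtype)

minutes = df_col["ts"].dt.minute.tolist()
print(minutes)
```

If the first print shows object, the .dt accessor raises the AttributeError from the report, and replacing the column with df["ts"] = ... is the workaround.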

@mj0nez
Author

mj0nez commented May 11, 2023

You are right! The issue seems to be a duplicate of #52593

@mj0nez mj0nez closed this as completed May 11, 2023