BUG: DataFrame created with tzinfo cannot use to_dict(orient="records") #18372

Closed
bolkedebruin opened this issue Nov 19, 2017 · 7 comments · Fixed by #18416
Labels
Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode Timezones Timezone data dtype

@bolkedebruin
Contributor

bolkedebruin commented Nov 19, 2017

Code Sample, a copy-pastable example if possible

import psycopg2
import datetime
import pandas as pd

data = [(datetime.datetime(2017, 11, 18, 21, 53, 0, 219225, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=60, name=None)),), (datetime.datetime(2017, 11, 18, 22, 6, 30, 61810, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=60, name=None)),)]

df = pd.DataFrame(list(data))

df.to_dict(orient='records')  # fails
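For readers without psycopg2 installed, here is a rough variant of the sample that swaps in the standard library's `datetime.timezone` for `psycopg2.tz.FixedOffsetTimezone`. That substitution is an assumption and was not part of the original report, so it may not trigger the same failure on pandas 0.21.0. On a release that contains the fix it should simply return one dict per row.

```python
import datetime

import pandas as pd

# Assumption: a stdlib fixed-offset timezone (+01:00) stands in for
# psycopg2.tz.FixedOffsetTimezone(offset=60, name=None) from the report.
tz = datetime.timezone(datetime.timedelta(minutes=60))

data = [
    (datetime.datetime(2017, 11, 18, 21, 53, 0, 219225, tzinfo=tz),),
    (datetime.datetime(2017, 11, 18, 22, 6, 30, 61810, tzinfo=tz),),
]

# One column of timezone-aware datetimes, two rows.
df = pd.DataFrame(data)

# On an affected version this is the call that raised TypeError.
records = df.to_dict(orient="records")
print(records)
```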

Problem description

The above code fails in pandas 0.21.0 with

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-34cf441e3f50> in <module>()
----> 1 df.to_dict(orient='records')

/Users/bolke/Documents/dev/airflow_env/lib/python2.7/site-packages/pandas-0.21.0-py2.7-macosx-10.12-x86_64.egg/pandas/core/frame.pyc in to_dict(self, orient)
    897             return [dict((k, _maybe_box_datetimelike(v))
    898                          for k, v in zip(self.columns, row))
--> 899                     for row in self.values]
    900         elif orient.lower().startswith('i'):
    901             return dict((k, v.to_dict()) for k, v in self.iterrows())

TypeError: izip argument #2 must support iteration

Expected Output

Dict of records

Output of pd.show_versions()


INSTALLED VERSIONS

commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Darwin
OS-release: 17.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.21.0
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.3
scipy: None
pyarrow: None
xarray: None
IPython: 5.5.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: 1.1.4
pymysql: None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@jreback jreback added Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode Timezones Timezone data dtype labels Nov 19, 2017
@jreback jreback added this to the Next Major Release milestone Nov 19, 2017
@jreback
Contributor

jreback commented Nov 19, 2017

yeah this looks buggy. a PR to fix would be great!

@bolkedebruin
Contributor Author

If you can give me some hints where to look that would be appreciated. I ‘hacked’ the code in core/frame.py to check for iterable, but I don’t think that is really the fix (or is it?).

@jreback
Contributor

jreback commented Nov 19, 2017

see also https://github.com/pandas-dev/pandas/pull/18167/files

diff --git a/pandas/core/frame.py b/pandas/core/frame.py
index 7145fa7..02a1cf4 100644
--- a/pandas/core/frame.py
+++ b/pandas/core/frame.py
@@ -994,7 +994,7 @@ class DataFrame(NDFrame):
         elif orient.lower().startswith('r'):
             return [into_c((k, _maybe_box_datetimelike(v))
                            for k, v in zip(self.columns, row))
-                    for row in self.values]
+                    for row in np.atleast_2d(self.values)]
         elif orient.lower().startswith('i'):
             return into_c((k, v.to_dict(into)) for k, v in self.iterrows())
         else:

The issue is that a frame with a single datetimetz column returns a 1-d array (rather than a 2-d array) from .values, so iterating over it yields scalars instead of rows and zip fails. This is a more general issue; the change above works around it here.
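The 1-d versus 2-d point can be illustrated with NumPy alone. The sketch below is not the pandas source and not the change that was eventually merged; `to_records` is a hypothetical stand-in for the list comprehension in `to_dict`, and the object array stands in for what `.values` returned. It also shows one thing to watch for with `np.atleast_2d`: a 1-d array of length n becomes shape (1, n), a single row, whereas one column of n rows would need shape (n, 1).

```python
import datetime

import numpy as np

tz = datetime.timezone(datetime.timedelta(minutes=60))
columns = ["ts"]

# Stand-in for the 1-d result of .values on a single datetimetz column.
values_1d = np.array(
    [
        datetime.datetime(2017, 11, 18, 21, 53, tzinfo=tz),
        datetime.datetime(2017, 11, 18, 22, 6, tzinfo=tz),
    ],
    dtype=object,
)


def to_records(rows):
    """Hypothetical helper mirroring the orient='records' comprehension."""
    return [dict(zip(columns, row)) for row in rows]


# Iterating a 1-d array yields scalars, and zip cannot iterate a scalar.
try:
    to_records(values_1d)
    failed = False
except TypeError:
    failed = True

as_row = np.atleast_2d(values_1d)     # shape (1, 2): one row, two values
as_column = values_1d.reshape(-1, 1)  # shape (2, 1): two rows, one value

records_row = to_records(as_row)      # one record; zip drops the second value
records_col = to_records(as_column)   # two records, one per original row

print(failed, as_row.shape, as_column.shape)
print(len(records_row), len(records_col))
```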

@jreback
Contributor

jreback commented Nov 19, 2017

xref #13407

@bolkedebruin
Contributor Author

Thx! I’ll have a look (to add tests) - or are you applying this patch yourself?

@jreback
Contributor

jreback commented Nov 19, 2017

nope, would love a PR (with a test and a whatsnew note)

@bolkedebruin
Contributor Author

Will do. I will need to change your code slightly as it is not the proper fix, but it put me on the right path

bolkedebruin added a commit to bolkedebruin/incubator-superset that referenced this issue Nov 20, 2017
A bug in to_dict(orient="records") in pandas/core/frame.py prevents
datetimes with time zones to be worked with. This works around the
issue in superset by re-implementing the logic of pandas in the
correct way. Until pandas fixes the issue this code should stay.

pandas-dev/pandas#18372

This closes apache#1929
mistercrunch pushed a commit to apache/superset that referenced this issue Nov 20, 2017
bolkedebruin added a commit to bolkedebruin/pandas that referenced this issue Nov 21, 2017
…cords")

Columns with datetimetz are not returning arrays.

Closes pandas-dev#18372
@jreback jreback modified the milestones: Next Major Release, 0.22.0, 0.21.1 Nov 22, 2017
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this issue Dec 8, 2017
TomAugspurger pushed a commit that referenced this issue Dec 11, 2017