to_dict() returns the elements of a Series as different types depending on whether the Series was created directly or was obtained from a DataFrame via, e.g., loc[0]:
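A minimal sketch of the comparison, assuming a small mixed-type Series. The names s and df follow the discussion here; the values are illustrative, and the exact types printed may depend on the pandas version:

```python
import pandas as pd

# Hypothetical mixed-type Series: a string and an integer, so its dtype is object.
s = pd.Series(["a", 1])

# Wrap the Series in a one-row DataFrame, then pull the row back out.
df = pd.DataFrame([s])

from_series = s.to_dict()[1]
from_row = df.loc[0].to_dict()[1]

# Compare the element types produced by the two routes.
print(type(from_series))
print(type(from_row))
```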
Note that the number is a base int when extracted with s.to_dict(), but it's a numpy.int64 when extracted from df.loc[0]. The same inconsistency applies to tolist().
Is this inconsistency a feature or a bug? And if it's a feature, does anyone know how I can reliably extract the values of a row from a DataFrame as base Python types, using either to_dict() or tolist()?
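One possible workaround, sketched here as an assumption rather than an official recipe: convert any NumPy scalar in the result with its .item() method. The helper name to_native and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def to_native(value):
    """Return a built-in Python object for NumPy scalars; pass anything else through."""
    # np.generic is the base class of NumPy scalar types such as np.int64;
    # .item() converts such a scalar to the closest built-in Python type.
    return value.item() if isinstance(value, np.generic) else value


df = pd.DataFrame([pd.Series(["a", 1])])  # illustrative one-row frame

# Normalize every value in the row dictionary.
row = {key: to_native(value) for key, value in df.loc[0].to_dict().items()}
print({key: type(value) for key, value in row.items()})
```

The same helper can be mapped over the result of tolist().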
output of pd.show_versions()
commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-56-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
mikepqr changed the title from "Inconsistent types in output of series.to_dict() and df.loc[0].to_dict()" to "Inconsistent types in output of series.to_dict() and DataFrame([s]).loc[0].to_dict()" on Jul 28, 2016
mikepqr changed the title from "Inconsistent types in output of series.to_dict() and DataFrame([s]).loc[0].to_dict()" to "Inconsistent types in output of series.to_dict() and DataFrame([series]).loc[0].to_dict()" on Jul 28, 2016
My use case was the same as #9108: I wanted to assemble a somewhat complicated JSON object that contains things that aren't only in the DataFrame. pd.io.json.dumps on the dictionary works fine. So my problem is solved. Thanks!
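A rough sketch of a JSON-based route for this kind of use case, assuming the row is serialized with Series.to_json and parsed back with the standard json module (this swaps in to_json for the pd.io.json.dumps call mentioned above; the payload fields are hypothetical):

```python
import json

import pandas as pd

df = pd.DataFrame([pd.Series(["a", 1])])  # illustrative one-row frame

# Serialize the row to JSON text, then parse it back into built-in Python types.
# Note that the index labels come back as string keys.
row = json.loads(df.loc[0].to_json())

# Combine the row with data that does not live in the DataFrame.
payload = {"source": "example", "row": row}
print(json.dumps(payload))
```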
Just out of interest then: is the inconsistency in the types being stored in s.iloc[1] and df.iloc[0, 1] correct behaviour?
Yeah, I think so. A Series has to have a single dtype, which must be object in this case since you have mixed types (not a good idea in general). That means we can't optimize to a NumPy dtype, so the elements stay as the Python objects you passed in. When you go to a DataFrame, each column can have its own dtype, which will use NumPy if possible. A good comparison is pd.Series([1, 2]), which does use NumPy ints even though you pass in Python ints.
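To see the difference described above, here is a small sketch with illustrative values. The variable names are made up for the example, and the printed dtypes may vary by pandas version:

```python
import pandas as pd

ints = pd.Series([1, 2])      # homogeneous Python ints
mixed = pd.Series(["a", 1])   # mixed types, so the dtype falls back to object
df = pd.DataFrame([mixed])    # each column gets its own dtype

# Homogeneous Series: integer dtype, elements come back as NumPy scalars.
print(ints.dtype, type(ints.iloc[0]))

# Mixed Series: object dtype, elements are the original Python objects.
print(mixed.dtype, type(mixed.iloc[1]))

# Per-column dtypes of the one-row DataFrame.
print(df.dtypes)
```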
pd.show_versions()
pandas: 0.18.1
nose: None
pip: 1.5.6
setuptools: 12.2
Cython: 0.24
numpy: 1.11.0
scipy: 0.17.1
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None