DataFrame.to_dict returning numpy scalars in certain cases #23753

jorisvandenbossche · 2018-11-17T14:36:25Z

I think in general we try to return Python scalars rather than NumPy scalars from to_dict (as we do in tolist or when iterating).

Eg:

In [27]: df = pd.DataFrame({'a': [1, 2], 'b': [.1, .2]})

In [28]: df.to_dict()
Out[28]: {'a': {0: 1, 1: 2}, 'b': {0: 0.1, 1: 0.2}}

In [29]: type(df.to_dict()['a'][0])
Out[29]: int

However, this is not consistent. For example, when using orient='records':

In [31]: df.to_dict(orient='records')
Out[31]: [{'a': 1.0, 'b': 0.10000000000000001}, {'a': 2.0, 'b': 0.20000000000000001}]

In [32]: type(df.to_dict(orient='records')[0]['a'])
Out[32]: numpy.float64

In this case, that is because the 'records' implementation iterates over self.values, which upcasts the int and float columns to a single float64 array. (It also means that if you have a string column, self.values will be object dtype, and you actually get Python scalars.)
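The upcasting can be checked directly. A small sketch using the same frame as above; the `mixed` frame with a string column is an added illustration, not part of the original report:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [.1, .2]})

# The int64 and float64 columns are combined into one float64 array,
# so every element taken from it is a numpy.float64.
values = df.values
values_dtype = values.dtype
first_elem_type = type(values[0][0])

# With a string column there is no common numeric dtype,
# so the combined array falls back to object dtype.
mixed = pd.DataFrame({'a': [1, 2], 'c': ['x', 'y']})
mixed_dtype = mixed.values.dtype

print(values_dtype, first_elem_type, mixed_dtype)
```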

There are a bunch of other issues related to iteration (e.g. #20791, #13468), but I didn't see one specifically about to_dict.


jreback · 2018-11-17T14:53:40Z

pretty sure this is a duplicate issue

jorisvandenbossche · 2018-11-17T14:59:06Z

As I said, I searched for it but didn't see one directly. But if you find one, happy to close this as a duplicate.

For iteration there are other issues, but for to_dict the problem is not only due to iteration over pandas objects; depending on the orient, it also comes from iterating over NumPy arrays, so I think it deserves its own issue.

jorisvandenbossche · 2018-11-17T17:24:11Z

Not directly related to this issue, but: an option to convert missing values to None would also be nice for my use case. That might add quite some complexity to the implementation, though (and you can do it yourself relatively easily).
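One way to do this yourself is to post-process the output, sketched here. `records_with_none` is a hypothetical helper, not a pandas function, and it assumes every cell holds a scalar (not a list or array):

```python
import pandas as pd

def records_with_none(frame):
    """Return to_dict(orient='records') output with missing values as None."""
    records = frame.to_dict(orient='records')
    return [
        # pd.isna is True for NaN, None and NaT when given a scalar.
        {key: (None if pd.isna(value) else value) for key, value in row.items()}
        for row in records
    ]

df = pd.DataFrame({'a': [1.0, None], 'b': ['x', None]})
result = records_with_none(df)
print(result)
```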

bourbaki · 2018-11-25T17:50:43Z

@jreback I am working on this issue. Its source is the use of the DataFrame.values property in most of the to_dict orientations. DataFrame.values gathers the data from all columns and converts it to a single typed NumPy ndarray.
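Until a fix lands, the upcast can be avoided by converting each column separately. This is a rough sketch of a user-side workaround, not the change made in the pandas code base; `records_python_scalars` is a hypothetical helper and it assumes unique column names:

```python
import pandas as pd

def records_python_scalars(frame):
    """Build 'records' output column by column so dtypes are not mixed."""
    columns = list(frame.columns)
    # Series.tolist() converts each column on its own, so an int64
    # column yields Python ints instead of being upcast to float64.
    column_values = [frame[col].tolist() for col in columns]
    return [dict(zip(columns, row)) for row in zip(*column_values)]

df = pd.DataFrame({'a': [1, 2], 'b': [.1, .2]})
records = records_python_scalars(df)
print(records, type(records[0]['a']))
```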

jorisvandenbossche added the Bug label Nov 17, 2018

bourbaki mentioned this issue Nov 26, 2018

Proper boxing of scalars in DataFrame.to_dict #23921

Merged

4 tasks

gfyoung added the Dtype Conversions, Numeric Operations, and DataFrame labels Nov 26, 2018

jreback modified the milestones: 0.24.1, 0.24.0 Dec 2, 2018

gfyoung closed this as completed in #23921 Dec 2, 2018

gfyoung pushed a commit that referenced this issue Dec 2, 2018

Proper boxing of scalars in DataFrame.to_dict (#23921)

92d25f0

Closes gh-23753

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this issue Feb 28, 2019

Proper boxing of scalars in DataFrame.to_dict (pandas-dev#23921)

c85d7ff

Closes pandas-devgh-23753

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this issue Feb 28, 2019

Proper boxing of scalars in DataFrame.to_dict (pandas-dev#23921)

d4393b7

Closes pandas-devgh-23753

boydgreenfield mentioned this issue Apr 3, 2019

Series iteration and to_dict methods *sometimes* return underlying storage type vs. Python object #25969

Open

maximz mentioned this issue Jul 26, 2019

to_dict() on a boolean series sometimes returns numpy types instead of Python types #27616

Open
