BUG: Incorrect values shown by pivot_table() #21378

uds5501 · 2018-06-08T06:46:50Z

Code Sample, a copy-pastable example if possible

I was going through issue #21370 and started out using non-np.nan categories, but then came across the output below. Moreover, using the initial code as specified there, I still get NaN outputs, even though PR #21252 (BUG: dropna incorrect with categoricals in pivot_table) was meant to address that.

>>> import pandas as pd
>>> pd.__version__
'0.23.0'
>>> df=pd.DataFrame({"A":pd.Categorical(['left','low','high','low','high'],categories=['low','high','left'],ordered=True),"B":range(5)})

>>> result = df.pivot_table(index='A', values='B')
>>> result
      B
A      
left  2
low   3
high  0

Problem description

I am new to pivot tables, but I believe the default aggregation (mean) is showing incorrect values: they appear to be cyclically shifted by one position relative to the index labels. Note that the same thing happens with almost every aggregation function.

Expected Output

      B
A      
left  0
low   2
high  3
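As a sanity check on the expected values, the per-category means can be recomputed by hand without pandas. This is only an illustrative sketch; the names `labels`, `values`, `groups`, and `means` are made up here and do not come from the issue.

```python
from collections import defaultdict

# Same data as the DataFrame in the code sample: column A and column B.
labels = ["left", "low", "high", "low", "high"]
values = range(5)

# Collect the B values that belong to each label in A.
groups = defaultdict(list)
for label, value in zip(labels, values):
    groups[label].append(value)

# The default pivot_table aggregation is the mean of each group.
means = {label: sum(vals) / len(vals) for label, vals in groups.items()}
print(means)  # {'left': 0.0, 'low': 2.0, 'high': 3.0}
```

So 'left' should map to 0, 'low' to 2, and 'high' to 3, whatever order the rows are displayed in.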

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.6.4.final.0
python-bits: 32
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: English_India.1252

pandas: 0.23.0
pytest: 3.3.1
pip: 10.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.1
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.2.7
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None


gfyoung · 2018-06-08T09:06:47Z

@uds5501 : Have you tried your code against master? From the code sample, it looks like you're still using 0.23.0, which wouldn't have the patch.

uds5501 · 2018-06-08T09:38:18Z

@gfyoung here is the output after building from master

>>> import pandas as pd
>>> pd.__version__
'0.24.0.dev0+51.g4274b840e'
>>> df=pd.DataFrame({"A":pd.Categorical(['left','low','high','low','high'],categories=['low','high','left'],ordered=True),"B":range(5)})
>>> result = df.pivot_table(index='A', values='B')
>>> result
      B
A
left  2
low   3
high  0

gfyoung · 2018-06-08T09:48:59Z

Weird, I actually get your expected output when running the code you just provided.

uds5501 · 2018-06-08T09:52:05Z

@gfyoung I'm recloning master. I will try again once it is set up and let you know.

uds5501 · 2018-06-08T10:47:16Z

Okay, I did a fresh build from master and it works. Thanks for the guidance @gfyoung

>>> import pandas as pd
>>> pd.__version__
'0.24.0.dev0+72.gabfac97b2'
>>> df=pd.DataFrame({"A":pd.Categorical(['left','low','high','low','high'],categories=['low','high','left'],ordered=True),"B":range(5)})
>>> result = df.pivot_table(index='A', values='B')
>>> result
      B
A
low   2
high  3
left  0

gfyoung · 2018-06-08T17:17:52Z

@uds5501 : Actually, could you add your example as a test? That would be the best way to close this out.

uds5501 · 2018-06-08T17:42:51Z

@gfyoung Okay, I will do that soon

gfyoung added the Groupby and Categorical (Categorical Data Type) labels Jun 8, 2018

uds5501 closed this as completed Jun 8, 2018

gfyoung reopened this Jun 8, 2018

gfyoung added the Testing (pandas testing functions or related to the test suite) label Jun 8, 2018

uds5501 added a commit to uds5501/pandas that referenced this issue Jun 8, 2018

Update test_pivot.py

27852bc

BUG: Incorrect values shown by pivot_table() pandas-dev#21378

uds5501 mentioned this issue Jun 8, 2018

TST: adding test cases for verifying correct values shown by pivot_table() #21378 #21393

Merged

4 tasks

jreback added this to the 0.23.2 milestone Jun 13, 2018

jorisvandenbossche modified the milestones: 0.23.2, 0.24.0 Jun 14, 2018

jreback closed this as completed in #21393 Jun 15, 2018

jreback pushed a commit that referenced this issue Jun 15, 2018

TST: adding test cases for verifying correct values shown by pivot_ta…

9e982e1

…ble() #21378 (#21393)

david-liu-brattle-1 pushed a commit to david-liu-brattle-1/pandas that referenced this issue Jun 18, 2018

TST: adding test cases for verifying correct values shown by pivot_ta…

b9650a1

…ble() pandas-dev#21378 (pandas-dev#21393)

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018

TST: adding test cases for verifying correct values shown by pivot_ta…

8967d3a

…ble() pandas-dev#21378 (pandas-dev#21393)
