multiindex column in to_excel #2701

hayd · 2013-01-15T19:39:58Z

link to #1651

Saving a multiindex column to_excel saves a sparse index.

Migrated from this StackOverflow question, with a smaller DataFrame.

In [1]: m = MultiIndex.from_tuples([(1,1),(1,2)], names=['a','b'])

In [2]: df = DataFrame([[1,2],[3,4]], columns=m)

In [3]: df
Out[3]: 
a  1   
b  1  2
0  1  2
1  3  4

In [4]: df.to_excel('test.xls')

Saves the xls:

    1.1 .2
0   1   2
1   3   4

This differs from how to_csv (which is not sparse):

,"(1L, 1L)","(1L, 2L)"
0,1,2
1,3,4

The text was updated successfully, but these errors were encountered:

joeb1415 · 2013-03-22T21:07:50Z

Both to_excel and to_csv should allow a multiindex on the columns to print on multiple rows

jreback · 2013-03-22T21:17:25Z

I guess then read_csv would have to take a list for the header argument to reconstruct the mi

ghost · 2013-03-22T21:56:41Z

#2478, this is already hiding in the codebase, just needs some TLC.

jreback · 2013-09-21T17:07:54Z

@hayd @jtratner IIRC this should still be open, right?

jtratner · 2013-09-21T19:38:28Z

yes, definitely.

jmcnamara · 2013-11-10T10:25:00Z

I think that has been fixed by #5423:

import pandas as pd

m = pd.MultiIndex.from_tuples([(1,1),(1,2)], names=['a','b'])
df = pd.DataFrame([[1,2],[3,4]], columns=m)

df.to_excel('test.xls')

Output:

Can we close this?

mappingvermont · 2017-04-27T19:34:43Z

Hi all,

I'm trying to write a multiindex dataframe to Excel using the example above, and not getting the same results.

Code:

import pandas as pd

m = pd.MultiIndex.from_tuples([(1,1),(1,2)], names=['a','b'])
df = pd.DataFrame([[1,2],[3,4]], columns=m)

df.to_excel('test.xls')

Output:

There's an extra line being added-- different than the output generated above.

I'm using pandas version 0.19.2 on Ubuntu 14.04 and have also experienced this on Windows 10. I've tried this using pandas version 0.15.0 and it works properly, replicating the output above:

Is this a bug? And if so, can we re-open this issue?

Thanks,

Charlie

juandavid-bolanos · 2017-07-18T21:50:26Z

@mappingvermont I'm getting the same result with a real example. I think it is a bug. Using latest version pandas 0.19.2

ronanpaixao · 2017-10-25T13:20:39Z

@jreback Can this be reopened? I also get this problem.

I've commented this on #6618, which seems related. From what I can see, the problem is that pandas is reserving the first column of the header rows to the MultiIndex, and creates a new line for the row index name even if it is unnamed:

>>> df = pd.DataFrame([[1,2,3],[4,5,6]], columns=pd.MultiIndex.from_tuples([('A'
,''),('B','C'),('B','D')]))
>>> df.to_excel("out.xlsx", index_label="Foo")

This bug was fixed some time ago for simple columns, but it was probably not a very good solution, since it is still buggy for MultiIndex columns:

>>> df = pd.DataFrame([[1,2,3],[4,5,6]], columns=['A','B','C'])
>>> df.to_excel("out.xlsx", index_label="Foo")

Version information:

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 32
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Tangjiahui26 · 2018-09-14T21:29:21Z

@jreback still got the same problem for the multi-index, wish this issue can be reopened.

etiennecaldo · 2019-05-09T08:34:00Z

Same here, I will open a new issue and try to find the problem

Ronkiro · 2019-06-06T16:13:18Z

@etiennecaldo Did you open the issue?
I'm having the same problem with the latest version of pandas.

etiennecaldo · 2019-06-11T20:20:02Z

@Ronkiro no I workarounded it on my side...!

ferdiaoh · 2019-07-23T09:43:32Z

This still appears to be an issue for multi-index columns (pandas 0.24.2) - is there any way this can be re-opened?

data = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
multi_index = pd.MultiIndex.from_product([['a', 'b'], ['one', 'two', 'three']])
df = pd.DataFrame(data, columns=multi_index_one, index=[1, 2])

df.to_excel('test_files/test.xlsx')

pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Darwin
OS-release: 18.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_IE.UTF-8
pandas: 0.24.2
pytest: 4.1.1
pip: 18.1
setuptools: 40.6.3
Cython: 0.28.2
numpy: 1.15.4
scipy: 1.1.0
pyarrow: 0.11.1
xarray: None
IPython: 7.2.0
sphinx: None
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.2
openpyxl: None
xlrd: 1.2.0
xlwt: None
xlsxwriter: 1.1.8
lxml.etree: 4.3.0
bs4: 4.7.1
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: 0.2.0
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Cattes · 2019-07-23T09:50:59Z

I got the same issue with a dataset with multiple column and row indizes and would really appreciate a solution.

pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Linux
OS-release: 4.18.0-25-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.4
pytest: 4.5.0
pip: 19.0.3
setuptools: 41.0.0
Cython: None
numpy: 1.15.2
scipy: None
pyarrow: None
xarray: None
IPython: 7.5.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: 1.2.0
xlwt: None
xlsxwriter: 1.1.2
lxml: None
bs4: None
html5lib: None
sqlalchemy: 1.3.3
pymysql: 0.9.3
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

stefanopassador-bip · 2019-08-05T14:13:11Z

Same issue with pandas 0.25.0

pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.24.2
pytest: None
pip: 19.0.3
setuptools: 40.8.0
Cython: None
numpy: 1.16.4
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: 2.6.2
xlrd: 1.2.0
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: 0.2.3

jmcnamara · 2019-08-06T08:48:00Z

Rather than replying to a feature request issue that was implemented and closed 6 years ago it would probably be best to open a new issue with an example(s) that demonstrate the issue.

Ronkiro · 2019-08-06T13:22:17Z

For reference, this issue has been created and can be found at #27772 .

For any more comments, please post there.

jreback closed this as completed Nov 10, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

multiindex column in to_excel #2701

multiindex column in to_excel #2701

hayd commented Jan 15, 2013

joeb1415 commented Mar 22, 2013

jreback commented Mar 22, 2013

ghost commented Mar 22, 2013

jreback commented Sep 21, 2013

jtratner commented Sep 21, 2013

jmcnamara commented Nov 10, 2013

mappingvermont commented Apr 27, 2017

juandavid-bolanos commented Jul 18, 2017

ronanpaixao commented Oct 25, 2017

Tangjiahui26 commented Sep 14, 2018

etiennecaldo commented May 9, 2019 •

edited

Loading

Ronkiro commented Jun 6, 2019

etiennecaldo commented Jun 11, 2019

ferdiaoh commented Jul 23, 2019

Cattes commented Jul 23, 2019

stefanopassador-bip commented Aug 5, 2019

jmcnamara commented Aug 6, 2019

Ronkiro commented Aug 6, 2019

multiindex column in to_excel #2701

multiindex column in to_excel #2701

Comments

hayd commented Jan 15, 2013

joeb1415 commented Mar 22, 2013

jreback commented Mar 22, 2013

ghost commented Mar 22, 2013

jreback commented Sep 21, 2013

jtratner commented Sep 21, 2013

jmcnamara commented Nov 10, 2013

mappingvermont commented Apr 27, 2017

juandavid-bolanos commented Jul 18, 2017

ronanpaixao commented Oct 25, 2017

Tangjiahui26 commented Sep 14, 2018

etiennecaldo commented May 9, 2019 • edited Loading

Ronkiro commented Jun 6, 2019

etiennecaldo commented Jun 11, 2019

ferdiaoh commented Jul 23, 2019

Cattes commented Jul 23, 2019

stefanopassador-bip commented Aug 5, 2019

jmcnamara commented Aug 6, 2019

Ronkiro commented Aug 6, 2019

etiennecaldo commented May 9, 2019 •

edited

Loading