BUG: to_csv extra header line with multiindex columns #6618
Comments
yes, it's for the row index names (they are None here); in theory we could skip printing it as the |
Ah, okay. |
@dsm054 I think it's reasonable to do a PR which takes out the line and see if anything breaks (obviously tests that are supposed to match the output exactly won't pass), but what I mean is that read_csv should still work correctly. And I guess it's more in line with what you'd expect. |
Hi, I have the same issue. Is there any workaround to avoid having this empty line there? |
pandas will read this format |
Yes, pandas will, but I need output without this extra line (it's an input for another application) |
you can use tupleize_cols=True to make the header in a single line |
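For the case raised above, where another application needs the file without the extra line, one option that does not depend on any pandas setting is to post-process the CSV text. The following is a minimal sketch, not part of pandas: `drop_blank_header_line` and its `n_header_rows` parameter are hypothetical names, and the sample text assumes the extra line consists only of empty fields directly after the header rows.

```python
import csv
import io


def drop_blank_header_line(csv_text, n_header_rows):
    """Remove an all-empty row that directly follows the header rows.

    Hypothetical helper: only the row at position n_header_rows is
    inspected, so all-empty rows further down in the data are kept.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if len(rows) > n_header_rows and all(cell == "" for cell in rows[n_header_rows]):
        del rows[n_header_rows]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Assumed shape of the output: two header rows, then the extra empty line.
sample = ",a,b\n,1,2\n,,\n0,1,3\n1,2,4\n"
cleaned = drop_blank_header_line(sample, n_header_rows=2)
print(cleaned)
```

Because the function only looks at the single row after the header, it is a no-op on files that have no extra line.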
closes #14515

This commit fixes a bug where `read_csv` failed when given a file with a multiindex header and empty content. Because pandas reads index names as a separate line following the header lines, the reader looks for the line with index names in it. If the content of the dataframe is empty, the reader will choke. This bug surfaced after #6618 stopped writing an extra line after multiindex columns, which led to a situation where pandas could write CSV's that it couldn't then read. This commit changes that behavior by explicitly checking if the index name row exists, and processing it correctly if it doesn't.

Author: Ben Kandel <ben.kandel@gmail.com>

Closes #14596 from bkandel/fix-parse-empty-df and squashes the following commits:

32e3b0a [Ben Kandel] lint
e6b1237 [Ben Kandel] lint
fedfff8 [Ben Kandel] fix multiindex column parsing
518982d [Ben Kandel] move to 0.19.2
fc23e5c [Ben Kandel] fix errant this_columns
3d9bbdd [Ben Kandel] whatsnew
68eadf3 [Ben Kandel] Modify test.
17e44dd [Ben Kandel] fix python parser too
72adaf2 [Ben Kandel] remove unnecessary test
bfe0423 [Ben Kandel] typo
2f64d57 [Ben Kandel] pep8
b8200e4 [Ben Kandel] BUG: read_csv with empty df
It appears this bug still manifests itself when using to_excel:
>>> df = pd.DataFrame([[1,2,3],[4,5,6]], columns=pd.MultiIndex.from_tuples([('A',''),('B','C'),('B','D')]))
>>> df
   A  B
      C  D
0  1  2  3
1  4  5  6
>>> df.to_excel("out.xlsx")
This outputs the spreadsheet with an additional (and pretty much useless) blank line (3):
This is some workaround, but is really worse for cases like my example, where the upper row in the MultiIndex gets repeated (see |
A workaround to fix this is to save the headers and table contents separately.
This trick can also be used when saving pandas style to Excel, because pandas style doesn't support multiindex. |
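The idea of saving the headers and the table contents separately can be sketched as follows, using CSV rather than Excel to keep the example short. This is an illustration under assumptions, not necessarily the code referred to in the thread: the helper name `to_csv_split_header` is hypothetical. Each column level is written by hand, and `to_csv(header=False)` then appends only the data rows.

```python
import csv
import io

import pandas as pd


def to_csv_split_header(df):
    """Write each column level as its own row, then the data without a header.

    Hypothetical helper illustrating the "headers and contents separately" idea.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    n_index = df.index.nlevels  # leading empty cells that line up with the index
    for level in range(df.columns.nlevels):
        writer.writerow([""] * n_index + list(df.columns.get_level_values(level)))
    df.to_csv(buf, header=False)  # data rows only, index included
    return buf.getvalue()


df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=pd.MultiIndex.from_tuples([("A", ""), ("B", "C"), ("B", "D")]),
)
text = to_csv_split_header(df)
print(text)
```

Writing the header by hand means index names are never emitted, so no extra row can appear; the trade-off is that `read_csv` can no longer recover index names from the file.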
The solution given above by tsznxx worked for me, however a line:
|
This seems strange to me, but I don't often use a MultiIndex so I might be missing something obvious.
Is there supposed to be that empty line at the end of the header? Compare