Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Writer gives broken Excel File #63

Closed
pgundlach opened this issue Nov 25, 2014 · 40 comments
Closed

Writer gives broken Excel File #63

pgundlach opened this issue Nov 25, 2014 · 40 comments

Comments

@pgundlach
Copy link

Using the following file with version d6607c5 I get a broken Excel file. How can I write multiple rows?

package main

import (
    "fmt"
    "github.com/tealeg/xlsx"
)

func main() {
    var file *xlsx.File
    var sheet *xlsx.Sheet
    var row *xlsx.Row
    var cell *xlsx.Cell
    var err error

    file = xlsx.NewFile()
    sheet = file.AddSheet("Sheet1")
    row = sheet.AddRow()
    cell = row.AddCell()
    cell.Value = "I am a cell!"

    row = sheet.AddRow()
    cell = row.AddCell()
    cell.Value = "I am a cell!"

    err = file.Save("MyXLSXFile.xlsx")

    if err != nil {
        fmt.Printf(err.Error())
    }
}
@tealeg
Copy link
Owner

tealeg commented Dec 1, 2014

Hi, sorry for the delayed response.

I can't reproduce this error locally. Could you please take the following steps:

  1. Update to the latest version of the code from github and confirm that you can reproduce the problem.
  2. If you can reproduce the problem let me know here and also indicate what operating system and spreadsheet you are using to open the file, and provide any additional information (such as error messages) that you have.

@pgundlach
Copy link
Author

I have tried the code above with rev c604be9 with the same result.

The software is Excel 2011 for Mac. Double clicking on the file: Excel asks me to repair the document. After repairing the document, I can see the two cells as expected. LibreOffice does not complain and opens the document without problem.

This is on Mac (newest version / Yosemite) and Excel 2011 (14.4.6, latest updates)

@pgundlach
Copy link
Author

The result is at:
http://download.speedata.de/private/MyXLSXFile.xlsx (for now)

@tealeg
Copy link
Owner

tealeg commented Dec 2, 2014

@pgundlach that file opens correctly in LibreOffice. Can you tell me which program you're using to open it and what message it gives you? I have neither Windows nor Mac OS available to me for testing.

@pgundlach
Copy link
Author

Did you perhaps miss #63 (comment) ?

This is on Mac (newest version / Yosemite) and Excel 2011 (14.4.6, latest updates)

I only have the German localized version.
bildschirmfoto 2014-12-02 um 12 09 09

Which means 'Excel could not open myfile.xlsx, because some contents is not readable. Would you like to open and repair the document?'

@tealeg
Copy link
Owner

tealeg commented Dec 2, 2014

@pgundlach oh sorry, yes I didn't see the previous comment!

As it happens I can read German - I'm in Hannover! ;-)

@tealeg
Copy link
Owner

tealeg commented Dec 2, 2014

@pgundlach OK, I'll compare the file you linked above to the output from your code and see what has changed!

@tealeg
Copy link
Owner

tealeg commented Dec 2, 2014

@pgundlach the file you linked seems to be identical to the output from the program - could you save the repaired version and let me have it?

@pgundlach
Copy link
Author

@tealeg Here it is: http://download.speedata.de/private/repaired.xlsx (will be taken offline in a month or so)

@tealeg
Copy link
Owner

tealeg commented Dec 2, 2014

@pgundlach great, I've got it, thanks!

@tealeg
Copy link
Owner

tealeg commented Dec 2, 2014

@pgundlach - ok, it seems the difference is a single file "theme.xml" which doesn't exist at all in the original version of the document. Very useful information.

@tealeg
Copy link
Owner

tealeg commented Dec 2, 2014

@pgundlach OK, can you try again using the current version of master - the output should now match your repaired version with regards to the theme1.xml file. There's still no style information here (which could indeed be a problem), but that will start to get resolved soon.

@pgundlach
Copy link
Author

@tealeg I've updated to rev 7260ea8 - but I get the same error. I am sorry I can't help you with more details.

@tealeg
Copy link
Owner

tealeg commented Dec 2, 2014

@pgundlach thanks. What you're doing is very helpful and I'm grateful for your patience.

I've just landed the style output, which is the largest chunk of what differs between xlsx output and that from, for example LibreOffice. I've tested the output in LibreOffice, so it might help with your situation. If you get a chance to test again with what's now on the master branch I'd be interested in the results.

@yjzhg
Copy link

yjzhg commented Dec 5, 2014

i have the same question....

@pgundlach
Copy link
Author

Using 339e2d2 I still get a broken Excel file (same error as above)

@tealeg
Copy link
Owner

tealeg commented Dec 5, 2014

@pgundlach OK. I'll keep working to make the output match that of known good files.

@tealeg
Copy link
Owner

tealeg commented Dec 5, 2014

@yjzhg if you could provide versions of the file your using before and after it has been repaired that would be helpful.

@nadoo
Copy link

nadoo commented Dec 8, 2014

Excel 2013 on windows 8.1, when opening the file , the error is: can not load /xl/styles.xml (xml syntax error);

@tealeg
Copy link
Owner

tealeg commented Dec 8, 2014

@nadoo - that's very helpful, thank you!

@sangeethay
Copy link

@tealeg can you strip the whitespace outside of <t> tags? Editing the xl/sharedStrings.xml file manually and removing the newlines and indentation seems to do the right thing

@tealeg
Copy link
Owner

tealeg commented Dec 10, 2014

@sangeethay that's extremely interesting. The formatting is automatically generated by xml.MarshallIndent - I do that to make it easier to read the resultant XML. In principal it shouldn't cause problems, but I'd be happy to find out. I'll let you know when I've made the change.

@tealeg
Copy link
Owner

tealeg commented Dec 10, 2014

@sangeethay OK, if you pull from master now you should get a version of the code that doesn't add indentation or other unecessary whitespace to the file.

@pgundlach
Copy link
Author

Using a44cb51 - I still get the same problem.

@zachyf
Copy link

zachyf commented Dec 19, 2014

Hey Geoff, thanks for the great work! The reader's been super helpful. I'm running into the same corruption issue when I write files. I've tried to open them on both mac and windows versions of excel. Do you have any further insight into what's going on? I'd be happy to try and help.

@tealeg
Copy link
Owner

tealeg commented Dec 19, 2014

@zachyf Thanks for the kind words.

As far as I can tell there's a number of small issues. The plan, as of now, is to work towards matching known good output (and reading the standard so that I can be sure I fully understand what is being expressed). It could be quite sometime until this functionality is working perfectly - particularly as I seem to need more control over the generated XML than encoding/xml's Marshal gives me.

I think I've mentioned it before, but I don't actually have Excel here so I'm unable to see the problems myself. LibreOffice parses the output without error.

@zachyf
Copy link

zachyf commented Dec 19, 2014

Perfect, thanks for the information. I'll compare the program's output against Excel output over the next few days. l'll keep you posted if I can find discrepancies that explain the apparent corruption.

@nadoo
Copy link

nadoo commented Dec 20, 2014

have a try with https://github.com/nadoo/xlsx , in my condition, the output file is ok to be opened by excel 2013.

@tealeg
Copy link
Owner

tealeg commented Dec 20, 2014

@nadoo ok, that looks interesting! By removing the cellStyleXf and cellXf you're presumably getting an output that doesn't have any styling at all applied to the cells? The other things you've disabled seem like easy targets for removal.

I've been pondering doing something similar on the mainline and making style support in output a separate branch. At least so we can have working, but minimal output.

@nadoo
Copy link

nadoo commented Dec 31, 2014

@tealeg yes, I think make a data output only version is a good idea, sometimes we just need to send some data to another, styles are not necessary.

@kardianos
Copy link

@tealeg You can use MS One Drive (Free to use, can use from Chrome on Linux) to test XLSX output.

@tealeg
Copy link
Owner

tealeg commented Jan 15, 2015

Thanks @kardianos, I actually started doing just that over Christmas! It's still not perfect, in that it doesn't tell me anything about what is wrong with the files, but at least it gives me some indication.

@tealeg tealeg mentioned this issue Jan 20, 2015
@BrianMMcClain
Copy link

FWIW, seeing the same on:

OS X 10.10.2
Microsoft Excel for Max 2011 v14.4.7 (141117)

And seeing the following logs after a repair

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<recoveryLog xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"><logFileName>Repair Result to test 06308.xml</logFileName><summary>Errors were detected in file 'Macintosh HD:Users:brianmcclain:Documents:test.xlsx'</summary><removedParts summary="Following is a list of removed parts:"><removedPart>Replaced Part: /xl/worksheets/sheet1.xml with XML error.  Load error. Line 2, column 698.</removedPart></removedParts><removedRecords summary="Following is a list of removed records:"><removedRecord>Removed Records: Document Theme from /xl/workbook.xml (Workbook)</removedRecord></removedRecords><repairedRecords summary="Following is a list of repairs:"><repairedRecord>Repaired Records: Format from /xl/styles.xml (Styles)</repairedRecord></repairedRecords></recoveryLog>

And the result is an empty file. I took a look at @nadoo's changes, however while the data was showing, unfortunately in my specific use case, styling is a requirement. I'd be happy to provide any additional information if you'd like!

@lattwood
Copy link

I'm getting the same issue as @BrianMMcClain

@yupengfei
Copy link

I got the same error again,but what I have is a Chinese version. What can I do to help?

@ShawnMilo
Copy link
Collaborator

@yupengfei The Chinese version of what, OSX?

I confirmed that, using the file generated by the original post in this issue, I get an error:

"MyXLSXFile.xlsx" can't be opened.

No useful errors, though.

@tealeg
Copy link
Owner

tealeg commented May 7, 2015

I believe that on Mac OS X it's possible to see more information about failures in log files that Excel writes. Presumably in ~/Library/Logs/ someplace, but I don't recall all the details.

@yupengfei
Copy link

@ShawnMilo Sorry to reply late, we use Chinese version of MS office. I have tried OSX, it can not open generated xlsx files. But MS Office can open it normally after repair it. MS Office privide some error message in Chinese.

@ShawnMilo
Copy link
Collaborator

@tealeg:

Thanks for the tip. I couldn't get any debug info from Numbers, but I did a lot of playing with two spreadsheets, (one working and one non-working) by copying bits from one to the other. I did find a solution.

In workbook.xml, in the workbook tag, there is this: xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main". It turns out that adding this into the workbook tag as well fixes the file:

xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"

I'm having trouble finding where to add this to the code (grep brings back too many selections). I can narrow it down, but I wanted you to have a look in case this is a bad idea for some reason.

@tealeg
Copy link
Owner

tealeg commented May 11, 2015

@ShawnMilo I think the issue is that the normal XML marshalling in Go doesn't allow for additional namespaces to be defined in the root element of the document. The default namespace is set up by the tags on the XMLName field of the structs we marshal to/from.

Because of this restriction, I believe the code currently has cases where we set the namespaces on the elements in the document, that belong to namespaces other than the default, directly. That should be valid XML, but it may not be good enough for Excel. There does seem to be some work going on to address this that may turn up in Go 1.5. Personally I'd like us to stick to using Go's XML marshalling, but if needs be I wouldn't be opposed to temporarily pulling in development versions of the library if it really is useful. Obviously with a view that these would be removed once the functionality is generally available.

ShawnMilo added a commit that referenced this issue May 11, 2015
Uses a string replacement to bypass current limitation on
encoding/xml.
ShawnMilo added a commit that referenced this issue May 11, 2015
Uses a string replacement to bypass current limitation on
encoding/xml.
ShawnMilo added a commit that referenced this issue May 11, 2015
tealeg pushed a commit that referenced this issue Apr 1, 2016
Uses a string replacement to bypass current limitation on
encoding/xml.
tealeg pushed a commit that referenced this issue Apr 1, 2016
tealeg pushed a commit that referenced this issue Jan 14, 2020
Uses a string replacement to bypass current limitation on
encoding/xml.
tealeg pushed a commit that referenced this issue Jan 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests