Conversion errors with tables #109

Closed
larsga opened this issue Feb 17, 2023 · 5 comments

@larsga
Collaborator

larsga commented Feb 17, 2023

With the changes in PR #108, Word starts complaining about tables and has to repair the converted documents. The error reporting looks like this:

[image: screenshot of Word's error report]

The user reports that "Interestingly, the new tables look better than the old because it seems my border control is being respected by the new JAR ... unless, perhaps, that is related to why Word seems to find the tables in error."

Very likely this is caused by the new code that traverses the contents of the table cell, or perhaps by differences in how garbage content added by POI is deleted.

@larsga larsga self-assigned this Feb 17, 2023
@drmacro
Owner

drmacro commented Feb 19, 2023

The issue, as far as I can tell, is that Word expects an empty paragraph following a nested table.

If I let Word repair the test doc, it adds an empty paragraph after the table within the cell.

Removing that paragraph corrupts the doc; restoring it makes the doc open normally.

@drmacro
Owner

drmacro commented Feb 19, 2023

Confirmed that if I create a nested table manually in Word it adds a trailing empty paragraph.

I don't see anything in the OOXML docs that suggests this is required, but it seems to be how Word works.
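
As an illustration of the observation above, here is a minimal sketch for locating cells that end in a nested table with no trailing paragraph. It is not the project's code: it uses Python's standard `xml.etree.ElementTree` on a hand-written fragment, and the function name `cells_ending_in_nested_table` and the `SAMPLE` markup are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's "{uri}" tag-prefix form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Hypothetical fragment: the first outer cell ends with a nested table,
# the second has the trailing empty paragraph that Word adds on repair.
SAMPLE = """
<w:tbl xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:tr>
    <w:tc>
      <w:tbl><w:tr><w:tc><w:p/></w:tc></w:tr></w:tbl>
    </w:tc>
    <w:tc>
      <w:tbl><w:tr><w:tc><w:p/></w:tc></w:tr></w:tbl>
      <w:p/>
    </w:tc>
  </w:tr>
</w:tbl>
"""


def cells_ending_in_nested_table(root):
    """Return the w:tc elements whose last child element is a w:tbl."""
    return [
        tc for tc in root.iter(W + "tc")
        if len(tc) > 0 and tc[-1].tag == W + "tbl"
    ]


root = ET.fromstring(SAMPLE)
offenders = cells_ending_in_nested_table(root)
print(len(offenders), "cell(s) end with a nested table")
```

Running a check like this against `word/document.xml` extracted from a converted file could help confirm whether a given document hits this case before opening it in Word.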

@drmacro
Owner

drmacro commented Feb 19, 2023

See #110

@larsga
Collaborator Author

larsga commented Feb 23, 2023

The cause of the problem (found by narrowing down the failing example) is empty table cells in the input.

With the old code these output:

<w:tc>
  <w:tcPr><w:tcW w:type="pct" w:w="2250"/></w:tcPr>
  <w:p/>
</w:tc>

With the new code they output:

<w:tc>
  <w:tcPr><w:tcW w:type="pct" w:w="2250"/></w:tcPr>
</w:tc>

Clearly Word doesn't accept the absence of a paragraph there.
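
One way to express the invariant implied here (every table cell ends with a paragraph) is sketched below. This is only an illustration using Python's standard `xml.etree.ElementTree`, not the fix that was committed to the project; the function name `ensure_cells_end_with_paragraph` is hypothetical, and the sample cell is the new-code output quoted above with a namespace declaration added so it parses on its own.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"
# Keep the conventional "w" prefix if the tree is serialized again.
ET.register_namespace("w", W_NS)


def ensure_cells_end_with_paragraph(root):
    """Append an empty w:p to every w:tc whose last child is not a w:p.

    Returns the number of cells that were changed.
    """
    fixed = 0
    # Snapshot the cells first so the tree is not modified while iterating.
    for tc in list(root.iter(W + "tc")):
        if len(tc) == 0 or tc[-1].tag != W + "p":
            ET.SubElement(tc, W + "p")
            fixed += 1
    return fixed


# The empty cell from the new-code output, made standalone for the example.
EMPTY_CELL = (
    '<w:tc xmlns:w="' + W_NS + '">'
    '<w:tcPr><w:tcW w:type="pct" w:w="2250"/></w:tcPr>'
    "</w:tc>"
)

cell = ET.fromstring(EMPTY_CELL)
count = ensure_cells_end_with_paragraph(cell)
print(count, [child.tag.split("}")[1] for child in cell])
```

The same rule also covers the nested-table case discussed earlier in the thread, since a cell whose last child is a `w:tbl` likewise gets a trailing `w:p`.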

larsga added a commit to larsga/wordinator that referenced this issue Feb 23, 2023
@drmacro
Owner

drmacro commented Jan 15, 2024

Fixed here: #137

@drmacro drmacro closed this as completed Jan 15, 2024
larsga added a commit to larsga/wordinator that referenced this issue Jul 4, 2024
larsga added a commit to larsga/wordinator that referenced this issue Jul 4, 2024
@ekimbernow ekimbernow reopened this Aug 4, 2024
@drmacro drmacro closed this as completed in 9f69540 Aug 4, 2024
drmacro pushed a commit that referenced this issue Aug 5, 2024
…s with <p>

Signed-off-by: eliot.kimber <eliot.kimber@servicenow.com>
drmacro pushed a commit that referenced this issue Aug 5, 2024
…s with <p>

Signed-off-by: eliot.kimber <eliot.kimber@servicenow.com>
drmacro pushed a commit that referenced this issue Aug 5, 2024
* Fixes #133, #105: Set compatibility mode setting to turn off compatibility mode.

Signed-off-by: eliot.kimber <eliot.kimber@servicenow.com>

* Fixes #109: Incorporate fix from Lars Marius to ensure table cell ends with <p>

Signed-off-by: eliot.kimber <eliot.kimber@servicenow.com>

* WIP: Added multi-section test cases from Lars Marius

Signed-off-by: eliot.kimber <eliot.kimber@servicenow.com>

* Fixes #117: Last section handling from Lars Marius

Signed-off-by: eliot.kimber <eliot.kimber@servicenow.com>

---------

Signed-off-by: eliot.kimber <eliot.kimber@servicenow.com>