Docling does not recognize that a word ending in a - at the end of a line is often the first half of a word that continues on the next line, or is at least strongly connected to the word on the next line.
Since my professor did this a lot in the script I am currently working through, I wrote a script for myself that checks all the text ending in -, but I think this is worth looking into.
Here is an example:
Docling correctly recognizes the first part of it as the caption (I exported to JSON to get an idea of what is going on):
But the second part is read as normal text, not as part of the caption, and is not connected to the first part.
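For illustration, here is a minimal sketch of the kind of check described above. It is not Docling API and not the exact script I use: `merge_hyphenated` is a hypothetical helper, and it assumes the exported text items have already been pulled out of the JSON into a plain list of strings in reading order.

```python
def merge_hyphenated(blocks):
    """Join any text block that ends in '-' with the block that follows it.

    Assumption: `blocks` is a list of strings in reading order.
    """
    merged = []
    for text in blocks:
        text = text.strip()
        if merged and merged[-1].endswith("-"):
            # Drop the trailing hyphen and glue the continuation on directly.
            merged[-1] = merged[-1][:-1] + text
        else:
            merged.append(text)
    return merged


blocks = ["Figure 1: An exam-", "ple caption.", "Normal text."]
print(merge_hyphenated(blocks))
```

Note that this always drops the hyphen, so a genuine compound that happens to break at its hyphen (for example "well-" followed by "known") would be joined incorrectly. Telling the two cases apart would need a dictionary lookup or similar.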