Unwanted spaces in Content #528
Comments
Hey. Just to be sure, can you try again with PDFParser v2.2.0?
Hey @k00ni, the same happens using version 2.2.0
@rubenvanerk maybe you have any thoughts?
Sorry, can't help you here.
@k00ni I know you may be slow responding, but please let us know if you have any ideas or suggestions we could try. I appreciate your help in advance.
Sorry, I can't help you here, I would have written you already. The only idea I have is to check the code part which uses the
The issue with spaces is not solved in the latest version (2.2.1) either. You need to apply the workaround:
2.12.2.0
Description:
Hey guys, thanks for the hard work to keep this great library up to date.
Unfortunately, we are having a strange issue parsing a file. We tried adjusting the config as @rubenvanerk suggested in a few issues (in fact, the FontSpaceLimit variable solves multiple issues with character separation).
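For readers unfamiliar with the setting, here is a minimal sketch of this kind of config adjustment. It is not the reporter's actual code: the value `-60` and the file path are assumptions chosen for illustration, and a suitable limit probably has to be found by experimenting with the PDF in question.

```php
<?php
// Illustrative sketch only; the reporter's real config is not shown in the thread.
require 'vendor/autoload.php';

use Smalot\PdfParser\Config;
use Smalot\PdfParser\Parser;

$config = new Config();
// Assumed example value: the font space limit influences when a gap
// between glyphs is turned into a space character in the extracted text.
$config->setFontSpaceLimit(-60);

// Pass the config as the second constructor argument.
$parser = new Parser([], $config);

// Assumed path; the file name is the one attached to this issue.
$pdf = $parser->parseFile('IL-Field-Guide-final-online.pdf');

echo $pdf->getText();
```

Trying a few different limits against the same page and comparing the extracted text may help show whether the stray spaces come from this heuristic or are already present in the PDF itself.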
Expected output & actual output
But we cannot get rid of some unwanted spaces, for example in:
And more importantly in the PDF section titles we have:
Additionally, this one is not the expected output either:
PDF input
IL-Field-Guide-final-online.pdf
This may not necessarily be an issue, but we suspect that for some reason the PDF has a space within the affected phrases/sentences.
In any case, we are just starting to use the library. We modified a few things in the vendor folder trying to fix the issue, but we are running out of ideas now.