Adjust hex/octal string decoding #627

Merged (1 commit) on Aug 7, 2023

GreyWyvern (Contributor)

Add a second check to confirm that a string is hexadecimal before applying the `pack()` function. This avoids the `illegal hex digit` warning and resolves #499.
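The idea of validating before decoding can be sketched as follows. This is a minimal illustration in Python, not the library's PHP code: the function name `decode_hex_string` is hypothetical, `bytes.fromhex` stands in for `pack()`, and returning the input unchanged on failure is an assumption about the desired fallback.

```python
import re

# One or more hex digits and nothing else.
HEX_ONLY = re.compile(r"[0-9A-Fa-f]+")


def decode_hex_string(text: str) -> str:
    """Decode a PDF-style hex string, leaving non-hex input untouched.

    Hypothetical helper for illustration only.
    """
    # Strip the optional <...> delimiters and any embedded whitespace.
    body = re.sub(r"\s+", "", text.strip().strip("<>"))

    # The extra check: every remaining character must be a hex digit.
    # Without it, the decoder below would raise on input such as "<xyz>".
    if not HEX_ONLY.fullmatch(body):
        return text

    # The PDF reference treats a missing final digit as 0.
    if len(body) % 2:
        body += "0"

    # Map each byte to the code point of the same value.
    return bytes.fromhex(body).decode("latin-1")
```

With this guard, `decode_hex_string("<48656C6C6F>")` yields `Hello`, while a string that only looks like a hex string, such as `<xyz>`, is passed through as-is instead of triggering a decoding failure.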

PdfParser currently decodes only three-digit escaped octal codes, although one-, two-, and three-digit codes are all allowed. See PDF Reference 1.7 Section 3.2 Objects (page 55): https://ia801001.us.archive.org/1/items/pdf1.7/pdf_reference_1-7.pdf

Modify the regexp to match escaped octal codes of one to three digits and to exclude escaped backslashes. In the stretches of text that are not escaped octal codes, un-escape backslashes and parentheses as described in Table 3.2 of PDF Reference 1.7, Section 3.2. This resolves #470.
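A rough sketch of this decoding rule, again in Python rather than the library's PHP. It is a single-pass variant that handles octal codes, backslashes, and parentheses in one regexp, which is an assumption and may differ from how the PR structures the work; the name `decode_escapes` is hypothetical.

```python
import re

# After a backslash, accept (in this order) another backslash, a
# parenthesis, or an octal code of one to three digits. Trying the
# escaped backslash first keeps "\\1" from being read as octal \1.
ESCAPE = re.compile(r"\\(\\|[()]|[0-7]{1,3})")


def decode_escapes(text: str) -> str:
    """Decode octal, backslash, and parenthesis escapes in a PDF literal string.

    Hypothetical helper for illustration only; other escapes such as
    \\n or \\t are left untouched.
    """

    def replace(match: re.Match) -> str:
        token = match.group(1)
        if token in ("\\", "(", ")"):
            # Escaped backslash or parenthesis: drop the leading backslash.
            return token
        # Octal code: the PDF reference says high-order overflow is ignored.
        return chr(int(token, 8) & 0xFF)

    return ESCAPE.sub(replace, text)
```

For example, `\53` and `\053` both decode to `+`, and the input `AB \\199` (an escaped backslash followed by the digits 199) decodes to `AB \199` rather than being treated as the octal code `\1` followed by `99`.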

Adjust the unit test `testDecodeOctal()` to escape the valid octal code `\\1`, so that the output still matches the existing expected value `AB \199`.

Successfully merging this pull request may close these issues.

Unable to read Type H: illegal hex digit from the Adobe XML Form Module
Encoding issues with backslashes