[text selection] Add the whitespaces present in the pdf in the text chunk #14703
/botio test
From: Bot.io (Linux m4): Received. Command cmd_test from @calixteman received. Current queue size: 0. Live output at: http://54.241.84.105:8877/09da1adb41bcc23/output.txt
From: Bot.io (Windows): Received. Command cmd_test from @calixteman received. Current queue size: 0. Live output at: http://54.193.163.58:8877/fcc0c624e3999e4/output.txt
From: Bot.io (Linux m4): Failed. Full output at http://54.241.84.105:8877/09da1adb41bcc23/output.txt Total script time: 23.68 mins
Image differences available at: http://54.241.84.105:8877/09da1adb41bcc23/reftest-analyzer.html#web=eq.log
From: Bot.io (Windows): Failed. Full output at http://54.193.163.58:8877/fcc0c624e3999e4/output.txt Total script time: 26.57 mins
Image differences available at: http://54.193.163.58:8877/fcc0c624e3999e4/reftest-analyzer.html#web=eq.log
There seems to be a regression in …
/botio test
From: Bot.io (Windows): Received. Command cmd_test from @calixteman received. Current queue size: 0. Live output at: http://54.193.163.58:8877/68545675500c364/output.txt
From: Bot.io (Linux m4): Received. Command cmd_test from @calixteman received. Current queue size: 0. Live output at: http://54.241.84.105:8877/b10f5755e6f1605/output.txt
From: Bot.io (Windows): Failed. Full output at http://54.193.163.58:8877/68545675500c364/output.txt Total script time: 0.72 mins
From: Bot.io (Linux m4): Failed. Full output at http://54.241.84.105:8877/b10f5755e6f1605/output.txt Total script time: 23.67 mins
Image differences available at: http://54.241.84.105:8877/b10f5755e6f1605/reftest-analyzer.html#web=eq.log
/botio test
From: Bot.io (Linux m4): Received. Command cmd_test from @calixteman received. Current queue size: 0. Live output at: http://54.241.84.105:8877/d9e6d4c86dda2b0/output.txt
From: Bot.io (Windows): Received. Command cmd_test from @calixteman received. Current queue size: 0. Live output at: http://54.193.163.58:8877/1b1326bcb9ad933/output.txt
From: Bot.io (Linux m4): Failed. Full output at http://54.241.84.105:8877/d9e6d4c86dda2b0/output.txt Total script time: 23.58 mins
Image differences available at: http://54.241.84.105:8877/d9e6d4c86dda2b0/reftest-analyzer.html#web=eq.log
From: Bot.io (Windows): Failed. Full output at http://54.193.163.58:8877/1b1326bcb9ad933/output.txt Total script time: 26.27 mins
Image differences available at: http://54.193.163.58:8877/1b1326bcb9ad933/reftest-analyzer.html#web=eq.log
/botio test
From: Bot.io (Windows): Received. Command cmd_test from @calixteman received. Current queue size: 0. Live output at: http://54.193.163.58:8877/199de5196ccc849/output.txt
From: Bot.io (Linux m4): Received. Command cmd_test from @calixteman received. Current queue size: 0. Live output at: http://54.241.84.105:8877/eb325a7abec3863/output.txt
From: Bot.io (Linux m4): Failed. Full output at http://54.241.84.105:8877/eb325a7abec3863/output.txt Total script time: 23.50 mins
Image differences available at: http://54.241.84.105:8877/eb325a7abec3863/reftest-analyzer.html#web=eq.log
From: Bot.io (Windows): Failed. Full output at http://54.193.163.58:8877/199de5196ccc849/output.txt Total script time: 27.02 mins
Image differences available at: http://54.193.163.58:8877/199de5196ccc849/reftest-analyzer.html#web=eq.log
r=me, with the final comment addressed.
…hunk
- it aims to fix issue mozilla#14627;
- the basic idea of the recent text refactoring was to consider only the rendered, visible whitespaces. But the heuristics are sometimes wrong: some whitespaces present in the text stream were left out of the text chunks because they were too small. Hence we added some exceptions; for example, we always add a whitespace when it sits between two non-whitespace chars, but only when they are in the same Tj. This patch removes the constraint that the chars be in the same Tj (by using a circular buffer to save the last two chars), but doesn't add a space when the visible space is really too small (hence `NOT_A_SPACE_FACTOR`).
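To make the idea in the commit message concrete, here is a minimal sketch of a two-slot circular buffer that remembers the last two chars across Tj boundaries, combined with a width threshold. It is not the code from the patch: `LastTwoChars`, `extractText`, the `{ char, width, fontSize }` glyph shape, and the numeric value given to `NOT_A_SPACE_FACTOR` are all assumptions made for illustration.

```javascript
// Assumed value, for illustration only; the patch's actual constant is not quoted here.
const NOT_A_SPACE_FACTOR = 0.03;

// Hypothetical two-slot circular buffer holding the last two chars seen,
// regardless of which Tj operator they came from.
class LastTwoChars {
  constructor() {
    this.slots = [" ", " "];
    this.pos = 0; // index of the slot that will be overwritten next
  }
  push(ch) {
    this.slots[this.pos] = ch;
    this.pos = (this.pos + 1) % 2;
  }
  // Most recently pushed char.
  get last() {
    return this.slots[(this.pos + 1) % 2];
  }
  // Char pushed just before the most recent one.
  get beforeLast() {
    return this.slots[this.pos];
  }
}

// `runs` is an array of glyph arrays; each inner array stands for one Tj.
function extractText(runs) {
  const buffer = new LastTwoChars();
  let out = "";
  for (const run of runs) {
    for (const { char, width, fontSize } of run) {
      if (char === " ") {
        // Skip spaces that are really too small, and collapse repeated spaces.
        const bigEnough = width >= NOT_A_SPACE_FACTOR * fontSize;
        if (bigEnough && buffer.last !== " ") {
          buffer.push(" ");
        }
        continue;
      }
      // Emit the pending space only if it sits between two non-whitespace chars.
      if (buffer.last === " " && buffer.beforeLast !== " ") {
        out += " ";
      }
      buffer.push(char);
      out += char;
    }
  }
  return out;
}

// The space lives in its own Tj, yet it still ends up in the output.
const runs = [
  [{ char: "a", width: 5, fontSize: 10 }, { char: "b", width: 5, fontSize: 10 }],
  [{ char: " ", width: 1, fontSize: 10 }],
  [{ char: "c", width: 5, fontSize: 10 }],
];
console.log(extractText(runs)); // "ab c"
```

Because the buffer lives outside the per-run loop, the check no longer depends on both neighbours belonging to the same Tj; a space narrower than `NOT_A_SPACE_FACTOR * fontSize` never enters the buffer, so it is never emitted.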
/botio test
From: Bot.io (Linux m4): Received. Command cmd_test from @Snuffleupagus received. Current queue size: 0. Live output at: http://54.241.84.105:8877/4930c9191c54fdb/output.txt
From: Bot.io (Windows): Received. Command cmd_test from @Snuffleupagus received. Current queue size: 0. Live output at: http://54.193.163.58:8877/430054015e0bfca/output.txt
From: Bot.io (Linux m4): Failed. Full output at http://54.241.84.105:8877/4930c9191c54fdb/output.txt Total script time: 23.47 mins
Image differences available at: http://54.241.84.105:8877/4930c9191c54fdb/reftest-analyzer.html#web=eq.log
From: Bot.io (Windows): Failed. Full output at http://54.193.163.58:8877/430054015e0bfca/output.txt Total script time: 26.74 mins
Image differences available at: http://54.193.163.58:8877/430054015e0bfca/reftest-analyzer.html#web=eq.log
/botio makeref
From: Bot.io (Linux m4): Received. Command cmd_makeref from @Snuffleupagus received. Current queue size: 0. Live output at: http://54.241.84.105:8877/46f6159d0affc6f/output.txt
From: Bot.io (Windows): Received. Command cmd_makeref from @Snuffleupagus received. Current queue size: 1. Live output at: http://54.193.163.58:8877/ae0e782307050b0/output.txt
From: Bot.io (Linux m4): Success. Full output at http://54.241.84.105:8877/46f6159d0affc6f/output.txt Total script time: 20.62 mins
From: Bot.io (Windows): Success. Full output at http://54.193.163.58:8877/ae0e782307050b0/output.txt Total script time: 20.98 mins