Missing characters in extracted body #1360
Comments
I was about to provide a test case for the issue I saw but then I found
It is marked as an expected failure because the bug is known and we have not fixed it yet. I created this test after the bug report was filed in order to help us fix it; see #1359, which I mentioned above. More test cases are very much appreciated, so if you have any other failing combinations of encoding and charset, I am very interested.
Nope, that is exactly the combination that fails for me as well.
@lucc: Hey. I think you've been the only one to look into this issue, but I guess it's stalled a bit? We're holding off on packaging 0.8 for Debian due to this issue. We would probably cherry-pick a patch for this back to the 0.8 release if a fix is found.
The problem bisects back to commit 176cffc ("refactor alot.db.utils.remove_cte", 2018-12-04), which claims to refactor and make the code more lenient. In fact, it does one more thing: it changes the fallback for the case of decoding errors to In general, I'd suggest refactoring in one commit and changing behaviour in another one...
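To illustrate why a change in the decoding-error fallback can make characters vanish, here is a minimal standard-library sketch. It is not alot's code and does not claim to reproduce what commit 176cffc does; the sample string and the helper `decode_with` are made up for illustration. It decodes ISO-8859-1 bytes as UTF-8 under different error handlers.

```python
# Hypothetical illustration only; this is not taken from alot.
# Bytes that are valid ISO-8859-1 but not valid UTF-8.
RAW = "Grüße, Zoë".encode("latin-1")


def decode_with(handler: str) -> str:
    """Decode RAW as UTF-8 (the wrong charset) with the given error handler."""
    return RAW.decode("utf-8", errors=handler)


# "ignore" silently drops every undecodable byte, so characters go missing.
ignored = decode_with("ignore")

# "replace" keeps a visible U+FFFD marker where decoding failed.
replaced = decode_with("replace")

# "strict" (the default) raises instead of returning damaged text.
try:
    decode_with("strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

if __name__ == "__main__":
    print(repr(ignored))
    print(repr(replaced))
    print("strict raised:", strict_failed)
```

With a handler such as "ignore", the result contains no hint that anything was lost, which matches the symptom in the issue title. Whether that is the handler involved here is an assumption, not something this thread confirms.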
I guess this can be closed then?
Yes, it is fixed by PR #1375.
For some content transfer encodings and charsets, some characters are missing in the extracted body. An example can be seen in the test added in #1359. As that test demonstrates, the problem seems to be somewhere below alot.db.utils.extract_body(). We should try to find more relevant combinations of encoding and charset, add tests and ultimately fix it :)
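One way to hunt for more combinations could be sketched like this. The snippet uses only the standard library's email package and checks that a text body survives a round trip for each pair of charset and content transfer encoding. The sample text, the lists of charsets and encodings, and the function names are assumptions chosen for illustration; to probe the actual bug, the parsed message would have to go through alot.db.utils.extract_body() instead of get_content(), and that call is deliberately left out because its signature is not shown in this thread.

```python
import email
from email import policy
from email.message import EmailMessage
from itertools import product

# Sample text with characters outside ASCII (hypothetical test data).
TEXT = "Grüße, Zoë: 100 €"
# Charsets that can all represent TEXT, and common transfer encodings.
CHARSETS = ["utf-8", "iso-8859-15", "windows-1252"]
CTES = ["quoted-printable", "base64", "8bit"]


def roundtrip(charset: str, cte: str) -> str:
    """Build a message, serialise it to bytes, parse it again, return the body."""
    msg = EmailMessage()
    msg.set_content(TEXT, charset=charset, cte=cte)
    parsed = email.message_from_bytes(msg.as_bytes(), policy=policy.default)
    # The stdlib decoder stands in for the extraction step under test.
    return parsed.get_content()


def failing_combinations() -> list:
    """Return the (charset, cte) pairs whose decoded body differs from TEXT."""
    failures = []
    for charset, cte in product(CHARSETS, CTES):
        body = roundtrip(charset, cte)
        # set_content() appends a trailing newline, so ignore it when comparing.
        if body.rstrip("\n") != TEXT:
            failures.append((charset, cte))
    return failures


if __name__ == "__main__":
    print(failing_combinations())
```

Any pair reported by a variant of this loop that calls the real extraction code would be a candidate for a new test case.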