Fix attempt for unicode bug #1137 #1772

remram44 · 2014-04-28T15:01:46Z

New pull request because Github just doesn't get rebases. Original: #1179 Issue: #1137

2013-08-31, remram44:

This should probably be made to work on Python 3 first... I see nothing in six.py to help, though.

~~Still need tweaking to work with Python 3.~~

This won't work on Python 3, and is probably not required there at all. It makes sure that all entries from complete_log are bytestrings when join()ing, before writing to the file. The issue (github pypa#1137) was that a single unicode object on the list would cause all the bytestrings to be converted to unicode, probably through the 'ascii' encoder (which would then fail on 8-bit characters).

remram44 · 2014-04-28T15:03:36Z

The way PIP is setup, the log file expects bytestrings on Python 2 (which is why I added to_utf8()) and unicode on Python 3. Make up your mind people.

pfmoore · 2014-04-28T15:08:10Z

There is no unicode builtin in Python 3, so this PR needs work to be applicable on Python 3 before it can be merged.

dstufft · 2014-05-03T11:44:37Z

Two things:

Can we have a better name than to_utf8, it only actually does that on Python 2.x, maybe to_native_str?
Can you write a test that exercises to_utf8 on it's own, and then another test that fails without this change to the Command object, but passes with the change?

remram44 · 2014-05-03T20:03:58Z

Surely you can change the function name yourself...

I filed this patch 9 months ago for something that I think is a very serious bug. The fact that it prevents the log from reporting anything useful afterwards make it all the more important. Merge it if you want, but I personally don't have time to update your tests (any more).

dstufft · 2014-05-03T20:42:19Z

Sure it's not a nice bug and it'd be good to have it fixed. There are a number of bugs like that and limited time, and while you may have filed the original patch 9 months ago, it didn't work on Python 3.x until 5 days ago. There is a long laundry list of things to do and this patch isn't merge ready as it is. A maintainer (myself included) may get around to it but we're all pretty busy and this isn't hardly the only bug.

remram44 · 2014-08-18T05:49:56Z

Tests without the patch (failure on Python 2, that's the point): https://travis-ci.org/pypa/pip/builds/32825248

remram44 · 2014-08-18T05:54:24Z

Two things:

There were 3 things ;-)

Can we have a better name than to_utf8, it only actually does that on Python 2.x, maybe to_native_str?

1bede5b renames to to_native_str_type()

Can you write a test that exercises to_utf8 on it's own,

added with 93d4d6b

and then another test that fails without this change to the Command object, but passes with the change?

added with d017938

Ivoz · 2014-08-22T00:52:14Z

Compatibility code like to_native_str_type() should really live in pip/compat.py. You can already see a native_str(s) function exists, although it could be detable whether it is complete or not.

I also remember this commit from @dholth attempting to fix somewhat the same issue, can you report it did not (or did not for all cases)?

remram44 · 2014-08-22T01:09:32Z

Looking at that patch, the basecommand.py part only affects Python 3 (sys.version_info[0] != 2) where the bug is unlikely to happen (because you shouldn't run into bytestring literals). At that revision, my acceptance test still fails without my fix.

Indeed that function native_str() looks awfully familiar (i.e. it is the same), I can probably rebase my fix to use it instead (and move my unittest to the right place).

But if I rebase, this is the last time. This is a one-year-old fix for a very serious bug that I am tired of dragging around every time you decide a name isn't right or the stars aren't aligned. Not that it matters though, since 1.6 probably won't happen this year... I'll probably do that over the week-end.

remram44 · 2014-09-02T01:53:01Z

Hmm, took a bit more than the week-end. Anyway, submitted #2012.

Ivoz · 2014-09-02T13:07:55Z

But if I rebase, this is the last time. This is a one-year-old fix for a very serious bug that I am tired of dragging around every time you decide a name isn't right or the stars aren't aligned. Not that it matters though, since 1.6 probably won't happen this year... I'll probably do that over the week-end.

Didn't get to saying it earlier @remram44 , but wanted to note that yes we are always appreciative of bugfixes! Unfortunately the encoding problems have been a very hard bugs to fix, having things mostly work with python 2 pip code, then more and more cases pop up with people having unicode somethings, and it breaking sporadically, but then also having to deal with making sure any fix works for both python 3 and python 2 (not easy task with encoding problems), and is of good quality, and hopefully has regression tests to make sure it actually fixes the existing failing use cases. There's been a number of attempts and not all of them up to scratch, but we want to make sure we try and get one in that actually does the job, along with the code quality standard that is demanded of all patches. That's essentially why we are very careful to merge in these fixes, even though we're appreciative of them coming from the community.

Makes to_utf8() compatible with Python 3

35327e5

remram44 changed the title ~~Fix attempt for unicode bug #1137 **PYTHON 2 ONLY** (rebase of #1179)~~ Fix attempt for unicode bug #1137 (rebase of #1179) Jun 18, 2014

remram44 changed the title ~~Fix attempt for unicode bug #1137 (rebase of #1179)~~ Fix attempt for unicode bug #1137 Jun 18, 2014

remram44 added 3 commits August 18, 2014 00:28

Renames to_utf8() to to_native_str_type()

1bede5b

Adds tests for to_native_str()

93d4d6b

Acceptance test for to_native_str() in BaseCommand

d017938

dstufft mentioned this pull request Sep 1, 2014

Switch from using a homegrown logging solution to the stdlib logging module #2008

Merged

10 tasks

remram44 mentioned this pull request Sep 2, 2014

Fix attempt for unicode bug #1137 (v3) #2012

Merged

remram44 closed this Sep 2, 2014

remram44 deleted the unicode-log branch January 14, 2015 22:11

lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 4, 2019

lock bot locked as resolved and limited conversation to collaborators Jun 4, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix attempt for unicode bug #1137 #1772

Fix attempt for unicode bug #1137 #1772

remram44 commented Apr 28, 2014

remram44 commented Apr 28, 2014

pfmoore commented Apr 28, 2014

dstufft commented May 3, 2014

remram44 commented May 3, 2014

dstufft commented May 3, 2014

remram44 commented Aug 18, 2014

remram44 commented Aug 18, 2014

Ivoz commented Aug 22, 2014

remram44 commented Aug 22, 2014

remram44 commented Sep 2, 2014

Ivoz commented Sep 2, 2014

Fix attempt for unicode bug #1137 #1772

Fix attempt for unicode bug #1137 #1772

Conversation

remram44 commented Apr 28, 2014

remram44 commented Apr 28, 2014

pfmoore commented Apr 28, 2014

dstufft commented May 3, 2014

remram44 commented May 3, 2014

dstufft commented May 3, 2014

remram44 commented Aug 18, 2014

remram44 commented Aug 18, 2014

Ivoz commented Aug 22, 2014

remram44 commented Aug 22, 2014

remram44 commented Sep 2, 2014

Ivoz commented Sep 2, 2014