Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pip when using Py2 on Windows with non-ASCII hostname or username. #3970

Closed
wants to merge 8 commits into from

Conversation

pekkaklarck
Copy link
Contributor

Two related problems are fixed:

  • Previously non-ASCII characters in hostname blew up pip install
    completely. This is rather severe.
  • Non-ASCII characters in username crashed printing help text. Not
    so bad but definitely annoying.

Non-ASCII usernames are pretty common in non-English speaking
countries. That also makes non-ASCII hostnames pretty common, because
Windows creates hostname based on the username by default.

The reason for these failures was that user_cache_dir returned
Unicode on Windows and bytes elsewhere, and rest of the codebase was
expecting paths to be bytes.

It could be argued that pip should always use Unicode internally, but
that would require a lot more changes and fixing also some of the
vendored dependencies. We can also argue, like PEP-519 does, that
paths should not be considered to be strings at all, and thus the "all
Unicode internally" guideline wouldn't apply in this case.

Fixes #3463. See that issue for more discussion and details.

Two related problems are fixed:
- Previously non-ASCII characters in hostname blew up `pip install`
  completely. This is rather severe.
- Non-ASCII characters in username crashed printing help text. Not
  so bad but definitely annoying.

Non-ASCII usernames are pretty common in non-English speaking
countries. That also makes non-ASCII hostnames pretty common, because
Windows creates hostname based on the username by default.

The reason for these failures was that `user_cache_dir` returned
Unicode on Windows and bytes elsewhere, and rest of the codebase was
expecting paths to be bytes.

It could be argued that pip should always use Unicode internally, but
that would require a lot more changes and fixing also some of the
vendored dependencies. We can also argue, like PEP-519 does, that
paths should not be considered to be strings at all, and thus the "all
Unicode internally" guideline wouldn't apply in this case.

Fixes pypa#3463. See that issue for more discussion and details.
@xavfernandez
Copy link
Member

@pekkaklarck thanks for your patch.
You have a small pep8 error and could you add a changelog in CHANGES.txt ?
@pfmoore could you confirm that this fix works ?

@xavfernandez xavfernandez added this to the 8.2 milestone Sep 29, 2016
@pfmoore
Copy link
Member

pfmoore commented Sep 29, 2016

@xavfernandez I don't have a non-ASCII account set up to test this right now, nor do I have a py2.7 install to hand. I'll try to find some time to set up a VM to test it at some point. But I'd prefer it if we could add a test to the test suite to ensure this remains fixed. It could easily enough be done by mocking _get_win_folder. Make sure it's run by the appveyor tests (which only run unit tests not integration tests).

@pekkaklarck
Copy link
Contributor Author

@xavfernandez cool! Already fixed the pep8 issue with docs. Also updated CHANGES.txt but apparently forgot to update my clone first. Will fix the problem ASAP.

Do you prefer all changes to be squashes or do you use the newish GitHub functionality to do that when merging?

@pekkaklarck
Copy link
Contributor Author

@pfmoore: If you have a Windows machine available, you only need to temporally set its hostname to something like läppäri (laptop in Finnish).

I agree it would be good to add a unit test. If mocking turns out to be hard, would it be enough to add a test that user_cache_dir always returns bytes on Python 2? That test could be executed on all platforms.

@xavfernandez
Copy link
Member

Squashed or not, that's fine but @pfmoore is right: an unit test to prevent future regression would definitely close this issue.

@pekkaklarck
Copy link
Contributor Author

Getting too late here and heading to bed. Some questions:

  1. What should be done for this new failure on Travis:

    ./pip/utils/appdirs.py:41:37: F821 undefined name 'unicode'
    ERROR: InvocationError: '/home/travis/build/pypa/pip/.tox/py3pep8/bin/flake8 .'

    The line causing the error is if PY2 and isinstance(path, unicode) i.e. unicode there is never seen by Python 3.

  2. How do I run unit tests? The docs I found recommend tox -e py33, but that took ages, caused a lot of failures, and eventually crashed with this error:

    ERROR: InvocationError: '/home/peke/Devel/pip/.tox/py33/bin/py.test --timeout 300'

    The docs also mention running py.test directly, but both py.test unit or py.test unit/test_utils.py fail with these errors:

    ImportError: No module named scripttest
    ERROR: could not load /home/peke/Devel/pip/tests/conftest.py

    It would be awesome if someone knowing more about the test system would document it a bit more. Adding separate tests/README.rst, linked from the existing docs/development.rst, would make the docs easy to find for new contributors.

  3. Noticed _get_win_folder that causes the underlying problem by returning Unicode is called also by other functions in utils.appdirs. What should we do to them?

@xavfernandez
Copy link
Member

@pekkaklarck

@pekkaklarck
Copy link
Contributor Author

@xavfernandez

  1. Good to know about six.text_type. In my opinion PY2 and isinstance(path, unicode) makes it more explicit that we mean Unicode than PY2 and isinstance(path, text_type), but I'll change it if there's no other easy way to please flake8 in this case.

  2. I tested running unit tests both with -e33 and -e27 in the develop branch without any of my changes, but both failed like this:

    INTERNALERROR> ZeroDivisionError: integer division or modulo by zero

    ================== 1 pytest-warnings, 1 error in 1.21 seconds ==================
    ERROR: InvocationError: '/home/peke/Devel/pip/.tox/py27/bin/py.test --timeout 300 -m tests/unit'
    ___________________________________ summary ____________________________________
    ERROR: py27: commands failed

    It's possible that I'm missing some dependencies, but the error isn't very helpful for a new contributor. I can post the full traceback if you want.

  3. There were no changes to parser.print_help(). Not sure what should I test there.

@xavfernandez
Copy link
Member

@pekkaklarck
Copy link
Contributor Author

Sorry, didn't understand what you mean. Notice that I don't know appveyor at all and don't have much experience from tox either. I would expect tests to work on a fresh clone without extra configuration or possibly configuration to be explicitly documented somewhere. Notice also that so far I've tried running tests on my primary Linux machine.

Couldn't unfortunately run this locally due to unit test execution
failing totally. This test is pretty simple so hopefully it works
anyway.

Would like to add a separate test to make sure return type is str also
on Windows when _get_win_folder returns Unicode. Adding a test with
mocking and skipping would be easier if I could run it locally too.
@pfmoore
Copy link
Member

pfmoore commented Sep 30, 2016

@pekkaklarck What @xavfernandez was getting at is that to run the tests on Windows, you need tox installed and then you can do tox -- -m unit -n 8

@pekkaklarck
Copy link
Contributor Author

@pfmoore: I see. I haven't yet even tried running tests on Windows because the whole execution fails on my primary Linux development machine for a pretty uninformative ZeroDivisionError. On that machine I have tox and py.test installed.

@pekkaklarck
Copy link
Contributor Author

All the checks are now passing, there's a unit test for the fix and CHANGES.txt has been updated. Is there something else you'd like me to do or is this good to be merged?


def test_return_path_as_str(self):
"""Test that path is bytes on Python 2 and Unicode on Python 3."""
assert isinstance(user_cache_dir('test'), str)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would go with something like:

@pytest.mark.skipif("sys.platform != 'win32'")
@patch('pip.utils.appdirs._get_win_folder', _get_win_folder)
def test_your_test(self, _get_win_folder):
    _get_win_folder.return_value = u"läppäri"
    assert isinstance(user_cache_dir('test'), str)
    # Test against regression #3463
    from pip import create_main_parser
    create_main_parser().print_help()  # This should not crash

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking something similar but didn't want to add so complex test when I cannot run tests locally.

@pekkaklarck
Copy link
Contributor Author

Are all tests run automatically on Windows and can those machines be configured? If yes, changing user and host names to non-ASCII might be the best way to test this.

@pfmoore
Copy link
Member

pfmoore commented Oct 3, 2016

I'd be a strong -1 on changing machine config in tests (even if it's possible) as that would impact people running tests on their development machine.

As I noted, the pip unit tests are run automatically on appveyor. I'd add a unit test like @xavfernandez suggested, and if you can't run it locally, add it to the PR and watch the appveyor run to confirm it worked. (Probably start by adding the test but disabling the fix, to ensure that the test fails as expected on the current code - standard test-driven sort of approach). That should be a perfectly reasonable way for someone without access to Windows to test a Windows-specific change like this.

@pekkaklarck
Copy link
Contributor Author

pekkaklarck commented Oct 5, 2016

I tried running unit tests on Windows. I installed tox and executed tox -- -m unit -n 8 as @pfmoore suggested, but end results were pretty much as bad as on my Linux machine. You can see the output I got in the attached windows-output.txt file.

Could you @xavfernandez please add the test you proposed yourself? It is a bit complicated and committing it without being able to run tests locally wouldn't be nice. I could just check the CI results like @pfmoore suggested, but that would likely require several iterations and took lot of time waiting for results.

Notice that the test I already added verifies the actual fix. It ought to thus be safe to merge this PR also without new tests for print_help.

@xavfernandez
Copy link
Member

Well, I'm not using windows so I will end up doing what @pfmoore suggested and test it on appveyor.
If you can't find the time, I'll try to work on that when I find the time (a priori not before october 15th)

@pfmoore
Copy link
Member

pfmoore commented Oct 5, 2016

Alternatively, if I get some time I'll try to look at it. At least I have the option of running the tests locally. I do have very little time right now, though, so no promises, sorry.

@pekkaklarck
Copy link
Contributor Author

It would be awesome if either of you could write the additional test. If it turns out you don't have time before 8.2 release, I can take a look at myself. Without much experience from py.test and Mock, and not being able to run tests locally, it would be complicated but doable anyway.

pfmoore added a commit to pfmoore/pip that referenced this pull request Oct 5, 2016
pfmoore added a commit to pfmoore/pip that referenced this pull request Oct 5, 2016
@pfmoore
Copy link
Member

pfmoore commented Oct 5, 2016

I've just pushed #4000 which includes a test as suggested by @xavfernandez and the changes from this PR. The test failed on 2.7 when run with master on my PC, and then succeeded when run with this PR. (It succeeded with or without this PR on 3.5).

@pekkaklarck could you take a look at my PR and confirm it looks OK to you? I've manually copied your code changes and I'd appreciate it if you could confirm you're happy with it. I don't want to misrepresent your work.

@pekkaklarck
Copy link
Contributor Author

Changes in #4000 look good to me. Only issue I saw was that CHANGES.txt still refers to this PR, not #4000.

Thanks @pfmoore for applying the fix and @xavfernandez for help! Already looking forwards for pip 8.2 and being able to tell users encountering this problem that python -m pip install --no-cache-dir --upgrade pip will fix it. I assume that the fix will be included in Python 2.7.13 whenever that is released too.

pfmoore added a commit that referenced this pull request Oct 6, 2016
Fix pip when using Py2 on Windows with non-ASCII hostname or username (originally #3970 from @pekkaklarck)
@pfmoore
Copy link
Member

pfmoore commented Oct 6, 2016

Closed via #4000 - thanks @pekkaklarck for the fix!

@pfmoore pfmoore closed this Oct 6, 2016
@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 3, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jun 3, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
auto-locked Outdated issues that have been locked by automation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Problems on Windows with username or hostname containing non-ASCII characters
3 participants