Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pip's machinery for creating an isolated build environment so that user installed packages do not override the stdlib #8213

Closed

Conversation

abadger
Copy link
Contributor

@abadger abadger commented May 9, 2020

Because pip ships with an __main__.py file, one valid way to invoke it
is: python /usr/lib/python3.6/site-packages/pip/. This results in
__package__ being the empty string This, in turn, triggers the code to
add the pip wheel to sys.path. Unfortunately, while this is fine to do
when it's restricted to a wheel just containing pip, it isn't okay to do
when it is site-packages as that will mean that other things in
site-packages could end up overriding things in the stdlib
(coughenum34cough).

Check file extension to limit when we end up adding the extra path to
sys.path.

Fixed #8214

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

Working on adding a news file now... I just need to restore my pip to vanilla so that I can get tracebacks and such to post the issue to get the issue # for the news file

@uranusjr
Copy link
Member

uranusjr commented May 9, 2020

I don’t understand. How would imports work if /usr/lib/python3.6/site-packages is not added to sys.path? Wouldn’t it import another pip in the environment? Why not just run that pip instead?

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

@uranusjr The important thing here is the order. When python is invoked, sys.path normally includes site-packages but it includes it after the stdlib:

[pts/16@peru /srv/git/python-pip/pip/news]$ python3.6 -c 'import sys; print(sys.path)'
['', '/usr/lib64/python36.zip', '/usr/lib64/python3.6', '/usr/lib64/python3.6/lib-dynload', '/home/badger/.local/lib/python3.6/site-packages', '/usr/lib64/python3.6/site-packages', '/usr/lib/python3.6/site-packages']

The code that I'm changing in pip/__main__.py was supposed to only add a .whl file to sys.path if pip was being run directly from the whl. It was adding the whl file to the beginning of sys.path so that the pip code in the wheel would be run in preference to any pip install that was in another path.

The problem is that in some cases (See the issue I've now opened) it would add site-packages to the beginning of sys.path instead of a wheel. In those cases, a package in site-packages could replace a package in the stdlib with bad results.

Because pip ships with an __main__.py file, one valid way to invoke it
is: python /usr/lib/python3.6/site-packages/pip/.  This results in
__package__ being the empty string   This, in turn, triggers the code to
add the pip wheel to sys.path.  Unfortunately, while this is fine to do
when it's restricted to a wheel just containing pip, it isn't okay to do
when it is site-packages as that will mean that other things in
site-packages could end up overriding things in the stdlib
(*cough*enum34*cough*).

Check file extension to limit when we end up adding the extra path to
sys.path.

Fixes pypa#8214
@abadger abadger force-pushed the fix-pip-to-work-in-the-face-of-enum34 branch from 0526e85 to 0ac0fb3 Compare May 9, 2020 16:15
@pfmoore
Copy link
Member

pfmoore commented May 9, 2020

Is there a practical use case for this usage (that can't be handled another way)? I don't think this is a method of running pip that we particularly need to support, to be honest.

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

@pfmoore If you take a look at the linked issue, this is being done in pip's own code. I mention in the issue that the code could be modified in addition to this change but I'm not sure that it is correct to do so (the code in pip wants to make sure that it re-invokes the same install of pip as is currently running. More idiomatic ways of invoking pip like python -m pip won't have that guarantee. This code (the detection of wheels) is arguably buggy even if we find a better way to do it inside of pip itself.

@uranusjr
Copy link
Member

uranusjr commented May 9, 2020

The important thing here is the order. When python is invoked, sys.path normally includes site-packages but it includes it after the stdlib

So the same pip is already in your sys.path. Why not simply use python -m pip then? This kind of invocation is officially supported.

From my understanding, the python directory/containing/pip invocation method is a convinience method for pip developers to experiement things without needing to install the in-development pip somewhere. It is not intended for general usage.

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

@uranusjr I think you're on some sort of tangent now.

I'm not adding anything to sys.path. I'm taking things away. Right now, pip is incorrectly adding site-packages to the beginning of sys.path. My PR puts a stop to that while trying to maintain the behaviour (which is apparently desirable given the comments in the current code base) of adding a pip wheel onto sys.path if pip was invoked from a wheel.

It doesn't hurt me at all if you want to remove the whole section instead of my more targetted fix but I assume that there's a reason that that code was added in the first place.

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

@uranusjr Running from a wheel seems to be from @pfmoore's commit so I guess we're in the right place for him to decide if he thinks that code is still needed ;-)

commit 87418e05bb55ea64476653dd41e2f4db785f02c0
Author: Paul Moore <p.f.moore@gmail.com>
Date:   Mon Feb 17 18:40:39 2014 +0000

diff --git a/pip/__main__.py b/pip/__main__.py
index 92602ff5..81f3330e 100644
--- a/pip/__main__.py
+++ b/pip/__main__.py
@@ -1,4 +1,12 @@
 import sys
+
+# If we are running from a wheel, add the wheel to sys.path
+# This allows the usage python pip-*.whl/pip install pip-*.whl
+if __package__ == '':
+    import os
+    path = os.path.dirname(os.path.dirname(__file__))
+    sys.path.insert(0, path)

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

@uranusjr also, this is incorrect:

From my understanding, the python directory/containing/pip invocation method is a convinience method for pip developers to experiement things without needing to install the in-development pip somewhere.

It's also being used by pip itself here: https://github.com/abadger/pip/blob/master/src/pip/_internal/build_env.py#L171 That's the usage that I'm fixing.

It is not intended for general usage.

This is probably true but unfortunately it doesn't help. The problem is that pip itself is using this and therefore it has to work. It doesn't have anything to do with how I might decide to invoke pip myself. (Run the reproducer in #8214 to see that for yourself)

@pfmoore
Copy link
Member

pfmoore commented May 9, 2020

Running from a wheel seems to be from @pfmoore's commit so I guess we're in the right place for him to decide if he thinks that code is still needed ;-)

Running from a wheel is needed for get-pip.py.

But please don't assume I remember the reasoning behind a commit from 6 years ago...

@uranusjr
Copy link
Member

uranusjr commented May 9, 2020

This is probably true but unfortunately it doesn't help. The problem is that pip itself is using this and therefore it has to work.

I’m becoming increasingly confused. You say this has to work, but the PR, if I’m not mistaken, specifically makes it to not work? Quoting the commit message that introduced the line (464b2f3):

fix support for invoking pip using `python src/pip ...`

Ensure the subprocess call to pip for installing the PEP 518
build dependencies is using the same version of pip.

It is specifically designed to make sure a pip not already in sys.path can be correctly invoked. Your patch would cause this usage to fail, because it keeps the pip in src/pip out of sys.path, and thus not importable. The pip invocation will thus import the wrong pip (the one in site-packages, not the on at src/pip).

@felixfontein
Copy link

Maybe in that case the logic needs to be more clever, and insert path into sys.path before the first directory containing a subdirectory pip, or append at the end if none such is found?

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

@uranusjr How are you imagining that the pip in src/pip was loaded prior to re-invocation (when the user invokes it for the first itme)? It seems like in order for that to happen, it has to be on sys.path. So the question would be does the method of getting in on sys.path then not get sent to the re-invoked version? If so, is the best thing to do, then, to explicitly send it/ (For instance, by setting PYTHONPATH on the re-invoked version prior to re-invoking it?)

Anything that gets added to sys.path in __main__ will get run whenever anything invokes pip. A piece of path that gets added in build_env.py will only get added when pip re-invokes itself for the isolated build_env.

@uranusjr
Copy link
Member

uranusjr commented May 9, 2020

How are you imagining that the pip in src/pip was loaded prior to re-invocation (when the user invokes it for the first itme)? It seems like in order for that to happen, it has to be on sys.path.

$ git clone https://github.com/pypa/pip.git
$ python3 -m venv --without-pip tenv
$ tenv/bin/python pip/src/pip --version
pip 20.2.dev0 from ...[snip]
$ git clone https://github.com/abadger/pip.git -b fix-pip-to-work-in-the-face-of-enum34
$ python3 -m venv --without-pip tenv
$ tenv/bin/python pip/src/pip --version
...[snip]
ModuleNotFoundError: No module named 'pip'

This is the same mechanism used by BuildEnvironment.install_requirements(), made possible by runpy.run_path() (which also makes invocation from wheel possible). As Paul mentioned in #8214 (comment), it may be argued the BuildEnvironment.install_requirements() implementation should be changed (and thus makes it okay for the above example to not work), but the reasoning you’re giving here is entirely off base.

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

@uranusjr Mmm... Paul seemed to be arguing that invoking pip like that from the CLI is not supported. That it's only supported from within BuildEnvironment right now. @pfmoore is that a method of invocation that you do want to support?

@pradyunsg
Copy link
Member

Having the discussion in 2 places, for the same topic makes it difficult to follow the discussion. Let's keep the all the discussion in #8214, and leave this PR for discussing not-design-decisions details.

@abadger abadger force-pushed the fix-pip-to-work-in-the-face-of-enum34 branch from e3bd222 to 52f8308 Compare May 10, 2020 00:18
@abadger abadger changed the title Fix when pip is invoked as python /path/to/pip/ Fix pip's machinery for creating an isolated build environment so that user installed packages do not override the stdlib May 10, 2020
@smola
Copy link

smola commented Sep 15, 2020

This does not seem to work with Python 3.6 and pip download: pip download --no-binary :all: --no-deps --dest /tmp/vendor-cache-test enum34==1.1.10.

@uranusjr
Copy link
Member

uranusjr commented Apr 3, 2021

Superseded by #9689.

@uranusjr uranusjr closed this Apr 3, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 30, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

With Enum34 installed on python3.6+, pip picks up enum34 instead of the stdlib enum package
7 participants