Creating venvs via standard library's venv.create(..., with_pip=True) fails for binaries linked against libpython dynamically #381

Open
smheidrich opened this issue Oct 23, 2024 · 14 comments
Labels
bug Something isn't working

Comments

@smheidrich

smheidrich commented Oct 23, 2024

Minimal example:

import subprocess
import venv
from tempfile import mkdtemp
from traceback import print_exc

tmp_dir_path = mkdtemp(prefix="venv-create-bug-example.")
print(f"creating venv in: {tmp_dir_path}")
try:
  venv.create(tmp_dir_path, with_pip=True)
except subprocess.CalledProcessError as e:
  print_exc()
  print(f"subprocess stdout:\n{e.stdout}")
  print(f"subprocess stderr:\n{e.stderr}")
  exit(1)

Running this with a python-build-standalone binary that has a shared library dependency on libpython*.so, e.g. cpython-3.12.7+20241016-x86_64-unknown-linux-gnu-install_only will make this fail with (extra newlines inserted for readability):

creating venv in: /tmp/venv-create-bug-example.pj058eb3
Traceback (most recent call last):
  ...
subprocess.CalledProcessError: Command
'['/tmp/venv-create-bug-example.pj058eb3/bin/python', '-m', 'ensurepip', '--upgrade', '--default-pip']'
returned non-zero exit status 127.
subprocess stdout:
b'/tmp/venv-create-bug-example.pj058eb3/bin/python:
error while loading shared libraries:
/tmp/venv-create-bug-example.pj058eb3/bin/../lib/libpython3.12.so.1.0:
cannot open shared object file: No such file or directory\n'
...

Obviously, this is because the python binary links against libpython*.so dynamically with a relative path, and there no longer is a libpython*.so at this path when the Python executable (or a symlink to it) finds itself in the venv.

See also the corresponding issue in uv:

@willnode

willnode commented Nov 1, 2024

Maybe this helps:

$ cd ~/.pyenv/versions/3.11.10
$ ld bin/python
ld: warning: $ORIGIN/../lib/libpython3.11.so.1.0, needed by bin/python, not found (try using -rpath or -rpath-link)
ld: warning: cannot find entry symbol _start; not setting start address
ld: bin/python: undefined reference to `Py_BytesMain'
$ ln -s lib ../lib
$ export LD_LIBRARY_PATH=~/.pyenv/versions/3.11.10:$LD_LIBRARY_PATH
$ ld bin/python
ld: warning: cannot find entry symbol _start; not setting start address

Adding LD_LIBRARY_PATH works so...

// your test file
$ nano x.py
$ python x.py 
creating venv in: /tmp/venv-create-bug-example.fg3ebyf5
$ 

@geofft
Collaborator

geofft commented Jan 27, 2025

This is a little bit different from astral-sh/uv#6812. While they're both sort of about libpython.so not being in the venv, the impact is different. In the uv issue, running bin/python works fine, but ldd bin/python reports a spurious issue (which is a bug in ldd), and some build systems are unhappy about not being able to find libpython (which might be because they are trusting ldd, or might be some other, build-system-specific cause). The issue here is that bin/python just doesn't work at all and the venv is completely broken.

You can repro this with the CLI tool:

$ python/bin/python -m venv --copies /tmp/u
Error: Command '['/tmp/u/bin/python', '-m', 'ensurepip', '--upgrade', '--default-pip']' returned non-zero exit status 127.
$ /tmp/u/bin/python
/tmp/u/bin/python: error while loading shared libraries: /tmp/u/bin/../lib/libpython3.13.so.1.0: cannot open shared object file: No such file or directory

The reason is that venv.main() decides between copies and symlinks based on os.name == 'nt', whereas venv.EnvBuilder/venv.create just defaults to symlinks=False. This means, incidentally, that you can work around this issue with venv.create(..., symlinks=True), which is probably a good idea for its own sake in that it matches the default behavior of python -m venv on UNIX.
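A minimal sketch of that workaround, assuming a POSIX-style venv layout. The helper name create_venv_like_cli is made up for illustration; only venv.create and its symlinks/with_pip parameters come from the standard library.

```python
import os
import venv


def create_venv_like_cli(env_dir, with_pip=True):
    """Create a venv using the same symlink default as `python -m venv`.

    Hypothetical helper: venv.create() defaults to symlinks=False, while the
    CLI picks symlinks on every platform except Windows (os.name == "nt").
    """
    use_symlinks = os.name != "nt"
    venv.create(env_dir, symlinks=use_symlinks, with_pip=with_pip)
    return use_symlinks
```

Calling create_venv_like_cli("/tmp/u") on Linux should then leave /tmp/u/bin/python as a symlink into the base installation rather than a copy.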

Some fun facts about how other distributions of Python approach this:

  • Nix/NixOS builds their bin/python linking libpython.so, but because they hard-code paths in /nix and set rpaths, a venv created with --copies works fine because the copied Python binary still has the rpath set pointing to the original location of libpython.so. This won't quite work for us because we don't know in advance where someone is unpacking the interpreter (and for e.g. uv it's in the user's homedir which varies with their username). We could modify the copied bin/python to set an rpath, but making the venv bin/python no longer bit-for-bit identical to the original one seems like it might break something else unexpected.
  • The system Python on macOS is patched to prevent using --copies and to change the default behavior of EnvBuilder, and a comment specifically calls out the equivalent problem as the reason they don't allow it:
$ diff -ur <(git -C src/cpython show v3.9.0:Lib/venv/__init__.py) /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/venv/__init__.py
--- /dev/fd/63	2025-01-27 15:22:52
+++ /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/venv/__init__.py	2024-11-11 06:18:32
@@ -16,6 +16,21 @@
 CORE_VENV_DEPS = ('pip', 'setuptools')
 logger = logging.getLogger(__name__)
 
+def should_use_symlinks(symlinks=None):
+    if symlinks:
+        return True
+    else:
+        # The python we build for Xcode uses @executable_path/../Python3 to find Python3.framework
+        # A venv created without symlinks will not be able to load the framework, because
+        # @executable_path would be in the venv, not in the original location within Xcode.
+        symlinks_are_required = sysconfig.get_config_var('TRAIN_STYLE') == 'DT'
+        if symlinks is None:
+            return symlinks_are_required
+        else:
+            if symlinks_are_required:
+                raise Exception("This build of python cannot create venvs without using symlinks")
+            else:
+                return symlinks
 
 class EnvBuilder:
     """
@@ -44,11 +59,11 @@
     """
 
     def __init__(self, system_site_packages=False, clear=False,
-                 symlinks=False, upgrade=False, with_pip=False, prompt=None,
+                 symlinks=None, upgrade=False, with_pip=False, prompt=None,
                  upgrade_deps=False):
         self.system_site_packages = system_site_packages
         self.clear = clear
-        self.symlinks = symlinks
+        self.symlinks = should_use_symlinks(symlinks)
         self.upgrade = upgrade
         self.with_pip = with_pip
         if prompt == '.':  # see bpo-38901
@@ -404,7 +419,7 @@
 
 
 def create(env_dir, system_site_packages=False, clear=False,
-           symlinks=False, with_pip=False, prompt=None, upgrade_deps=False):
+           symlinks=None, with_pip=False, prompt=None, upgrade_deps=False):
     """Create a virtual environment in a directory."""
     builder = EnvBuilder(system_site_packages=system_site_packages,
                          clear=clear, symlinks=symlinks, with_pip=with_pip,

(I have no idea what TRAIN_STYLE is. Some internal Apple thing about release vs. internal builds? "Developer Tools"?)

zanieb added a commit that referenced this issue Jan 28, 2025
…s with a standalone Python (#505)

Per
#381 (comment)
this should stop the build bootstrapping script from crashing when
running the build with a standalone Python distribution (which was..
really annoying me).
@FFY00

FFY00 commented Jan 29, 2025

#505 only mitigates the issue, instead of fixing the source.

I feel like the proper fix should be around setting RPATH of the interpreter so that the link can be resolved correctly. This is a bit complex given the relocatable nature of PBS, but given that we are talking about virtual environments, which need to have a base installation, I think it may be reasonable to set the RPATH of the interpreter to point to the base installation.

If the base installation itself is actually meant to be relocatable (as in, it may be moved around), then we might want to warn users about its fragility when creating virtual environments.

@geofft
Collaborator

geofft commented Jan 29, 2025

To be clear #505 is just working around this issue for this repository's own Python code in the build scripts (when the Python interpreter itself is from python-build-standalone)—it's not intended as a fix for the outputs of this repository / does not change the outputs.

I'm inclined to say that copy-based venvs are just unsupported for python-build-standalone (as they are for Xcode's Pythons), but yeah, having venv do some rpath patching after it creates the environment would also work, at the risk of maybe breaking code signatures or other things that depend on the copy being bit-for-bit identical to the binary we distribute.

@FFY00

FFY00 commented Jan 29, 2025

Does that realistically matter in systems where symlinks are available?

Copy-based venvs exist primarily for Windows, where symlink support is problematic. Are there any use-cases for copy-based venvs in systems like macOS, which implement the kind of code signing you mentioned?

If not, I think the best solution is to simply adjust the UX so that users don't run into these scenarios by mistake.

We could fix the issue with a more complex solution like requiring certain tooling to set the RPATH properly, but I think that if we can avoid that kind of complexity, that would be best. Practically, I think this mostly becomes a non-issue if there aren't any real-world use cases.

@zanieb
Member

zanieb commented Jan 29, 2025

Are there any use-cases for copy-based venvs in systems like macOS, which implement the kind of code signing you mentioned?

Just to share more context, people also want fully portable virtual environments, e.g.,

but I think that's distinct from a typical copy-based venv.

@geofft
Collaborator

geofft commented Jan 29, 2025

Even with python3 -m venv --copies, you're still left with a reference in pyvenv.cfg to the original installation's standard library, so I think the premise of 7865, that the given commands are sufficient to get you a venv that's "self contained and does not rely on the user's local files", is not true, and the premise of 6782, that venv --copies does what they want, is not true either. I definitely see the value of giving users a way to get something that is truly self-contained, but I recall seeing somewhere an argument that this is no longer a virtual environment, and what you want at that point is just another unpack of a python-build-standalone distribution and installing packages straight into that distribution.

I don't know of any use case on UNIX that actually wants copy-based venvs (as I argued in python/cpython#129382). My only hesitation was that it's a thing the code currently supports and currently defaults to, but I would be more than happy to break it and say symlinks only. Apple seems to have gotten away with doing that.

@FFY00

FFY00 commented Jan 29, 2025

Just to share more context, people also want fully portable virtual environments, e.g.,

Just to clarify, by fully portable virtual environments, you mean virtual environments that can be moved around, right?
Or environments that don't have any dependency on the base environment they were created from? Essentially, a copy of the Python distribution?

The former shouldn't be a problem here; the latter should be handled differently from a "venv", as you need a full Python distribution, which has some implications for the Python path/prefix initialization.

Even with python3 -m venv --copies, you're still left with a reference in pyvenv.cfg to the original installation's standard library, so I think the premise of 7865, that the given commands are sufficient to get you a venv that's "self contained and does not rely on the user's local files", is not true, and the premise of 6782, that venv --copies does what they want, is not true either. I definitely see the value of giving users a way to get something that is truly self-contained, but I recall seeing somewhere an argument that this is no longer a virtual environment, and what you want at that point is just another unpack of a python-build-standalone distribution and installing packages straight into that distribution.

Yup, exactly. This is no longer a virtual environment or, at least, not a "lightweight virtual environment", which is what venv implements.

I think that is a desirable feature, but venv is not the place for such implementation. That's something I think uv could do on its own (I'm happy to discuss the best way to implement it), either provided as a different kind of virtual environment users can opt-in, or even via an option to export an existing virtual environment as a standalone environment, sort of like a lightweight version of PyOxidizer that drops the single file requirement.

@zanieb
Member

zanieb commented Jan 30, 2025

... but I recall seeing somewhere an argument that this is no longer a virtual environment, and what you want at that point is just another unpack of a python-build-standalone distribution and installing packages straight into that distribution.

Yeah, I agree — that's what I meant when I said it was distinct. I think we're all on the same page here, I was just sharing those use-cases for context on what people are asking for.

I don't know of anyone that's asking for copy-based venvs on Unix. I wouldn't be surprised if something came up if we banned it, but I'd be willing to give it a try here (where release iteration times are much faster than CPython).

@geofft
Collaborator

geofft commented Jan 31, 2025

OK, with python/cpython#129493 happening, I propose that on the python-build-standalone side we

  • backport that patch to all existing versions
  • on all versions, also patch in an if not symlinks: raise ValueError("copy-based venvs are not supported in python-build-standalone")

Does that seem reasonable?
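For illustration only, the second bullet could look roughly like this if expressed as a subclass instead of a patch to the stdlib. The class name SymlinkOnlyEnvBuilder is hypothetical, and defaulting to symlinks=True when the argument is omitted is an assumption of this sketch, not something taken from the upstream patch.

```python
import venv


class SymlinkOnlyEnvBuilder(venv.EnvBuilder):
    """Hypothetical EnvBuilder that refuses to create copy-based venvs."""

    def __init__(self, **kwargs):
        # Assumption of this sketch: use symlinks when the caller says nothing.
        kwargs.setdefault("symlinks", True)
        if not kwargs["symlinks"]:
            raise ValueError(
                "copy-based venvs are not supported in python-build-standalone"
            )
        super().__init__(**kwargs)
```

With this, SymlinkOnlyEnvBuilder(with_pip=True).create(path) behaves like a symlink-based venv.create, while an explicit symlinks=False fails immediately with a readable message instead of exit status 127 from ensurepip.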


I should note for completeness that there is another option, that we (or really probably upstream CPython) copy libpython into the venv, so that the $ORIGIN-relative library dependency works:

$ python/bin/python -m venv --copies --without-pip /tmp/v
$ /tmp/v/bin/python
/tmp/v/bin/python: error while loading shared libraries: /tmp/v/bin/../lib/libpython3.13.so.1.0: cannot open shared object file: No such file or directory
$ cp -a python/lib/libpython3.* /tmp/v/lib
$ /tmp/v/bin/python
Python 3.13.1 (main, Jan 14 2025, 22:38:16) [GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

But I think the consensus here is that this is not worth supporting.
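For completeness, the cp -a step above could be sketched in Python as follows. The function name copy_libpython is made up for this example, and it assumes the base installation keeps its libpython3.* files directly under lib/.

```python
import glob
import os
import shutil


def copy_libpython(base_lib_dir, venv_lib_dir):
    """Copy libpython3.* from the base install into a venv's lib directory.

    Rough equivalent of `cp -a python/lib/libpython3.* /tmp/v/lib`; any
    symlinks among the matches are copied as symlinks, not followed.
    """
    os.makedirs(venv_lib_dir, exist_ok=True)
    copied = []
    for src in sorted(glob.glob(os.path.join(base_lib_dir, "libpython3.*"))):
        shutil.copy2(src, venv_lib_dir, follow_symlinks=False)
        copied.append(os.path.basename(src))
    return copied
```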

(There is also, I guess, an argument that we should do this and also for symlink-based venvs, we should symlink libpython into the venv, which would resolve astral-sh/uv#6812, #508, etc. But that seems weird because such a symlink is unnecessary and confusing for non-relocatable Python installations, and I worry it will cause people to write build scripts that work properly on python-build-standalone but not on other Python installations.)

@zanieb
Member

zanieb commented Jan 31, 2025

we (or really probably upstream CPython) copy libpython into the venv, so that the $ORIGIN-relative library dependency works:

Just to clarify what options are available. We could also detect this case in uv venv and perform this copy, right?

backport that patch to all existing versions

Definitely

also patch in an if not symlinks: raise ValueError("copy-based venvs are not supported in python-build-standalone")

A little more controversial. I'd do so in a separate pull request. It's only broken when compiling extension modules, right? In all other cases, it's fine? Like installing from pre-built wheels will always work? I can't think of a case where copying is preferable to the symlink, which makes me more willing to ban it, but given that the environments are generally usable it feels aggressive. As an alternative solution, we could just special case build failures in uv where the header is missing and the interpreter is not a symlink.

@geofft
Collaborator

geofft commented Jan 31, 2025

So there's like three classes of issues that are all very similar-sounding but are not exactly the same thing:

  • Can the Python interpreter find the e.g. libpython3.13.so.1.0 runtime library?
  • Can people trying to compile other programs that call libpython find the libpython3.13.so development library?
  • Once they've successfully compiled such a program, can that program find the libpython3.13.so.1.0 runtime library?

This issue is about the interpreter not finding the runtime library. If you can't find it, then the Python interpreter cannot start up and is completely broken. The reason for this is that it references its own runtime library by relative path from its binary location:

$ readelf -d bin/python | grep libpython
 0x0000000000000001 (NEEDED)             Shared library: [$ORIGIN/../lib/libpython3.13.so.1.0]

If you move/copy the binary to some location and do not preserve a ../lib/libpython3.13.so.1.0, then it will fail to start up.

The specific reason venv.create(symlinks=False, with_pip=True) fails is that the venv module tries to run ensurepip on the venv it just created. If you leave it off, it still creates a venv that does not work, but because it never tries to run code in there, it returns successfully. (Arguably this is a bug.)

On the other hand, if you symlink the binary, then $ORIGIN is interpreted relative to the target of the symlink, and everything still works. (Except under Valgrind, which has a bug that causes scripts to see $ORIGIN as the script itself instead of the interpreter.)
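The copy-versus-symlink difference can be modeled with a small script. This is only a model of the behavior described above, not the dynamic loader's actual implementation; resolve_needed and demo are hypothetical names, and the files are empty stand-ins in a temporary directory.

```python
import os
import shutil
import tempfile
from pathlib import Path

NEEDED = "$ORIGIN/../lib/libpython3.13.so.1.0"


def resolve_needed(exe_path, needed=NEEDED):
    """Expand $ORIGIN relative to the symlink target of exe_path (a model only)."""
    origin = os.path.dirname(os.path.realpath(exe_path))
    return os.path.normpath(needed.replace("$ORIGIN", origin))


def demo():
    """Build a fake install plus a copied and a symlinked bin/python."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        base_exe = root / "python" / "bin" / "python"
        base_lib = root / "python" / "lib" / "libpython3.13.so.1.0"
        copied_exe = root / "copy-venv" / "bin" / "python"
        linked_exe = root / "link-venv" / "bin" / "python"
        for path in (base_exe, base_lib, copied_exe, linked_exe):
            path.parent.mkdir(parents=True, exist_ok=True)
        base_exe.write_bytes(b"")  # stand-in for the real interpreter
        base_lib.write_bytes(b"")  # stand-in for the real shared library
        shutil.copy(base_exe, copied_exe)
        os.symlink(base_exe, linked_exe)
        candidates = [("base", base_exe), ("copy", copied_exe), ("symlink", linked_exe)]
        # True means the NEEDED entry resolves to an existing file.
        return {name: os.path.exists(resolve_needed(exe)) for name, exe in candidates}


if __name__ == "__main__":
    print(demo())  # {'base': True, 'copy': False, 'symlink': True}
```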

There's two ways that you can avoid caring about this problem. The first is that you build your Python interpreter so that the libpython guts are statically linked into bin/python and it doesn't carry a runtime dependency on libpython in a separate file. The second is that you install your Python interpreter globally so that instead of having an $ORIGIN-relative dependency, it just has an unqualified libpython3.13.so.1.0 dependency that gets looked up in e.g. /usr/lib. python-build-standalone's Pythons don't do either of these. Neither do Xcode's, which is why they have the same problem and why Apple patches their stdlib to throw if you ask it to make a copy-based venv.

(I suppose one thing we can do is to statically link libpython into our interpreter. I'm not sure off hand why we don't, though the argument that comes to mind is disk space.)

The second problem is about compiling (or more specifically, linking) code that needs to link libpython. For those, you generally need to find a libpython3.13.so to satisfy -lpython3.13, and you probably also need to find the <Python.h> C header and its friends, at least if you're writing C. There is, apparently, some build tool that tries to find libpython3.13.so relative to bin/python by convention. This works fine outside a venv but breaks inside one. I don't think we do anything in particular to encourage people to look for a nonexistent venv/lib/libpython3.13.so, but C packaging is all about guessing games, so I don't entirely blame that code if it exists. The best thing we can do is improve that build helper to guess better.

Again, a systemwide Python with the python3-dev package (or whatever your OS calls it) installed is going to have something like a /usr/lib/libpython3.13.so, and -lpython3.13 will get resolved out of there, no need to go finding it specifically. So if there's some build tool that auto-detects stuff based on a venv and does cc whatever.c -Ivenv/include -Lvenv/lib -lpython3.13, then the compiler is going to successfully find /usr/lib/libpython3.13.so (and /usr/include/Python.h) and not care that your -I/-L settings are bogus. But for a relocatable Python like ours, we don't get that benefit.

The third problem is that, supposing you've gotten past the second problem and successfully compiled some program that links libpython, your program is going to need to locate the runtime library in order to run. Again, for a systemwide Python this just works. But for a relocatable Python, you either need to get an $ORIGIN-relative library reference or rpath into your program, or you need to set $LD_LIBRARY_PATH at runtime (or add the relocatable Python to systemwide config).

Note that this third problem is orthogonal to venvs; it's pretty unlikely that you're going to compile your own program into the same directory as bin/python, so the $ORIGIN/../lib/libpython3.13.so syntax from the Python interpreter itself is not particularly helpful to this program. You almost certainly want to reference libpython3.13.so by absolute path in this case.
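As a rough sketch of the absolute-path approach, a build helper could derive linker flags from the base installation instead of from the venv. Everything here is an assumption for illustration: the function name embed_link_flags, the guess that the library lives in <base_prefix>/lib, and GNU-style -L / -Wl,-rpath flags.

```python
import os
import sys
import sysconfig


def embed_link_flags(base_prefix=None, ldversion=None):
    """Return illustrative linker flags for a program that links libpython.

    Assumes libpython lives in <base_prefix>/lib and a GNU-style toolchain.
    sys.base_prefix points at the base installation even inside a venv.
    """
    base_prefix = base_prefix or sys.base_prefix
    ldversion = (
        ldversion
        or sysconfig.get_config_var("LDVERSION")
        or sysconfig.get_python_version()
    )
    libdir = os.path.join(base_prefix, "lib")
    # -L finds the development library at link time; the rpath entry lets the
    # resulting program find the runtime library without LD_LIBRARY_PATH.
    return [f"-L{libdir}", f"-Wl,-rpath,{libdir}", f"-lpython{ldversion}"]
```

The rpath baked in this way is absolute, so the compiled program breaks if the base installation is later moved, which is the usual trade-off with a relocatable Python.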

So, recapping: this issue is only the first problem of that list. astral-sh/uv#6812 is sort of all three problems. It was reported by demonstrating the Python interpreter failing to find libpython under ldd (which would be problem one, except this is actually just a bug in ldd and not a real problem—the first problem doesn't actually happen in symlink-based venvs). It was reported as a compilation problem, which is the second problem. And it turns out that what it actually is, at least in part, is that the compilation works fine but running something like cargo test, which generates a binary that links libpython, is broken, which is the third problem.

So:

Just to clarify what options are available. We could also detect this case in uv venv and perform this copy, right?

Currently (AIUI) uv venv / the uv-virtualenv crate does only symlink-based venvs on UNIX and only copy-based venvs on Windows. This problem doesn't apply at all to symlink-based venvs. I am not yet caught up on how the Windows side of things works but I don't believe there's an equivalent problem there. So I think there's nothing for uv venv to do currently. If we do implement copy-based venvs for UNIX, then yes, we could detect it, but I'm not sure why we would.

(My parenthetical proposal in the previous comment is to address the second and third problems, which do apply to symlink-based venvs. In that case, yes, uv venv would need to detect the need to symlink both the development-time name libpython3.13.so and the runtime name libpython3.13.so.1.0, and it would be straightforward to do so.)

also patch in an if not symlinks: raise ValueError("copy-based venvs are not supported in python-build-standalone")

A little more controversial. I'd do so in a separate pull request. It's only broken when compiling extension modules, right? [...] given that the environments are generally usable

No, because this is about the first problem, where the Python interpreter is wholly broken, and venv.create(symlinks=False, with_pip=False) creates a broken, unusable venv without reporting an error to you (and the error message that venv.create(symlinks=False, with_pip=True) gives you is unhelpful).
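One way a caller could surface this today is to smoke-test the interpreter right after creating the venv. This is a sketch under assumptions: interpreter_starts is a hypothetical helper, and it assumes the POSIX bin/python layout.

```python
import subprocess
from pathlib import Path


def interpreter_starts(env_dir, timeout=60):
    """Return True if the venv's interpreter can run a trivial command."""
    exe = Path(env_dir) / "bin" / "python"  # POSIX layout assumed
    try:
        result = subprocess.run(
            [str(exe), "-c", "pass"], capture_output=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        # A missing or unrunnable binary counts as a broken venv.
        return False
    # A loader failure such as the one in this issue shows up as a non-zero
    # exit status (127 in the output quoted earlier).
    return result.returncode == 0
```

Running this after venv.create(env_dir, symlinks=False, with_pip=False) would turn the silent failure into an explicit False that the caller can act on.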

As an alternative solution, we could just special case build failures in uv where the header is missing and the interpreter is not a symlink.

I don't think we're guaranteed to be within uv here. Even setting aside that this is the python-build-standalone repo—the issue on the uv side is about running cargo test inside an activated venv that was created by uv, so uv the binary isn't running at the point where we get the failure.

Lemme know if this is still confusing, I can barely keep the various problems straight :)

@zanieb
Member

zanieb commented Jan 31, 2025

Great, thank you for the clarification! I'm definitely mixing problems :)

With that context, banning copied interpreters for virtual environments makes sense.

Regarding

(I suppose one thing we can do is to statically link libpython into our interpreter. I'm not sure off hand why we don't, though the argument that comes to mind is disk space.)

The commentary in #44 (comment) and the linked commit 54bf7c6 may provide some context? Though it seems a bit different.

@geofft
Collaborator

geofft commented Jan 31, 2025

Oh, the thing that confused me is that that issue/change is about whether to build libpython as static or shared, and my question was about whether the python binary consumes libpython as static or shared. But those are ordinarily the same question. What confused me is that I'm used to Debian, where /usr/bin/python3 statically links libpython and a shared libpython also exists, but they do something unusual: they build twice, once static and once shared, and they ship the bin/python from the static build but also ship the shared library. Apparently this is for performance concerns raised 23 years ago. Weird. Fedora doesn't seem to do this. So I no longer think there's an argument that python-build-standalone should do something different here :)

6 participants