
Add a uv build backend #3957

Open · Tracked by #190
chrisrodrigue opened this issue Jun 1, 2024 · 93 comments
Labels: build-backend, enhancement (New feature or improvement to existing functionality)
@chrisrodrigue commented Jun 1, 2024

uv is a fantastic tool that is ahead of its time. In the same vein as ruff, it is bundling many capabilities that Python developers need into a single tool. It currently provides the capabilities of pip, pip-tools, and virtualenv in one convenient binary.

Python can't build out of the box

As of Python 3.12, setuptools and wheel are no longer bundled with standard Python installations, so a user cannot run pip install -e . or pip install . in a local pyproject.toml project without pulling in external third-party package dependencies.

This means that in an offline environment without access to PyPI, a developer is dead in the water and cannot even install their own project from source. This is a glaring flaw with Python, which is supposed to be "batteries included."

uv can fix this

I propose that the uv binary expand its capabilities to also function as a build backend.

If uv could natively build projects from source, it would be a game changer!

@chrisrodrigue changed the title from uv should provide a build backend to Add a uv build backend on Jun 1, 2024
@potiuk commented Jun 1, 2024

I think that if uv could natively build projects from source, it would be a game changer.

I personally believe this runs against the whole idea of the modern approach of splitting backend and frontend responsibilities. The philosophical idea behind the backend/frontend split is that the maintainers of a project (via pyproject.toml) choose the backend that should be used to build their tool, while the user installing the project is free to choose whatever frontend they prefer. This is IMHO a game changer in Python packaging, and we should see a stronger push in that direction rather than a weakening of it. And I don't think it's going to go back, because more and more projects will move to pyproject.toml and a backend specification of the build environment. In the case of Airflow, for example, as of December there is no way to install Airflow "natively" without actually installing hatchling and the build environment. We do not give the option to the frontend: you MUST use hatchling in a specified version, plus a few other dependencies in specific versions, to build Airflow. Full stop. uv won't be able to make its own choices (it can choose how to create such an environment, but not what should be used to build Airflow).
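To make that split concrete: the maintainer pins the backend in pyproject.toml's [build-system] table, and every frontend must honor it. A minimal sketch (the hatchling pin mirrors the version quoted from Airflow later in this thread; it is illustrative, not Airflow's full build configuration):

[build-system]
# The project maintainer, not the installing user, decides what builds the project.
requires = ["hatchling==1.24.2"]
build-backend = "hatchling.build"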

But also maybe my understanding of it is wrong and maybe you are proposing something different than I understand.

BTW, I do not think access to PyPI is needed to build a project with a build backend. The frontend may still choose any mechanism (including private repos if needed) to install the build environment; no PyPI is needed for it. The only requirement is that pyproject.toml specifies the environment.

@chrisrodrigue (Author) commented Jun 1, 2024

@potiuk

I agree and think the backend and frontend specifications should be separate. I am merely suggesting that uv could provide a build capability similar to its virtualenv capability (uv venv), so that building is possible for users who don't have a backend, perhaps via a project API such as uv install. I like pdm's approach of providing its own build backend (pdm-backend) and installing projects as editable by default inside the project's automatically managed .venv when a user runs pdm install.

You do need access to PyPI, or to some repo hosting the build backend, to build a project. Python does not include setuptools or any other build backend in the latest distributions (it used to include at least setuptools). pip downloads the build backend specified in pyproject.toml (or setuptools if none is specified) into an isolated environment and uses it to build Python projects from source. If --no-build-isolation is specified, it expects the build backend to already be available in the current system or virtual environment.

IMO, pip install [-e] . is broken out of the box because it relies on third-party dependencies which may not be accessible. In the stdlib we get argument parsing, unit testing, logging, and other amenities, but we can't perform the most fundamental action in the software development process: installing/building your own code.

Without the capability to build from source out of the box, Python is hamstrung and can only run the most rudimentary scripts, leaving users with multi-module projects to resort to ugly PYTHONPATH/sys.path hacks to get their packages/subpackages/modules found by the interpreter.

@potiuk commented Jun 1, 2024

IMO, pip install [-e] . is broken out of the box because it relies on third-party dependencies which may not be accessible. In the stdlib we get argument parsing, unit testing, logging, and other amenities, but we can't perform the most fundamental action in the software development process. Without the capability to build from source out of the box, Python is hamstrung and can only run the most rudimentary scripts, leaving users to resort to ugly PYTHONPATH/sys.path hacks to get their packages/subpackages/modules found by the interpreter.

But this is where the whole of Python packaging is heading. The packaging PEPs specify precisely this direction, and if project maintainers choose so, then unless you manually maintain build recipes for all the packages out there (like conda does), you won't be able to build a project "natively".

Just to give you the example of Airflow: without hatchling and its build hooks (implemented in hatch_build.py) you are not even able to know what Airflow's dependencies are, because "dependencies" and "optional-dependencies" are declared as dynamic fields - and (as mandated by the relevant PEP) those dependencies are not specified in pyproject.toml at all. And we no longer have a setup.py either.

The only way to find out what dependencies Airflow needs for an editable build, or to build a wheel package, is to get the right version of hatchling and have the frontend execute the build hook - the build hook returns such dependencies dynamically. You can see it yourself here: https://github.com/apache/airflow/blob/main/pyproject.toml - there is no way to build Airflow from sources in current main without actually installing those packages:

    "GitPython==3.1.43",
    "gitdb==4.0.11",
    "hatchling==1.24.2",
    "packaging==24.0",
    "pathspec==0.12.1",
    "pluggy==1.5.0",
    "smmap==5.0.1",
    "tomli==2.0.1; python_version < '3.11'",
    "trove-classifiers==2024.5.22",

and letting hatchling invoke hatch_build.py. I am not sure what you mean by "native" installation, but you won't be able to install Airflow any other way.

And I think, personally (though I was not part of it), that the decisions made by the packaging team were pretty sound and smart: they deliberately left the decision to the maintainers of a project to choose the right backend packages (and set of third-party tools), and all frontends have no choice but to follow it. I understand you might have a different opinion, but here the process of the Python Software Foundation and the packaging team is not a matter of opinion - they have the authority to decide it by voting and PEP approval. And the only way to change it is to get another PEP approved.

Here is the list of those PEPs (and I am actually quite happy Airflow, after 10 years, finally migrated off setuptools and setup.py by following those standards, as the tooling - including the modern backends you mentioned - has finally supported them for a sufficiently long time):

  • PEP-440 Version Identification and Dependency Specification
  • PEP-517 A build-system independent format for source trees
  • PEP-518 Specifying Minimum Build System Requirements for Python Projects
  • PEP-561 Distributing and Packaging Type Information
  • PEP-621 Storing project metadata in pyproject.toml
  • PEP-660 Editable installs for pyproject.toml based builds (wheel based)
  • PEP-685 Comparison of extra names for optional distribution dependencies

@potiuk commented Jun 1, 2024

And BTW, if uv provides a build backend, you will still be able to choose it when you maintain your project - but it will also be a 3rd-party dependency :)

Someone installing your project with any frontend will have to download and install it according to your specification in pyproject.toml - similar to the hatch/hatchling pair, which are separate. Even if you use hatch to install Airflow, it still has to download hatchling in the version specified by the maintainer of the project you are installing and use it to build the package.

@chrisrodrigue (Author) commented Jun 1, 2024

It's interesting that setuptools remains the "blessed" build backend that pip defaults to when no backend is specified in pyproject.toml. Rather than baking setuptools into the stdlib, pip silently forces the download and installation of it as an unauthorized third-party build dependency. This also seems to violate the principle of "explicit is better than implicit."
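For reference, PEP 517/518 describe the fallback behavior in question: when pyproject.toml is absent or declares no build-backend, a frontend may behave as if the project had declared the legacy setuptools backend. A sketch of that implied configuration (the exact requirements a given pip version injects may differ):

[build-system]
# Implicitly assumed by the frontend when no backend is declared.
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta:__legacy__"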

One of the nice things about Astral tools like uv and ruff is that they do not bloat your package/library when you use them, and bundle the capabilities of many tools into single static binaries.

If you use uv specifically for its virtual environment capability, it's a single dependency. Conversely, if you were to use virtualenv, you've just pulled in three more transitive dependencies (distlib, filelock, platformdirs). Similarly, instead of using pylint, flake8, pyupgrade, black, isort, and all of their dependencies, you only need ruff. This has far-reaching implications for companies maintaining Software Bills of Materials (SBOMs) that need to manually vet each piece of FOSS in their software toolchains.

I think uv could provide a build backend that could be specified in pyproject.toml and used as the default when no backend is specified. It wouldn't necessarily have to be separate from the main uv binary, since the uv binary could support the build capability as built-in uv/uv pip commands (uv install, uv pip install -e .).

[build-system]
requires = ["uv"]
build-backend = "uv.api"

@potiuk commented Jun 2, 2024

I think uv could provide a build backend that could be specified in pyproject.toml and used as the default when no backend is specified. It wouldn't necessarily have to be separate from the main uv binary, since the uv binary could support the build capability as built-in uv/uv pip commands (uv install, uv pip install -e .).

Well, there are always trade-offs, and what seems important from your position might not be important for others, and the other way around. For example, if (like we did, Airflow being a popular package) you had gone through some of the pains where new releases of setuptools suddenly started breaking packages being built - deliberately or accidentally breaking compatibility - and you were suddenly flooded by hundreds of your users having problems installing your package without you changing anything, you'd understand that pinning how things are built to a specific version of the backend is a good idea once you have even a moderately big package.

That's what I love: that we as maintainers can choose and "lock" the backend to the tools and versions of our choice, rather than relying on the version of the tool our users choose behaving consistently over the years. That was a very smart choice by the packaging team, based on actual learnings from their millions of users over many years; and while I often criticized their choices in the past, I came to understand that it was my vision that was short-sighted and limited - I learned a bit of empathy.

I'd strongly recommend a bit more reading and understanding of what they were (and still are) cooking there. Python actually deliberately removed setuptools in order to drive adoption of what's being developed as packaging standards (a really smart plan that was laid out years ago and is meticulously and consistently, step by step, put in motion). And I admire the packaging team for that, to be honest.

What you are really thinking about is not bringing back the old setuptools behaviour; it is something else that the packaging team has already accepted and a number of tools are implementing: https://peps.python.org/pep-0723/, which allows you to define a small subset of pyproject.toml metadata (specifically dependencies) in single-file scripts. And this is really where, yes, any frontend implementing PEP 723 should indeed prepare a venv, install the dependencies, and run the script that declares them.

Anything more complex, with real packaging needs - putting more files together - should really pin its backend in order to maintain "build consistency". Otherwise you are at the mercy of tool developers who might change their behaviour at any time, and suddenly not only you but anyone else who wants to build your package will have problems with it.

But yes: if uv provides a backend to build packages, which you (as project maintainer) specify in your build dependencies, this is perfectly fine - just one more backend to choose from among the roughly 10 available today. And if, as a maintainer, you prefer to specify it in your project, you should be free to do so. But as a maintainer, I would never put faith in future versions of a specific frontend (including uv) continuing to build my package the same way.

BTW, a piece of advice - for that very reason, as a maintainer you should do:

[build-system]
requires = ["uv==x.y.z"]
build-backend = "uv.build"

Otherwise you never know which version of uv people have, and whether they have a version that is capable of building your package at all (i.e. an old version that has no backend).

@potiuk commented Jun 2, 2024

Also, if you have a proposal for how to improve packaging, there are Discourse threads for that and a PEP can be written. So if you are strongly convinced that you can come up with a complete and better solution, I recommend you start a discussion there about a new PEP, propose it, lead it to approval, and likely help implement it in a number of tools - this is how standards in packaging are developed :)

@notatallshaw (Collaborator):

It's interesting that setuptools remains the "blessed" build backend that pip defaults to when no backend is specified in pyproject.toml. Rather than baking setuptools into the stdlib, pip silently forces the download and installation of it as an unauthorized third-party build dependency. This also seems to violate the principle of "explicit is better than implicit."

FYI, I believe this is because pip maintainers are very conservative when it comes to breaking changes, not because it is the intended future of Python packaging.

For example, the old resolver, which can easily install a broken environment, is still available to use even though the new resolver has been available for over 5 years and turned on by default for over 4 years.

@daviewales:

A uv-aware build backend would enable private git packages to depend on other private git packages and still be installed with pip, pipx, poetry, etc., without needing the package's end users to change their tools. See #7069 (comment) for a detailed example. (Poetry packages work this way, as they use the poetry-core build backend.)
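For context, with uv today such a dependency is declared via tool.uv.sources, which only uv's frontend understands; the idea above is that a uv-aware backend could carry this metadata through for other installers. A sketch (the package names and URL are placeholders):

[project]
name = "my-private-app"
version = "0.1.0"
dependencies = ["my-private-lib"]

[tool.uv.sources]
# Only uv resolves this today; a uv-aware backend could honor it during builds.
my-private-lib = { git = "ssh://git@example.com/org/my-private-lib.git" }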

@chrisrodrigue (Author) commented Oct 2, 2024

Just some more thoughts, notes, and ramblings on this.

Full stack uv

The build backend capability could be rolled into the uv binary itself as a feature rather than fractured off as a separate dependency.

This would give developers the option to utilize uv as either a frontend, a backend, or both, while still maintaining the single static dependency.

Developers electing to use uv as both frontend and backend could have access to some optimizations that might not be possible individually.

We could call this use case “full stack uv” since that’s usually what we call frontend + backend, right? 🤪

Full stack uv without a specified version

requires = ["uv"]
build-backend = "uv"

Backend uv wouldn’t need to be downloaded since uv can just copy itself into the isolated build environment.

Or, uv could special-case itself as a backend and do something even more optimized. This jibes with PEP 517:

We do not require that any particular “virtual environment” mechanism be used; a build frontend might use virtualenv, or venv, or no special mechanism at all.

Full stack uv with pinned or maximum version

requires = ["uv==0.5"]
build-backend = "uv"

PEP 517 build isolation can guarantee that frontend uv and backend uv of different versions do not conflict.

A build frontend SHOULD, by default, create an isolated environment for each build, containing only the standard library and any explicitly requested build-dependencies

However, a decision could be made to maintain backward compatibility for the backend, such that newer versions of uv could satisfy whichever version is declared.

On PEP 517 compliance

The backend feature could be PEP 517 compliant so that other frontends (like poetry, pdm, pip, hatch, etc.) can use uv as a backend, but this could be distinct from full stack uv.

A Python library will be provided which frontends can use to easily call hooks this way.

uv would need to expose some hooks in the build environment via a Python API. The mandatory and optional hooks at the time of this writing are:

# Mandatory hooks
def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    ...

def build_sdist(sdist_directory, config_settings=None):
    ...

# Optional hooks
def get_requires_for_build_wheel(config_settings=None):
    ...

def prepare_metadata_for_build_wheel(metadata_directory, config_settings=None):
    ...

def get_requires_for_build_sdist(config_settings=None):
    ...

Perhaps a highly optimized, importable module or package named uv could be autogenerated by the uv binary at build time to satisfy these requirements? PEP 517 says it could even be cached:

The backend may store intermediate artifacts in cache locations or temporary directories.
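As a rough sketch of what such a generated shim could look like, here is a minimal PEP 517 module that simply delegates to the uv executable. The build-wheel/build-sdist subcommands below are hypothetical placeholders, not part of uv's actual CLI:

# Hypothetical PEP 517 shim delegating to the native binary.
# The subcommand names passed to the binary are illustrative placeholders only.
import subprocess

def _delegate(args):
    # Run the (assumed) binary and return the basename it prints,
    # since PEP 517 hooks must return the name of the produced artifact.
    result = subprocess.run(["uv", *args], check=True, capture_output=True, text=True)
    return result.stdout.strip()

def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    return _delegate(["build-wheel", "--out-dir", wheel_directory])

def build_sdist(sdist_directory, config_settings=None):
    return _delegate(["build-sdist", "--out-dir", sdist_directory])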

@hauntsaninja (Contributor) commented Oct 3, 2024

If it was up to me, and assuming the build backend isn't too complicated, I'd consider doing both:

  1. Have a uv-build package that provides a build backend
  2. The uv frontend detects this backend and special cases the hell out of it to avoid PEP 517/660 overhead.

uv-build could currently basically just depend on uv, but a separate package gives Astral the option of adding a pure-Python implementation of the build backend or shipping a slimmer build, since build dependencies that themselves have build dependencies are a finicky business.
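A project opting into that split might then declare something like the following; the distribution name uv-build and backend module uv_build here just follow the suggestion above and are not a published API:

[build-system]
# Hypothetical names per the suggestion in this comment, not a released package.
requires = ["uv-build"]
build-backend = "uv_build"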

@ofek (Contributor) commented Oct 3, 2024

Hatch does (2) with Hatchling.

@pawamoy commented Dec 22, 2024

Yes, thanks @nathanscain, I guess I'm just a bit surprised 🙂 From what I understood reading the standard, that was not the intended use case. Unless any other build frontend can also send these build options to the flit backend, these are not build options; it's just backend logic hardcoded in the frontend.

Worth noting that there are valid reasons to want to limit files that reach the sdist and not just shove everything in for the backend to sort through.

Of course, but that should be specified as backend options, not front-end ones 😄

@pawamoy commented Dec 22, 2024

I checked flit's backend source code, and it doesn't use the received config_settings. The frontend just subclasses the SdistBuilder class to enhance it. Compare to PDM's backend, which actually supports build options that can be passed from pyproject-build. OK, learned enough for today, sorry for the spam 😄

@nathanscain:

Oh wow... that is unfortunate...

@eli-schwartz:

Case in point: flit will add files tracked by your VCS to the sdist by default... but only when you use their frontend, flit build. A contributor building without the VCS info present, or with a different frontend, will have any such files excluded from their sdist. Downstream users (distro packagers, Gentoo users, et al.) must use the sdist built by the flit frontend to get the files - they can't substitute their own frontend, build from the source code stripped of VCS info, and expect to build the same package (even if pulling in the same backend).

@nathanscain

sdists do not matter for building the project at all. In fact, when using the GitHub-generated archive tarball of a project instead of the sdist, you can build with any frontend plus flit_core and get the same files in the wheel and installed into your Python environment.

This is already the case today, since many projects using hatchling or poetry don't include important files needed to run test suites in their sdists - they explicitly exclude = those files - so building from GitHub archives is the only option. Sad but true. Flit hasn't meaningfully moved the needle here.

Again, it doesn't matter whether you use flit build or pyproject-build, from sdists or GitHub archives or what have you, if all you want is to build a wheel. That is actually kind of the point of the flit decision - the author of flit believes it was wrong for the Python ecosystem to ever specify PEP 517 hooks for build_sdist, because building sdists is "the job of a maintainer, not the job of someone who is simply installing the project". It's definitely a... decision. The correct decision would have been to simply not implement build_sdist at all, so that you'd have to invoke flit build to produce an sdist, rather than having two ways of producing one.

flit's backend hook for producing an sdist produces an sdist that is useless for a number of purposes that people use sdists for, and can only be used for building a wheel. That's fine if what you want is to build a wheel.

Note also that flit doesn't support dynamic versions from a build stage, such as VCS info. For that you need https://pypi.org/project/flit-scm/ (a separate build backend that first runs setuptools-scm to write VCS info to a file and then reuses flit_core to get that version info from the __version__ attribute of your package).

My point was that it is worth noting that the frontend sending configuration to the backend is part of the standard, and it likely isn't going anywhere. If building from the repo and not the sdist, you either need to configure your frontend to send the same info or use that project's frontend.

The standard only allows sending user configuration to the backend, in the form of pyproject-build's --config-setting / -C option.

I checked flit's backend source code, and it doesn't use the received config_settings. The frontend just subclasses the SdistBuilder class to enhance it. Compare to PDM's backend, which actually supports build options that can be passed from pyproject-build. OK learned enough for today, sorry for the spam 😄

Because flit doesn't support build options.

User configuration in the form of build options is a direct analogue to GNU Autotools / meson / cmake support for things like --with-blas={openblas|netlib|MKL|blis}, or --prefix=/usr, or --gallium-drivers=kmsro,radeonsi,r300,r600,nouveau,freedreno,swrast,v3d,vc4,etnaviv,tegra,i915,svga,virgl,panfrost,iris,lima,zink,d3d12,asahi,crocus,softpipe,llvmpipe

Flit doesn't support dynamism or compiled C extensions so config_settings is irrelevant to it.
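To illustrate the plumbing under discussion: with a frontend such as pyproject-build, each -C key=value pair arrives in the backend as an entry in the config_settings dict, and a backend that chooses to support options reads them there. A minimal sketch, where the include-vcs-tracked key is made up for illustration and is not a real flit or PDM option:

# Invoked by a frontend, e.g.: python -m build --wheel -Cinclude-vcs-tracked=true
# The "include-vcs-tracked" key is a made-up example, not an existing option.
def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    settings = config_settings or {}
    # Values arrive as strings (or lists of strings when a key is repeated).
    include_vcs = settings.get("include-vcs-tracked") == "true"
    # ... select files accordingly, write the wheel, return its basename ...
    ...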

@eli-schwartz:

I didn't suggest it was hard; I said it could (and would) lead to discrepancies, and that users like myself would be uncomfortable with uv co-opting another tool.

Design choices are made, platforms have quirks, flit calls the Python standard library which uv can't, and there are different implementations of the standard library on certain platforms.

I don't really understand the purpose of this FUD, sorry. Can you please point me to where you think it's philosophically possible to get different results from porting flit_core into Rust code, "because it calls the Python standard library, which yields differences"? I'm not sure what you are getting at. Are you, I dunno, afraid that Python's filesystem tree-walking algorithm for iterating over all files in src/ is going to yield a different set of installable files compared to a Rust-based filesystem tree-walking algorithm for iterating over all files in src/?

There's no algorithm to copy exactly, because there's no standard to follow.

That is not the meaning of the word "algorithm", and that is not the meaning of the word "standard", so it seems we are at an impasse but not the impasse that I expected.

@pawamoy commented Dec 22, 2024

Flit doesn't support dynamism or compiled C extensions so config_settings is irrelevant to it.

The point was that instead of letting users pass something like --include-vcs-tracked to the backend through config_settings (via whatever flit CLI option), flit's frontend extends the backend, tying them together and not allowing other frontends to pass the same options.

EDIT: well, flit build only supports flit's backend anyway, acting as an extended backend rather than a frontend. Just like Poetry only supports its own backend, I think. PDM did the right thing IMO and properly separated itself from its backend: you can use pdm build with any PEP 517 backend.

@notatallshaw (Collaborator) commented Dec 22, 2024

I'm not sure what you are getting at

Trying to identically copy a tool is bad. It leads to ad hoc implied standards, bugs, and discrepancies.

There are a myriad of reasons, but I'm firmly of the opinion that the Python packaging system wasn't able to evolve until we were able to move past "do whatever setuptools and pip do". I would be sad for us to move to "do whatever flit does".

That is not the meaning of the word "algorithm", and that is not the meaning of the word "standard", so it seems we are at an impasse but not the impasse that I expected.

I'm talking about a standard in the context of the Python packaging ecosystem, which means an accepted PEP: all willing and interested parties coming together to agree on how something should be defined for implementation. As we're talking in the context of Python packaging, I would suggest we all use the term this way; anything else is just going to add confusion.

By algorithm I mean something that you could write out confidently in pseudocode. I think this would be non-trivial for flit, not least because of its giant regexes, which depend on the Python regex engine, as well as, yes, its various IO interfaces, which depend on Python implementations. I agree there are other, broader usages and definitions of the word.

@eli-schwartz:

Trying to identically copy a tool is bad. It leads to ad hoc implied standards, bugs, and discrepancies.

There are a myriad of reasons, but I'm firmly of the opinion that the Python packaging system wasn't able to evolve until we were able to move past "do whatever setuptools and pip do". I would be sad for us to move to "do whatever flit does".

Genuinely baffled.

The reason why the Python packaging system wasn't "able to evolve" except by moving past setuptools is that setuptools does not document what it does or how it works, and it supports vast numbers of moving parts. But primarily because it exposes an extensive (and, again, effectively undocumented) Python API which people code against.

This is a design problem but it's not one that a generic axiom can be derived from. There is no logical reason why another build backend has to leave the majority of its own features undocumented.

(Regarding pip, lots of people moved past "do whatever pip does" way back in 2000, and in some cases earlier. Conda, a latecomer to the party, moved past it in 2012. It doesn't actually matter either way - pip has never constrained the packaging system, it's only constrained workflow UX, which I'm not all that interested in and which isn't really the topic of this issue.)

I'm talking about a standard in the context of the Python packaging ecosystem which means an accepted PEP, it is all willing and interested parties coming together to agree how something should be defined for implementation. As we're talking in the context of Python packaging I would suggest we should all be using this term this way, anything else is just going to add confusion.

I object to your definition of the word standard in the context of the Python packaging ecosystem!

A standard, in the context of the Python packaging ecosystem, is when multiple ways to do the same generic concept exist and a document (referred to by the term "standard") says how tools should expose an interoperability interface for other tools to use.

Fortuitously, that's not specific to the Python packaging ecosystem -- that's also how the term "standard" is used in other fields of endeavor -- the Python packaging ecosystem part is that standards are proposed via the PEP process.

Standards are unrelated to whether a specific program documents its own behavior with enough fidelity to port it to another programming language as necessary. That's not an interoperability topic.

By algorithm I mean something that you could write out confidently in pseudocode. I think this would be non-trivial for flit, not least because of its giant regexes, which depend on the Python regex engine, as well as, yes, its various IO interfaces, which depend on Python implementations. I agree there are other, broader usages and definitions of the word.

I'm only aware of two places in which flit_core meaningfully utilizes regular expressions, and both of them are PEPs that have been transcribed from pseudocode into Python's re module, so there is no requirement to depend on the Python regex engine.

The tool.flit section doesn't support passing user-defined regular expressions as far as I can tell, but I'd love to hear more if you're aware of something I missed.

Regarding IO interfaces which depend on Python implementations, I simply don't believe you at all, and if what you claimed were true then it would be impossible to write software in Python at all[1], because you could never be sure it behaved according to https://docs.python.org, and you would encounter impossible filesystem IO errors that defy your operating system kernel's documentation.

If that's not what you mean then please do explain what you mean when you say that it is not practical for an individual to compare behavior of a python program's usage of IO apis and a rust / C++ program's usage of IO apis and write the same program in multiple programming languages.

Footnotes

  1. this would make me very sad as I like writing software in python

@notatallshaw (Collaborator):

Genuinely baffled.

Yeah, it seems we don't have enough shared language to communicate on this one, so I'll leave this as my final post in this particular thread; if you're still baffled, I don't think me replying any further will help, sorry.

The reason why the Python packaging system wasn't "able to evolve" except by moving past setuptools is because setuptools does not document what it does or how it works and it supports vast numbers of moving parts. But primarily because it exposes an extensive (and again, effectively undocumented) python API which people code against.

Okay, but the way that was "fixed" is by defining standards as PEPs, and getting pip and setuptools to follow them (or close enough), so that other tools could successfully interoperate with them.

Regarding pip, lots of people moved past "do whatever pip does", way back in 2000 and in some cases earlier. Conda, a latecomer to the party, moved past it in 2012. It doesn't actually matter either way -- pip has never constrained the packaging system, it's only constrained workflow UX which I'm not all that interested in and isn't the topic of this issue really

I think your timeline is off - maybe you're thinking of easy_install? Pip was first released in 2008.

Conda is an excellent example of why packaging standards were needed here, for both frontend and backend. Conda is not part of the Python packaging ecosystem; it is a meta-packager that has to keep its own metadata separate from the Python packaging data. In a huge number of cases, the build step of a conda package that packages a Python package is to call pip.

If uv had to do what conda did we wouldn't be talking about front ends or back ends or interoperability. I see this as a great success of the Python packaging standards (PEPs and their implementations).

I object to your definition of the word standard in the context of the Python packaging ecosystem!

I'm a bit confused why because you go on to describe the PEP process:

A standard, in the context of the Python packaging ecosystem, is when multiple ways to do the same generic concept exist and a document (referred to by the term "standard") says how tools should expose an interoperability interface for other tools to use.

This is the PEP process, and the standards documents that a PEP contributes to are at: https://packaging.python.org/en/latest/specifications/

If that's not what you mean then please do explain what you mean when you say that it is not practical for an individual to compare behavior of a python program's usage of IO apis and a rust / C++ program's usage of IO apis and write the same program in multiple programming languages.

That's not what I said or what I meant, but I genuinely don't think it's worth hashing over between the two of us.

It only matters whether Astral think that 1) creating a flit fast path meets their design goals, and 2) they agree with you that it's doable to copy flit's behavior so exactly that it will not cause issues (at least for them).

Though I suspect the answer to 1) is no, making hashing over 2) doubly pointless! But we'll see.

@potiuk commented Dec 22, 2024

If anything, the length of this thread indicates how important and how complex the problem is - which should, I hope, give the Astral team a bit of pause and prompt a lot of consideration of what they are going to do.

@potiuk commented Dec 22, 2024

And since the holidays are coming, I hope we can all stay with our families over the next few days and take a bit of a pause from all GitHub-related stuff.

Happy Holidays, everyone!

@eli-schwartz:

I think your timeline is off, maybe you're thinking of easy install? Pip was first released in 2008.

My timeline (2000 as the date when people moved past the need for tools first released in 2008) was highly intentional and could be seen as a wry comment about pip itself. Remember that pip is merely a workflow tool, and many people never adopted it in the first place.

It's simply a bad comparison to build backends.

Conda is not part of the Python packaging ecosystem, it is a meta packager that has to keep its own metadata separate from the Python packaging data.

None of that metadata is defined by pip-centered standards, and conda also distributes the metadata from build backends, just like pip does. The additional metadata that conda tracks but pip doesn't is metadata required for packaging and distributing non-Python software, which naturally doesn't participate in localized Python packaging ecosystem standards and never will (conda installs pure-Python wheels the same way pip does, and pip doesn't install GCC or Boost or OpenBLAS at all, but conda does).

The proposed fixes for this discrepancy between conda and pip have been about enhancing metadata for build backends, not build frontends, because it's not a frontend problem!

But this highlights something that bothers me about the Python community as a whole: people discover that software is hard and that ISVs actually serve a useful role, and they respond in fear by refusing to talk about or consider making software a better place except where mandated by a PEP standard. Case in point: "no one should reimplement flit_core in Rust unless flit_core is described in a PEP standard. Having flit_core documentation is not good enough because it's not official unless it's a PEP; you can't know how to implement flit_core unless it's a PEP".

Which is a pretty fearful way of looking at the world, I have to say.

I'm a bit confused why because you go on to describe the PEP process:

My apologies for the confusion. In the latter two paragraphs I implied that I felt I needed to describe the PEP process because you were not describing the PEP process.

That's not what I said or what I meant, but I genuinely don't think it's worth hashing over between the two of us.

Okay, magic smoke and pixie dust it is.

You originally said "platforms have quirks, flit calls the Python standard library which uv can't, and there are different implementations of the standard library on certain platforms" as a reason why my suggestion to port a program from Python to Rust is impractical (not to create a competing implementation or a program using the same configuration, but to port the original program).

This analysis worries me because I have a use case (not related to uv) for implementing flit_core according to its documentation (but not in rust specifically) and it sounds like you would call the results buggy due to an intrinsic unreliability problem of the python programming language (but I don't see the bug).

@notatallshaw (Collaborator):

This analysis worries me because I have a use case (not related to uv) for implementing flit_core according to its documentation (but not in rust specifically) and it sounds like you would call the results buggy due to an intrinsic unreliability problem of the python programming language (but I don't see the bug).

I would be happy to discuss it on the relevant repo in detail; I believe enough detail has been discussed here, given that no one from Astral has participated in a while, especially regarding their design goals.

It could work quite well for your needs, which I think are likely different from uv's ("just works" for millions of users, configurations, and platforms, impacting a significant % of the packaging ecosystem).

Bluntly said, though: I don't appreciate the "magic smoke", FUD, and other subjective negative meta-comments, and I won't engage with you in any context if they continue. I appreciate that you're frustrated I'm not willing to be dragged into details and am sticking to high-level examples and points, but I don't think it's worth writing technical papers here; Astral know their own experience implementing Python packaging tooling in Rust better than I do. Maybe I'm 100% wrong. We're all here in good faith trying to help the Python packaging ecosystem with the best experience we can, and I think we've all got relevant experience here.

@charliermarsh (Member) commented Dec 22, 2024

I appreciate all the discussion and we will of course take these concerns into account as we figure out our exact plans here.

For big questions and projects like this, we tend to take our time in working through our answers, since we're a small team (with some folks out for the holidays in this case) and we want to be able to think deeply (over reacting quickly) about anything with a long-term impact and complex design considerations. So, please bear with us.

In the meantime, I'd just ask that we curtail any further conversation around the pros and cons. My impression is that everyone has had a chance to speak their mind (and should feel heard - we'll read it all), and I'd prefer to end with productive disagreement rather than any other negative sentiment.

Happy Holidays!

@T-256 (Contributor) commented Dec 24, 2024

To make both sides of this thread happy, I want to point to @hauntsaninja's comment above.

In addition, the uv-build package could be pure Python, small, zero-dependency, and a universal build backend that other frontends could profit from. Internally, uv as a frontend must then treat uv-build the same as the uv build backend.

Definitely, that would have some maintainability costs:

  • sync features with uv internals
  • sync version bumps
  • minimum supported python version

Current vs Future implementation

I think the current implementation path (which is still in preview) is correct: it provides an opt-in build backend (correctly not set as the default) that forces every build frontend to use uv's binary.
With that said, in the future we must have both uv and uv-build as build-backend packages. IMHO, that covers all needs:

Building projects with uv or uv-build as backend and uv as frontend:

uv is able to apply its special features, like direct builds (--no-force-pep517), to improve performance.

Building projects with uv-build as backend with other foreign frontends:

The uv-build package can be pure Python, small, zero-dependency, and universal, so every build frontend can be performant with it.

Building projects with uv as backend with other foreign frontends:

Most frontends use PEP 517, so they would need to download the proper uv wheel or, in the worst case, build uv from source.
In the best case (where the frontend loads the uv wheel from its cache), the build process may be more performant (?) than with uv-build as a pure-Python backend.

Conclusion

As we can see, uv-build could satisfy both uv as a build frontend and other build frontends, while uv as a build backend would still be there to do the same job as uv-build for uv as a frontend, and to require other frontends to use uv's native binary to build the project. Since I see the latter as a rare case, I'd recommend making uv-build the default backend for the uv init command.
The general roadmap here would be:

  1. Stabilize the uv build backend and expose it as a non-default option.
  2. Implement a pure-Python uv-build package with the same behavior as the uv build backend.
  3. uv should then internally treat the uv-build backend the same as the uv build backend; all the infrastructure would already be there from the stabilization step, so the internal change would be something like:
    - if project.build_backend == "uv" {
    + if project.build_backend == "uv" || project.build_backend == "uv-build" {

Edit: Only one backend

As I also mentioned above, and per @pawamoy's comment here: since the uv build backend would rarely be used with other (foreign) build frontends, we can also consider this roadmap:

  1. Implement a pure-Python uv-build package with the same behavior as the current uv build backend.
  2. Internally, in the backend section, rename uv to uv-build.
  3. Remove build backend support from the uv Python package.
  4. Stabilize uv-build and expose it as the default build-backend option for the uv init command.

@pawamoy commented Dec 25, 2024

Interesting. So, like what Flit is doing, actually, but without the mistake (IMO) of extending the backend's features in the frontend. Also what Hatch is doing, based on a comment from @ofek earlier.

I wouldn't recommend having two build backends, though - only one, which is pure Python. Then uv, as a frontend, can detect it and hijack it with a Rust implementation. Other frontends would work as usual with the pure-Python backend.

@nathanscain:

The Astral team requested that we not keep going back and forth with pros/cons, etc., until they have time to review the current discussion and talk it out internally.

As Charlie said, everyone seems to have said their piece and has been heard. Have a Merry Christmas everyone 🎄

@fortminors commented Jan 30, 2025

@charliermarsh Is there any news on this? Is it being discussed only internally, or is there some other location where the discussion is taking place?

@zanieb (Member) commented Jan 31, 2025

We're discussing this internally. We'll chime in here eventually.

@zanieb (Member) commented Feb 13, 2025

We've been thinking about all the concerns noted here and have some thoughts to share.

As a TL;DR:

  1. We will be releasing a build backend written in Rust but…
  2. We’ll ship it as a separate, minimal package to improve its distributability and flexibility.

I’ll talk through our decision, starting with some background on our motivations.

When we first launched uv's project management capabilities, we tried using a third-party backend by default. The feedback from users about this experience was very negative: there was confusion that errors were coming from another tool, that they needed to read its documentation to understand what was happening, and that they often had to add configuration for it to work with their project. In response to this feedback, we decided not to use a build system in uv init without opt-in (ref). However, we see this as a significant disservice to Python users — using a build system solves real problems with the Python project experience. We still see quite a bit of confusion from users about build backends. Python beginners in particular would be well served by a build backend, but needing to learn another tool to get started adds a lot of complexity. We want to give users a simple and intuitive experience with pure Python projects, but that experience is significantly complicated by the lack of an integrated build backend.

We cannot deliver the user experience we want with existing build backends — we need better error messages, better defaults, and more robust handling of edge cases. Due to the intentionally isolated nature of a build backend, there is context about the user's intent that cannot be captured without integration with the build frontend. For example, an integrated backend lets us understand why the frontend is building the package or why the backend failed so we can provide error messages that guide people to a successful build. We have devoted a lot of resources to improving build error messages already by parsing the output of a build backend to provide hints to users, but that’s brittle.

While it’s a major benefit that we can special case our own backend (as done in Poetry and Hatch), the build backend will be usable with a different frontend than uv and all of the other build backends will continue to be supported by uv’s frontend. We’ve always focused heavily on standards compliance and will continue to do so.

Shifting gears, there are a few alternative suggestions in the discussion:

  1. Implement the backend in both Python and Rust
  2. Implement an existing backend and override use of it (ref)
  3. Don't write a build backend

I'll respond to these next.

With uv, we've invested heavily in a solid foundation for Python packaging in Rust. We have a lot of code and abstractions, with a focus on correctness and performance. It's not feasible for us to maintain duplicate implementations in Python, as in (1). We think this is infeasible both due to time constraints and, as mentioned by some, foundational differences in behavior between the Rust and Python standard libraries — subtly different behavior here would be a disaster. We want to keep our rate of development high and deliver features to users — we can't do that while maintaining multiple versions of our core abstractions.

For similar reasons, we think that (2) is infeasible. Even if we accept the premise that it is feasible, it's against our product principles to shadow another tool implicitly and re-use their configuration.

Additionally, our goal is not just to "make a more performant build backend". As some mentioned, the overhead of a build backend is generally minimal (though noticeable when using uv). We want to innovate on the experience of building Python packages because it is a core part of working on Python projects. The feedback from our users (in the form of questions and bug reports in our issue tracker) has made it clear that improving integration with a build backend will be very high impact.

At this point, we’re at “build a backend in Rust” or “don’t do it at all” (3) — there are a few recurring concerns that support the latter conclusion. We’re taking these concerns seriously, but want to balance them with the value we see for users. I'll do my best to address them here, but I'll admit that packaging is a very hard problem and I do not have good answers to all of the concerns raised.

  1. The Rust toolchain is not as portable as Python (ref, ref, ref)

    Yes, Python has broader platform support than Rust. However, uv is a new project and widespread adoption is going to take time. In the short-term, it seems unlikely that uv adoption will surpass Rust platform support. In the long-term, we expect Rust platform support to continue to grow.

    Additionally, Rust is already a critical component of the Python ecosystem. For example, cryptography and pydantic (both in the top 25 most downloaded packages on PyPI - ref) require Rust and use a Rust build backend (maturin). They are widely used and shipped in all major Linux distributions (e.g., ref). There is a concern that uv is different than these packages, which we will discuss next.

  2. uv is less portable than the average Rust application or library (ref)

    This concern can be split into two components: (1) the minimum supported Rust version, and (2) the number of dependencies. We will attempt to address both. (1) We rarely need to update to the latest Rust version. We often do so because it addresses tangible problems in our codebase, but it's rare that they're user-facing. If this becomes a problem, we can likely hold off on updating. (2) We will split the build backend into a separate package and remove all unnecessary dependencies - e.g., the build backend doesn't need a network stack. Unlike in the main package, we will focus intentionally on simplifying redistribution.

  3. The uv binary is large (ref)

    The uv wheel is ~15MB. In contrast, the setuptools wheel is 1.2MB. We agree it would be a waste of resources to distribute a large binary without strong justification. A brief experiment showed a wheel for a uv build backend package with minimal dependencies (as described above) was also 1.2MB. This is further justification for separating the build backend into its own package.
    For completeness, the flit wheel is ~50kB. We hope to reduce the size of our binary further with more investment.

  4. Downstream packagers build things from source (ref, ref)

    We hope the practical concerns here are largely addressed in our responses to (1) and (2), though we understand building from source is complicated by an additional toolchain. We’re lucky to have engagement from downstream package maintainers already, I’m hoping we continue to get feedback from them to guide this project.

    (As a bit of an aside, it's interesting to hear a primary motivation for using source distributions is access to the test suite. It'd be great to solve running test suites against wheels without requiring a build from source.)

  5. A change to one package's build system can affect the build of many others (ref)

    We hope that maintainers of critical dependencies in the Python ecosystem will be considerate with regards to their dependencies and build system. We agree that defaults matter, but we think they have less effect as a package is used by more people and the maintainer's experience grows. We empathize with the point here, but we want to focus on improving the development experience for the majority of Python users.

With that, I want to take a second to thank everyone for their feedback here. It’s helped us prioritize some changes (like a separate package) that we may not have otherwise.

Going forward, I'd like to focus this issue on the behavior of the build backend within this framing instead of debating if it should be done or not.

@eli-schwartz:

Downstream packagers build things from source (ref, ref)

We hope the practical concerns here are largely addressed in our responses to (1) and (2), though we understand building from source is complicated by an additional toolchain. We’re lucky to have engagement from downstream package maintainers already, I’m hoping we continue to get feedback from them to guide this project.

(2) should help a lot, I think. In particular it seems reasonable to assume that the MSRV of a dedicated build-backend crate can be a lot lower than the MSRV of uv as a whole.

In particular, if you can avoid entirely the use of problem crates such as ring.

That being said, the arguments with regard to (1) aren't as compelling as they could be. There are CPU architectures where Rust isn't available at all, and therefore cryptography isn't available either. I would say pydantic isn't available either, except that really, pydantic can be commonly downloaded all it wants, but it is hardly core infrastructure. Cryptography isn't universally needed; there's absolutely loads of stuff Gentoo users commonly install that is written in Python without ever hitting a dependency tree that leads back to cryptography... but there are definitely some packages that are uninstallable for Gentoo users on less common architectures solely because of cryptography.

Build backends are a much bigger influence here than something like pydantic or cryptography, because they could potentially influence "core infrastructure" packages that are written in pure Python. For something verging on a worst-case example: what if trove-classifiers ported to the uv build backend because they heard it was so popular nowadays - and also it's basically a data package, so what's the big deal, and who really needs it except other package developers? Except setuptools and flit don't care, but hatchling depends on it, so now a massive number of packages can't be built on non-Rust platforms.

There are other packages that are common runtime dependencies of other software which could be a huge problem as well, e.g. attrs, certifi, appdirs, xdg... So we end up in a situation where people are told about a new integrated backend that is just like all the others but has better error messages when misconfigured, and they think "sure, I use it for development anyway, might as well use this too" - and then break half the world. And it turns out that you can't use the uv backend if you're "important enough", for some nebulous definition of important, because a single package not supporting a platform breaks everything that depends on it.

Having a pure-Python version would at least prevent this invisible line, across which package maintainers have to magically know that it's socially bad to use a specific backend.

(As a bit of an aside, it's interesting to hear a primary motivation for using source distributions is access to the test suite. It'd be great to solve running test suites against wheels without requiring a build from source.)

I'm not sure it's conceptually solvable. Some projects install their test suite inside the wheel; that's about all you can do. But most projects consider it bad to install tests as part of the wheel, as it penalizes pip install users.

The handful that install their test suite tend to argue that they are the exceptional case, and that end users creating a virtualenv should then go ahead and import and run the test suite to make sure everything actually works... because they're pretty confident there are cases where it doesn't. It's a common concern in the science ecosystem, for example (delicate interactions between various moving parts can do that for you 🤷).

We may also have to first solve the problem where large subsets of the Python packaging ecosystem object to including tests even in the sdist, as they regard sdists as purely for pip to build wheels and see tests as dead weight. That includes going around to other projects and advocating for not including tests in the sdist! :(

One possibility is to download both the wheel and the sdist, then install the wheel and test it using the sdist's copy of the test suite. But it's not immediately clear what the advantages are over just building from the sdist using a build backend. It would add great complexity to the process, doesn't solve the occasional need to backport a patch that fixes crashes, can't be used for anything that has compiled extensions, and doubles or triples the download size... It's something I've thought about before, but with the association "worst-case scenario, we might be able to do this to get out of a very sticky situation".

@eli-schwartz:

I would say pydantic isn't available either, except that really, pydantic can be commonly downloaded all it wants but it is hardly core infrastructure.

I should take this back. Sigstore appears to rely on it, which means it is extremely problematic to validate cryptographic signatures for CPython starting with 3.14. Oh well, we can just assume it was verified on an amd64 developer machine and disable signature verification - and the Gentoo PGP signature on the checksum manifest for CPython will prove that it was checked by someone else. Not great, but not the end of the world.

Still, I do admit I was wrong. Pydantic is used for more than I thought...
