Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pip silently fails to install extras and returns error code #7122

Open
ssbarnea opened this issue Sep 30, 2019 · 52 comments
Open

pip silently fails to install extras and returns error code #7122

ssbarnea opened this issue Sep 30, 2019 · 52 comments
Labels
C: extras Handling optional dependencies kind: backwards incompatible Would be backward incompatible state: awaiting PR Feature discussed, PR is needed type: enhancement Improvements to functionality

Comments

@ssbarnea
Copy link
Contributor

ssbarnea commented Sep 30, 2019

It seems that pip just returns success code on anything related to use of extras, commands like: pip install -e .[non_exiting_extra] reports success.

Even worse, with existing extras that may fail to install, the final result code is still a success, when in fact pip failed to install dependencies.

This is a critical bug because it directly affect CI usage where we can no longer trust result code from pip, bugs may go in unnoticed.

@pradyunsg
Copy link
Member

Not sure what you're asking for. Could you provide a clear reproducer of this issue?

@pradyunsg pradyunsg added the state: needs reproducer Need to reproduce issue label Oct 1, 2019
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label Oct 1, 2019
@ssbarnea
Copy link
Contributor Author

ssbarnea commented Oct 1, 2019

@pradyunsg Not sure if a more than the first sentence is needed to reproduce pip failure on ANY project, even one without any extras.

Pip should never return success (0) if an extra is missing or did not succeeded installing. Current behavior is to fully ignore any of these and to return success.

@chrahunt
Copy link
Member

chrahunt commented Oct 1, 2019

Not fully ignore - I believe we trace a warning.

Returning non-zero would be backwards incompatible. I would be +1 on doing it (even aborting installation if we are not able to find an extra), but it would need to be changed over at least 1-2 releases. The first step would be to add to the existing warning to say that it's unsupported and will fail in a future release. We'd also want to go through pypi and see how many packages make reference to nonexistent extras.

@ssbarnea
Copy link
Contributor Author

ssbarnea commented Oct 1, 2019

@chrahunt I am glad to hear that you agree on what should be the desired behavior. Now we only need to agree on a migration path that does not alienate users.

Maybe if we have a --strict switch (PIP_STRICT=1) in pip we could make it easier to to migrate, especially as we can add this to CI systems.

Initially enabling new behavior when is defined and in the future to make it default. The same could happen for other functionalities which are kept only for backwards compatibility.

Regarding current behavior: I think that is visible when it fails to install dependencies from existing extras but when an extra does not exist at all, is totally transparent, I did not see any warning (maybe is only a debug message).

@chrahunt
Copy link
Member

chrahunt commented Oct 1, 2019

If you could provide a reproduction that shows that behavior it would help, I could not reproduce it locally:

#!/bin/sh
cd $(mktemp -d)
echo "from setuptools import setup; setup(name='example')" > setup.py
python -m venv v
./v/bin/python -m pip install --upgrade pip
./v/bin/pip install .[nonexistent_extra]

results in

Collecting pip
  Using cached https://files.pythonhosted.org/packages/30/db/9e38760b32e3e7f40cce46dd5fb107b8c73840df38f0046d8e6514e675a1/pip-19.2.3-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 18.1
    Uninstalling pip-18.1:
      Successfully uninstalled pip-18.1
Successfully installed pip-19.2.3
Processing /tmp/user/1000/tmp.YyteRFYSpH
  WARNING: example 0.0.0 does not provide the extra 'nonexistent_extra'
Installing collected packages: example
  Running setup.py install for example ... done
Successfully installed example-0.0.0

Note the WARNING: example 0.0.0 does not provide the extra 'nonexistent_extra'.

Regarding adding a configuration option, we need to keep #6221 in mind. Not that it can't be done that way, but there's probably a way to do it that wouldn't require a deprecation period of its own.

@xavfernandez
Copy link
Member

Note that this is the behavior of pip since version 6.1 (April 2015):

$ pip install pip==6.1.1
Collecting pip==6.1.1
  Downloading https://files.pythonhosted.org/packages/67/f0/ba0fb41dbdbfc4aa3e0c16b40269aca6b9e3d59cacdb646218aa2e9b1d2c/pip-6.1.1-py2.py3-none-any.whl (1.1MB)
    100% |################################| 1.1MB 531kB/s 
Installing collected packages: pip
  Found existing installation: pip 6.0.8
    Uninstalling pip-6.0.8:
      Successfully uninstalled pip-6.0.8

Successfully installed pip-6.1.1
$ pip install pep8[fake_extra]==1.7.0
Collecting pep8[fake_extra]==1.7.0
  Using cached https://files.pythonhosted.org/packages/8a/cb/7d0fdca7e03f997945fb1bd60a8ddfea5c51229b865c470b4f7a64619d20/pep8-1.7.0-py2.py3-none-any.whl
  pep8 1.7.0 does not provide the extra 'fake_extra'
Installing collected packages: pep8
  Found existing installation: pep8 1.7.1
    Uninstalling pep8-1.7.1:
      Successfully uninstalled pep8-1.7.1
Successfully installed pep8-1.7.0

So calling it a "critical bug" seems a far stretch ;)

Nonetheless it wasn't the behavior of pip 6 that crashed in such case:

$ pip install pip==6.0.8
Collecting pip==6.0.8
  Using cached https://files.pythonhosted.org/packages/63/65/55b71647adec1ad595bf0e5d76d028506dfc002df30c256f022ff7a660a5/pip-6.0.8-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 6.1.1
    Uninstalling pip-6.1.1:
      Successfully uninstalled pip-6.1.1
Successfully installed pip-6.0.8
$ pip install pep8[fake_extra]==1.7.1
Collecting pep8[fake_extra]==1.7.1
  Downloading https://files.pythonhosted.org/packages/42/3f/669429ce58de2c22d8d2c542752e137ec4b9885fff398d3eceb1a7f5acb4/pep8-1.7.1-py2.py3-none-any.whl (41kB)
    100% |################################| 45kB 3.9MB/s 
  Exception:
  Traceback (most recent call last):
    File "/home/xfernandez/.virtualenvs/tmp-fb8674c49b28b72/lib/python2.7/site-packages/pip/basecommand.py", line 232, in main
      status = self.run(options, args)
    File "/home/xfernandez/.virtualenvs/tmp-fb8674c49b28b72/lib/python2.7/site-packages/pip/commands/install.py", line 339, in run
      requirement_set.prepare_files(finder)
    File "/home/xfernandez/.virtualenvs/tmp-fb8674c49b28b72/lib/python2.7/site-packages/pip/req/req_set.py", line 436, in prepare_files
      req_to_install.extras):
    File "/home/xfernandez/.virtualenvs/tmp-fb8674c49b28b72/lib/python2.7/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2504, in requires
      "%s has no such extra feature %r" % (self, ext)
  UnknownExtra: pep8 1.7.1 has no such extra feature 'fake_extra'

According to #2138 and #2142 it might have been to make it consistent with setuptools behavior ?

But I'm on board to progressively deprecate this behavior and make it an error.

@pradyunsg
Copy link
Member

One concern I have is how exactly does pip pass extras down to dependencies -- if it does not, then I'm on board for making this behavior an error.

If we do pass down extras to dependencies (I'm not sure at this point), then we should stop doing that and that'd be required as a step before changing this behavior.

@pradyunsg pradyunsg added C: extras Handling optional dependencies kind: backwards incompatible Would be backward incompatible type: enhancement Improvements to functionality and removed state: needs reproducer Need to reproduce issue labels Oct 7, 2019
@chrahunt
Copy link
Member

chrahunt commented Oct 7, 2019

What do you mean by "pass extras down to dependencies"?

@pradyunsg
Copy link
Member

"pass extras down to dependencies"

Does installing A[extra] mean that pip will try to install B[extra] where A depends on B?

@pfmoore
Copy link
Member

pfmoore commented Oct 7, 2019

My immediate answer would be "no, obviously not". What makes you think it might do? (I'm not super familiar with extras, so I could well have missed something, but I don't recall ever having heard of anything like this).

@pradyunsg
Copy link
Member

Yea, it doesn't.

I seem to be mixing concepts. Nvm me. :)

@ofek
Copy link
Contributor

ofek commented Oct 17, 2020

I'm running into this as well. What's the status of adding a --strict flag?

@uranusjr
Copy link
Member

I don’t think anyone has done anything on it yet, likely since behavioural deprecation is quite messy to deal with and non of the active contributors care enough. I personally think it is perfectly fine for pip to silently drop non-existent extras.

One chance to introduce the breaking change without deprecation would be in the new resolver. We already handle extras slightly differently there, and this would offer a good gauge how many are actually affected by it.

@pradyunsg
Copy link
Member

Is there a good reason to keep the existing behavior? If not, I'm perfectly fine w/ deprecating it and making this an error. 🤷🏽

@pfmoore
Copy link
Member

pfmoore commented Oct 17, 2020

I definitely don't consider this to be a "critical bug", considering how long it's been around before anyone mentioned it. If there were any movement, it would be in the form of a PR, which would have been linked here - so no, there's been no action on this.

Someone will need to create a PR if they want to move this forward - it seems unlikely from the comments here that any of the pip developers are particularly interested in implementing it (I certainly won't be doing so). Such a PR would likely need some discussion around the deprecation process, but it seems pointless to get into that before there's a PR.

@pfmoore
Copy link
Member

pfmoore commented Oct 17, 2020

Is there a good reason to keep the existing behavior? If not, I'm perfectly fine w/ deprecating it and making this an error.

My impression is that most of us are fine with deprecating the current behaviour and changing it to an error, but no-one really cares enough to implement it. @chrahunt covered what's needed for deprecation here.

@uranusjr
Copy link
Member

uranusjr commented Oct 17, 2020

Is there a good reason to keep the existing behavior?

It’s more like there’s no convincing reason to change the existing behaviour for me. I first learnt about extras from setuptools (like most people, I persume), and having non-existent extras to only emit non-critical warning feels like install_requires + extras_require.get(extra_name, []). It makes as much sense as a crashing error (analogous to extras_require[extra_name] raising a KeyError).

@pradyunsg pradyunsg added the state: awaiting PR Feature discussed, PR is needed label Oct 17, 2020
@ofek
Copy link
Contributor

ofek commented Oct 17, 2020

One chance to introduce the breaking change without deprecation would be in the new resolver. We already handle extras slightly differently there, and this would offer a good gauge how many are actually affected by it.

That's a good idea!!!

Are there any strong objections?

@meejah
Copy link

meejah commented Apr 5, 2024

Does the existence of the "waiting PR" tag imply that there is consensus here and a PR would be accepted if it:

  • made this an error
  • (optional) listed the valid extras in such a case (e.g. "Extra 'foo' doesn't exist for 'package'. Valid extras are: bar, quux")

...or would such a PR just produce 5 more years of discussion?

@pfmoore
Copy link
Member

pfmoore commented Apr 6, 2024

Digs like this aren't particularly productive. If you would like to create a PR, please do so.

To answer your question, there's no consensus here yet, and there are some difficult questions that need to be answered, but the only way to really understand what the best approach would be is if someone creates a PR, so there's no real point simply discussing this further without a proposed solution.

One question you might want to consider: If package X version 1.0 has an extra foo, but version 2.0 does not have that extra because the functionality is now part of the main install, what should pip install X[foo] do? Currently we install 2.0 with no error, which is likely what the user wants. We could select 2.0 (because that's what the resolver algorithm decides) and then fail with an error, or we could modify the resolver algorithm (which probably has other implications and backward incompatibilities) to install 1.0 with the extra.

And a second question, based on your suggestion that we "list the valid extras". In a package that has 50 released versions (not impossible) with different extras across those versions (perfectly possible) do we download and check all of those versions, just to list the valid extras?

These are the sorts of things that writing a PR would make more obvious, and which need to be decided before anything can be accepted. Personally, I have no strong opinions on what the "right" behaviour is here, so my preference is the solution that has the least impact on pip's overall behaviour. But what that involves won't be obvious until there's a PR.

@meejah
Copy link

meejah commented Apr 6, 2024

Sorry for the snark.

I have tried to start such a PR a couple times (whenever I get extra-bitten by this) but it's pretty de-motivating to consider it simply sitting here, and I don't really know what the tag means. I'd rather write a PR that has the functionality actually desired by the project. I do like the slogan "rough consensus and running code" but I'm not sure that "rough consensus" has been achieved here..?

My answer to the above questions would be: let the resolver resolve whatever it believes is the right thing, and then it's an error if that extra doesn't exist. The "right" list of extras is the list that exists in whatever version the resolver found. (If you wanted an older version, you'd ask for an older version ... right?)

I actually wouldn't want the behavior you describe: if I say "install X with extra foo" and for whatever reason the version found is 24.4.0, and there is no extra "foo" in that version then I want an error. So I wouldn't want to "succeed anyway" without the extra. I suppose it could be the case that some users might want the "older version that does have the extra" .. I don't think I would? (Even if you do, "explicit is better than implicit" right? So you have to ask by saying "install X version 19.1.0 with extra foo" instead.)

Maybe this indicates that a more philosophical description of "what even is an extra" needs to exist?

Your answers above seem to indicate that an extra is "some bonus functionality that we can more-or-less ignore". My experience is that packages do not follow this; it is often very critical to the correct functioning. For example, switching between "use asyncio" or "use twisted" (but often more than just twisted is needed). So if I'm trying to make a Twisted-using downstream thing work, ignoring the extra will mean my thing doesn't work.

Some of the 5-year-old comments are about CI: if I ask for "foo with extra dev" it's probably because I want all those tools installed to my CI runs. So again, I want an error. Maybe the package re-named the extra at some point from "dev" to "development" at some point -- again I feel the right behavior is "error" (so I can fix my extra string).

I guess my answer is that it's closer to being "part of the package name" and/or "part of the version" but I don't have any deep understanding of the resolver etc.

I am definitely not in a position to know what is "the least impact on pip's behavior" here, and if that's the desired metric it would be extremely helpful if you could write down what you believe those changes consist of (I personally would not use such a metric, but I'm just one user). The "ask" here is a change of behavior, so 🤷‍♀️ does that mean the "least" is to change "warning" to "error" and exit there...?

@RonnyPfannschmidt
Copy link
Contributor

the key problem is that its impossible to off-hand know the difference between a extra that has existed before vs a extra that is misstyped/wrong - and its quite a a pain if pipelines suddenly break

this had created quite some problems for example for setuptools_scm back when i had dropped the now empty toml extra

a number of builds/tools suddenly started to fail, and there was simply no sidechannel to actually warn downstreams

so i just added the extra back to stop breaking downstreams

its generally not well received to break thousands of downstream builds in a way that requires a noncompatible workaround

and its common practice that people don't update certain things unless "forced"

@pfmoore
Copy link
Member

pfmoore commented Apr 6, 2024

Thanks for the comments. Your answers sound like reasonable choices to me - I have little or no personal stake in this functionality, so "whatever people want it to do" is fine with me. We've had a few people commenting that the current behaviour is not suitable for them, so hopefully they will say whether your proposals would suit their use cases.

I am definitely not in a position to know what is "the least impact on pip's behavior" here, and if that's the desired metric it would be extremely helpful if you could write down what you believe those changes consist of

What I was trying to say is that I'd prefer a change that only altered a small part of pip's code, over one that reworked big chunks of the core logic. My main motivation here is to have a PR that stands a realistic chance of getting reviewed (large PRs tend to sit unreviewed because the maintainers don't have the time to work through the impact of the change) and has less risk of unexpected side effects. The only way to know what approach will do that is to write a PR. So you actually are in the best position, if you plan on writing a PR - simply avoid anything that needs major changes to the code.

Maybe this indicates that a more philosophical description of "what even is an extra" needs to exist?

There's certainly an argument for that, yes. I tend to find extras are badly understood by many people, and used in very different ways by different projects. I don't personally have the appetite for a philosophical discussion of that nature, but if you do, by all means start a thread on https://discuss.python.org/c/packaging/14 and see if you can get a community consensus.

@meejah
Copy link

meejah commented Apr 6, 2024

I'd like to apologize again for the unhelpful comment.
Think twice, post once ;)

@meejah
Copy link

meejah commented Oct 2, 2024

I would love some guidance on the above draft PR; if a solution approximately like that proposed works, I can add tests and whatever is required for merging.

Thanks

@meejah
Copy link

meejah commented Oct 2, 2024

Instead of burying more discussion in the draft PR, there seems to be disagreement about the behavior.

Let use consider installing "example_package" with extra "some_extra".

There are two versions of "example_package": 19.0.0 and 24.0.0. In version 19.0.0, there is a single extra called "something_extra" whereas in 24.0.0 there is a single extra called "some_extra".

  • The user asks for example_package[foo]: seems obvious that it's a simple error (no such extra).
  • The user asks for example_package[something_extra] -- here there appears to be disagreement around whether we should re-resolve this to "example_package at version 19.0.0 with something_extra" or not. I think this should also be an error (if the user wanted the old version, they could ask .. e.g. example_package[something_extra]==19.0.0 and explicit is better).
  • The user asks for example_package[some_extra]==19.0.0 -- not mentioned before, but similar issues to the above. Do we magically upgrade to 24.0.0 because it has some_extra? (I don't think so, this seems surprising to me, even though it would be more consistent with the "magically degrade" behavior).

To leave with a bikeshed: a developer on the draft PR asked for this to be behind an option. How do we spell that option? Maybe --error-extras? Also it seems like --strict was mentioned in this thread.

@notatallshaw
Copy link
Member

notatallshaw commented Oct 3, 2024

I think there are two important things to understand:

  1. In general, pip will select a valid candidate given a requirement
  2. A user does not always get to control the requirements

For "1." if foo has two published wheels, 2.0 and 1.0, and 2.0 is only available for Linux but the user is on Windows, and 1.0 is available for all platforms pip will select 1.0. Equally, if the user requires foo[extra] and 2.0 has no extra but 1.0 does, I would expect pip to choose 1.0.

For "2.", a transitive dependency of a user's requirement may require "foo[extra]" and so the requirement is out of control of the user. If it is a requirement that they must put transitive dependencies in their requirements file for pip not to throw an exception, e.g. "foo[extra]==1.0 # This is required for pip not to throw an error", this is a bad user experience.

@meejah
Copy link

meejah commented Oct 3, 2024

Hmm, okay.

So does that mean kind of "moving extras up" into the resolver process? (Currently it's basically a post-hoc test: you decide on something, and then check if there's a matching extra or not).

Maybe --strict-extras for the option?

@notatallshaw
Copy link
Member

notatallshaw commented Oct 3, 2024

I'm not quite sure what you mean, if a user wants an error for foo 2.0 not having an extra, they should require foo[extra]==2.0 (or >= if they will accept newer versions).

As for how to implement this? I think you need to catch this exception in factory.py and skip it, similar to #10625 or the reverse of #10655 (there are other PRs that have implemented skips based on certain conditions, but I couldn't quickly find them).

I would just like to say, I'm personally very supportive of this behavior change, I've always disliked that pip throws a warning but installs anyway. I'd always assumed this was a standard, but reading through this issue and carefully through the standards, it seems this was a mistaken assumption on my side.

@meejah
Copy link

meejah commented Oct 4, 2024

Okay, so I've hacked together something that ostensibly works -- by catching UnavailableExtra in _vendor/resolvelib/... which seems obviously wrong, of course but:

I think at least part of the issue is that the question "what extras does example_package at version 1.0.0 have?" can only be answered by downloading the package (right??).

So if I try to python -m pip install twisted[foo] it's going to do a lot of downloading before it can conclude "twisted never released anything with extra foo".

@notatallshaw
Copy link
Member

Okay, so I've hacked together something that ostensibly works -- by catching UnavailableExtra in _vendor/resolvelib/... which seems obviously wrong

Yeah, resolvelib the library is generic and shouldn't be aware of pip specific concepts like extras, there should be somewhere this can be handled in pip, but it's a tricky processes, and without trying to write the code myself I can't immediately suggest where that would be.

I think at least part of the issue is that the question "what extras does example_package at version 1.0.0 have?" can only be answered by downloading the package (right??).

Yeah, well at least the metadata, for indexes that support PEP 658 then for wheels it will just be the small metadata file.

So if I try to python -m pip install twisted[foo] it's going to do a lot of downloading before it can conclude "twisted never released anything with extra foo".

This is already true of packages where requirements can not be met, we encourage users to put reasonable lower bounds on their requirements. But yes, this is one of the reasons why this change will be disruptive.

Also, optimizations can be made, I think if extras are understood to be a strict requirement it’s possible to do some additional static optimizations while traversing the resolution graph.

@meejah
Copy link

meejah commented Oct 4, 2024

This is already true of packages where requirements can not be met, we encourage users to put reasonable lower bounds on their requirements. But yes, this is one of the reasons why this change will be disruptive.

This won't really lead to the UX I'd like as a user here -- when I fat-finger or otherwise don't know the exact name for an extra ("pip install twisted[ssl]" for example, where tls is correct) -- I don't want to download years of Twisted releases before learning .. nothing?

For the use-case of "a user-typed requirement, on the CLI" in particular (ignoring for now the points above about non-user-controlled requirements, etc) it really feels a lot better to literally turn the existing warning into an error. That is, find the latest release + platform, and if the requested extra is not provided that's an immediate error.

(I say "learning nothing" above because what can I actually output as "available extras" to help the user? The list of every extra I saw with all releases downloaded? This could still be wrong because I'm only on one platform, so won't see "every" release...)

As a concrete example, this is my desired UX (for a human entering CLI commands):

$ python -m pip install twisted[foo] 
Collecting twisted[foo]
  Downloading twisted-24.7.0-py3-none-any.whl.metadata (18 kB)
ERROR: twisted 24.7.0 does not provide the extra 'foo'
twisted 24.7.0 provides extras:
  - all-non-platform
  - conch
  - dev
  - dev-release
  - gtk-platform
  - http2
  - macos-platform
  - mypy
  - osx-platform
  - serial
  - test
  - tls
  - windows-platform

...and this exits with non-zero error-code.

Alternatively, with the "keep looking" proposed UX, I will perhaps notice once the downloader starts getting into non-wheel releases that it is downloading a bunch. At some point, hopefully I notice the messages about "this is taking a long time" and "tighten up your versions". Okay, so I ctrl-c it, and then try like python -m pip install "twisted[foo]>=24.0.0" which will compare several metadata files and then tell me ResolutionImpossible. Without teaching _vendor/resolvelib about extras, I don't see a good way to improve this error-message either -- since those kinds of errors all seem to be "inside" resolvelib? Of course, I've only looked at this a little bit and so perhaps there is a way to achieve an extras-explicit error that I don't see.

This definitely seems at odds with @pfmoore 's request for minimal changes, so I'm not sure how to proceed.

(FWIW, I personally don't want to make "disruptive" changes in pip -- I want to give myself some way to configure pip to turn an already existing warning into an error).

@notatallshaw
Copy link
Member

notatallshaw commented Oct 4, 2024

This won't really lead to the UX I'd like as a user here -- when I fat-finger or otherwise don't know the exact name for an extra ("pip install twisted[ssl]" for example, where tls is correct) -- I don't want to download years of Twisted releases before learning .. nothing?

Similar issues can already happen when you input the wrong requirements. Pip cannot predict what requirements you meant, and it would be a bad philosophy to build a user experience around that. It could however log that it's not selecting a particular version because of a missing extra.

For the use-case of "a user-typed requirement, on the CLI" in particular (ignoring for now the points above about non-user-controlled requirements, etc) it really feels a lot better to literally turn the existing warning into an error. That is, find the latest release + platform, and if the requested extra is not provided that's an immediate error.

The opposite can be true. What if I follow a slightly old guide to install a package, and it recommends an extra, but an incompatible 2.0 version of that package no longer has that extra? This 2.0 was released after the guide was written. So, if pip errors immediately on 2.0, the user gets no feedback that they need to select a previous version of the package.

Alternatively, with the "keep looking" proposed UX,

This "keep looking" UX is the foundation of how package resolvers work for requirement restrictions.

If I request 'a>1' 'b<10', and recent versions of a depend on 'b>10', then pip doesn't just error out. It keeps checking each version of a, going back until it finds a version of a that matches the user requirement 'b<10'. That might involve thousands of packages and might not exist. This could have been a typo by the user — they might have meant 'b<100' — but pip cannot account for user typos like this. (That said, there are optimizations we can, and I plan to, make to improve this)

Having pip throw an error if the latest version of a package doesn't meet the requirement would be a whole new concept for pip and contrary to existing features I'm aware of.

Without teaching _vendor/resolvelib about extras, I don't see a good way to improve this error-message either -- since those kinds of errors all seem to be "inside" resolvelib? Of course, I've only looked at this a little bit and so perhaps there is a way to achieve an extras-explicit error that I don't see.

Again, without seeing your code or trying it myself, I don’t know. But resolvelib reports back the state of the failure to a reporter object that pip provides. That reporter object can give helpful error messages. This is currently a very underutilized feature and may help here.

Any changes to vendor code need to be made in that library, resolvelib is used by more than pip, so its API needs to be highly generic: https://github.com/sarugaku/resolvelib

This definitely seems at odds with @pfmoore 's request for minimal changes, so I'm not sure how to proceed.

I think it's possible to implement this without it being a large code change, but I could be wrong.

What's also important is that the change needs to be minimal from a user-disruptive point of view. If pip starts a new behavior of erroring out immediately when extra requirements aren't met and doesn't attempt to resolve them, this will cause packages to become incompatible with each other in novel ways that can not be easily reasoned about by the user, in fact they may have to manually figure out the correct version of a transitive dependency themselves which invalidates having a resolver.

@pfmoore
Copy link
Member

pfmoore commented Oct 4, 2024

@meejah as you can see, this is why it’s taken so long to get anything done on this - it’s a lot harder problem than it looks.

I’m not at all comfortable with @notatallshaw’s assertion that backtracking is the right thing to do here - you’ve described the way in which this could result in user-unfriendly behaviour, and I agree it’s far from ideal. And unlike a version mismatch, mistyping an extra seems to me like it would in many cases be a simple typo that doesn’t warrant a search through the full dependency tree.

@notatallshaw I see your point about this being a new behaviour, but do you have any evidence that it would be as disruptive as you say? I guess you are thinking of a case where A depends on B[foo], and B 1.0 has extra foo. When B 2.0 is released without extra foo, what happens to pip install A? Both the current behaviour, nor @meejah’s proposed behaviour, could be wrong (the current behaviour if B no longer supports the functionality gated behind extra foo, the proposed behaviour if B merged the behaviour into its default, and dropped the extra as no longer necessary). Your proposal handles this case in the sense that a correct set of packages is installed, but A has no way to specify dependencies that supports both versions of B.

So there’s no perfect solution here. All options have problems, and it’s very arguable that “do nothing” is the least disruptive.

What is clear is that somewhere, we should document why we end up with whatever behaviour we choose, so that people know it’s a deliberate decision, not an accident.

Oh, and what should also be clear by now is why I hate extras 🙁 The design sucks, the implementation is a nightmare, and the user experience is confusing.

@notatallshaw
Copy link
Member

notatallshaw commented Oct 4, 2024

@notatallshaw I see your point about this being a new behavior, but do you have any evidence that it would be as disruptive as you say?

Apologies, this has been intuitively obvious to me, so I've not done a good job explaining. Sleeping on it my point if the following scenario (it's similar to your scenario but I feel highlights how deep the problem is):

Scenario: The user depends on A and B, both have deeply transitive dependencies on foo, B's transitive dependency is (at least initially) also an extrafoo[extra], and that the first transitive dependency pip finds from A on foo does not agree with the first transitive dependency pip finds for B on foo[extra] because foo did not have extra for that version.

If the behavior is modified that pip errors out immediately then the user is left to manually resolve the dependency graph themselves (and almost certainly some users will report an issue to pip and I will be manually resolving the dependency graph ;)).

What is clear is that somewhere, we should document why we end up with whatever behavior we choose, so that people know it’s a deliberate decision, not an accident.

IMO the why is "extras should be treated as real requirements", currently if you install foo[extra] you have no idea if you're going to get [extra], it's a bit like if pip's requirements worked like requesting A B and you definitely got A but you maybe got B. This would need to be written up more formally / user facing.

@pfmoore
Copy link
Member

pfmoore commented Oct 4, 2024

Your scenario is the same as mine, but made more complex to emphasise the problem it causes. My main point here is that someone gets inconvenienced no matter what we do, and complex build hierarchies with invalid extras embedded deep within them is a rare case, so I'd rather not optimise for that situation, but instead ensure that the common scenario (which frankly is pip install A[mistypd_extara]) is handled well. Downloading the whole history of A only to raise an error, is not (IMO) handling that situation well.

The problem with "extras should be treated as real requirements" is that it's not even clear what that statement means. IMO the semantics of extras should be clarified at the standards level. Every tool needs to have a shared understanding of what depending on foo[extra] means, and I don't think it helps users at all if (for example) pip and uv choose different answers to this question.

currently if you install foo[extra] you have no idea if you're going to get [extra]

Hang on, that's a bit of an extreme statement. If you install foo[extra] and foo provides extra, then it'll be installed. The edge cases only arise when things aren't that simple - either foo doesn't provide extra, or it sometimes does and sometimes doesn't, depending on version, or there's an installed version of foo (where we have no way of knowing, because there's no standard for recording it, whether extra was installed as well). Those are precisely the cases where I don't think the standards give enough information on what foo[extra] means in the first place.

@notatallshaw
Copy link
Member

notatallshaw commented Oct 4, 2024

Your scenario is the same as mine, but made more complex to emphasise the problem it causes. My main point here is that someone gets inconvenienced no matter what we do,

Yes, just as pip does now with backtracking, some users are inconvenienced and would prefer pip not to backtrack at all when it can't immediately resolve the requirements, and just error out. Maybe this option should be added and then users can choose?

My main point here is that someone gets inconvenienced no matter what we do, and complex build hierarchies with invalid extras embedded deep within them is a rare case, so I'd rather not optimise for that situation, but instead ensure that the common scenario (which frankly is pip install A[mistypd_extara]) is handled well. Downloading the whole history of A only to raise an error, is not (IMO) handling that situation well.

This is an assumption, that the reason users are installing the vast majority of extras is because they are typing them on the command line, and not receiving them as transitive dependencies. Do you have any evidence for this?

And even if that's the case (which I'm highly skeptical of) why would pip optimize for the user mistyping a requirement? Pip doesn't do that with any other situation, e.g. if a user mistypes requirements on boto3 and urllib3 they can end up downloading thousands of packages.

Hang on, that's a bit of an extreme statement. If you install foo[extra] and foo provides extra, then it'll be installed.

That's not correct though, if a newer or older version of foo could provide that extra and then the version pip checks you don't get that extra. So even if foo[extra] is a valid requirement and a valid extra, you have no idea if you're going to get extra or not, unless you've already resolved the dependency tree yourself.

@notatallshaw
Copy link
Member

notatallshaw commented Oct 4, 2024

The edge cases only arise when things aren't that simple - either foo doesn't provide extra, or it sometimes does and sometimes doesn't, depending on version, or there's an installed version of foo (where we have no way of knowing, because there's no standard for recording it, whether extra was installed as well). Those are precisely the cases where I don't think the standards give enough information on what foo[extra] means in the first place.

There's no standard for how installers should resolve requirements, so it's up to pip to follow its own internal consistency. For example:

  • If I request foo-extra and there is no package foo-extra is available for the user then pip doesn't install anything
  • If I request foo extra<1 and every version of foo has dependency extra>1 then no version of foo is installed

But pip has chosen to treat extras as special, and if it can't immediately find foo[extra] but can find foo it will install foo. If foo[extra] was not treated as special, I think all these edge cases would be resolved like all other requirement edge cases, these choices were already deemed suitable for other requirements, why should extras be special?

@notatallshaw
Copy link
Member

notatallshaw commented Oct 4, 2024

Okay, to me, a reasonable compromise would be, if foo[extra]{specifier} is a top level requirement only (i.e. a user requirement) then add special logic that if pip can not find extra in foo for the first version it finds immediately error out with something like:

You selected foo[extra]{specifier} but the first version of foo that pip found {foo-version} does not have the extra extra, please specify a version of foo that has this extra

This could be added independently of whether pip treats extras as real requirements, and backtracks on them, and would handle the typo scenario.

P.S Sorry for the 3 posts in a row.

@meejah
Copy link

meejah commented Oct 4, 2024

The last comment above sounds like it would do what I want for my particular use-case.


It seems to me that statements like "extras should be treated as real requirements" are getting towards a philosophical position on "what even is an extra?" Such a document / conclusion would be good to have, IMO, no matter what else happens with this PR.

I can appreciate the complexity of some of these scenarios, but my itch is that I do in fact keep getting bitten by mistyping / misremembering an extra name, and I'd really love a way to configure my pip to more-obviously alert me to this scenario. I had thought that --give-me-an-error-not-a-warning combined with making the warning an error would be "least disruptive". Okay, so it turns out that's not true ;) but also I'm not really in a position to decide on a philosophy for extras and also implement it.


If the proposal in the comment above this one isn't acceptable, an even less-invasive thing could be to improve the warning text, and make it appear last in the output? (No opt-in etc needed, because there's no behavior change). Consider the following two examples, of current behavior:

$ uv pip install twisted[foo]
Resolved 10 packages in 351ms
Prepared 8 packages in 107ms
Installed 9 packages in 12ms
 + attrs==24.2.0
 + automat==24.8.1
 + constantly==23.10.4
 + hyperlink==21.0.0
 + idna==3.10
 + incremental==24.7.2
 + twisted==24.7.0
 + typing-extensions==4.12.2
 + zope-interface==7.0.3
warning: The package `twisted==24.7.0` does not have an extra named `foo`

versus

$ pip install twisted[foo]
Collecting twisted[foo]
  Downloading twisted-24.7.0-py3-none-any.whl (3.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.2/3.2 MB 14.9 MB/s eta 0:00:00
WARNING: twisted 24.7.0 does not provide the extra 'foo'
Collecting typing-extensions>=4.2.0
  Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Collecting incremental>=24.7.0
  Downloading incremental-24.7.2-py3-none-any.whl (20 kB)
Collecting attrs>=21.3.0
  Downloading attrs-24.2.0-py3-none-any.whl (63 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.0/63.0 KB 13.2 MB/s eta 0:00:00
Collecting constantly>=15.1
  Downloading constantly-23.10.4-py3-none-any.whl (13 kB)
Collecting hyperlink>=17.1.1
  Downloading hyperlink-21.0.0-py2.py3-none-any.whl (74 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.6/74.6 KB 2.4 MB/s eta 0:00:00
Collecting automat>=0.8.0
  Downloading Automat-24.8.1-py3-none-any.whl (42 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.6/42.6 KB 8.6 MB/s eta 0:00:00
Collecting zope-interface>=5
  Downloading zope.interface-7.0.3-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (254 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 254.1/254.1 KB 16.6 MB/s eta 0:00:00
Collecting idna>=2.5
  Downloading idna-3.10-py3-none-any.whl (70 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 70.4/70.4 KB 8.1 MB/s eta 0:00:00
Collecting tomli
  Downloading tomli-2.0.2-py3-none-any.whl (13 kB)
Collecting setuptools>=61.0
  Downloading setuptools-75.1.0-py3-none-any.whl (1.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 10.2 MB/s eta 0:00:00
Installing collected packages: typing-extensions, tomli, setuptools, idna, constantly, automat, attrs, zope-interface, incremental, hyperlink, twisted
  Attempting uninstall: setuptools
    Found existing installation: setuptools 58.1.0
    Uninstalling setuptools-58.1.0:
      Successfully uninstalled setuptools-58.1.0
Successfully installed attrs-24.2.0 automat-24.8.1 constantly-23.10.4 hyperlink-21.0.0 idna-3.10 incremental-24.7.2 setuptools-75.1.0 tomli-2.0.2 twisted-24.7.0 typing-extensions-4.12.2 zope-interface-7.0.3
WARNING: You are using pip version 22.0.4; however, version 24.2 is available.
You should consider upgrading via the '/home/meejah/src/pip/venv/bin/python -m pip install --upgrade pip' command.

For me personally, I do type extras on the command-line and I do sometimes "guess" (e.g. "maybe it's example-package[dev] for the development requirements?). So my itch here is to save future-me the time of figuring out which configuration method example-package uses, read it, etc to find out what extras it has. I realize I'm just one user though.

@notatallshaw
Copy link
Member

notatallshaw commented Oct 4, 2024

Great, I would be +1 on a PR that is limited to erroring out on top level requirements. It may still need to issue a deprecation notice for 6 months before changing the behavior, but I don't think it would need an opt-in flag (the behavior is limited, clear, and user actionable).

I would be strongly -1 on erroring out immediately when pip finds a transitive dependency that doesn't match the extra (as I posted in length, lol).

@pfmoore
Copy link
Member

pfmoore commented Oct 4, 2024

Do you have any evidence for this?

No more than anyone does in situations like this. I'm mostly going off the fact that @meejah said that was what they most commonly encountered.

But pip has chosen to treat extras as special

I think that's a mischaracterisation, but I don't have the time right now to debate the history and the various choices made over years of pip's development. Let me just say that if you believe this is a better model, and want to do the work to change pip over to that model, I'll reserve judgement until I see the results. But I'm not convinced that we should be making design decisions based in that model until pip implements it. At the moment, I'm certainly getting confused by the implications of what you're saying, because you're working from a different mental model than I am. And I don't think you're fully understanding my concerns for the same reason.

Okay, to me, a reasonable compromise would be, if foo[extra]{specifier} is a top level requirement only

This feels dangerous to me. You're making a big point that pip treats extras specially, and we shouldn't do that, and yet you're now suggesting we treat top-level requirements as special when that's not something we currently do. I'm nervous about the inconsistency here, and I suspect it'll come back to bite us. For example, is a requirement in a requirements file "top level"? What if it's generated by something like pip-tools?

The only place we currently treat user supplied requirements as special is when it comes to upgrading already-installed requirements. And the behaviour around that is another place where there are a number of problematic edge cases. I don't think we want to repeat that decision - the only reason we have the current behaviour is because it was the least bad way of preserving some level of backward compatibility in a situation where there were no good answers.

It seems to me that statements like "extras should be treated as real requirements" are getting towards a philosophical position on "what even is an extra?" Such a document / conclusion would be good to have, IMO, no matter what else happens with this PR.

Absolutely. This is very much my position, with the qualification that I don't think we can treat that question as something that pip can decide alone. I don't think it's acceptable for different installers (or lockers, or other tools) to behave differently when encountering extras. So we need a common decision on semantics (whether we call that a standard or not - @notatallshaw points out that there's no standard for resolving requirements, but outside of extras, I don't think there's a lot of debate as to what the right answer should be1...)

If the proposal in the comment above this one isn't acceptable, an even less-invasive thing could be to improve the warning text, and make it appear last in the output?

As a general point, improving the visibility of warnings is something I'd support, so +1 from me on something along those lines.

I'm not comfortable with an error only on top-level requirements - certainly not without a lot more clarification of what that would mean and how various edge cases would work. I don't think the benefits are sufficient to justify the added complexity. (I am convinced by what @notatallshaw said that raising an error in all cases isn't realistic).

Basically, this discussion has left me feeling that we're best sticking with the current behaviour (while improving the visibility of the warning). I'm happy to hear opinions from the other maintainers, though.

Footnotes

  1. Actually, there is. I don't think there are good answers to how to resolve when faced with a non-empty target environment, but let's ignore that for now, as it would take us too far off topic.

@notatallshaw
Copy link
Member

notatallshaw commented Oct 4, 2024

Let me just say that if you believe this is a better model, and want to do the work to change pip over to that model, I'll reserve judgement until I see the results. But I'm not convinced that we should be making design decisions based in that model until pip implements it. At the moment, I'm certainly getting confused by the implications of what you're saying, because you're working from a different mental model than I am. And I don't think you're fully understanding my concerns for the same reason.

Yes, I'm now convinced this should be a separate issue. I will make a new issue proposing this new model, and when I have time I will make a PR to prove that model, then the merits can be discussed in that issue/PR, not here.

The only place we currently treat user supplied requirements as special is when it comes to upgrading already-installed requirements. And the behaviour around that is another place where there are a number of problematic edge cases. I don't think we want to repeat that decision - the only reason we have the current behaviour is because it was the least bad way of preserving some level of backward compatibility in a situation where there were no good answers.

This is not true, "top level requirements" are also special in a number of ways, e.g.

  • Pip will error out immediately if they are inconsistent, rather than trying to resolve further
  • They are the only requirements known before trying to resolve
  • They are resolved with a breadth-first search, whereas all other requirements are resolved with a depth-first search

I'm not comfortable with an error only on top-level requirements - certainly not without a lot more clarification of what that would mean and how various edge cases would work. I don't think the benefits are sufficient to justify the added complexity. (I am convinced by what @notatallshaw said that raising an error in all cases isn't realistic).

Pip already does this when top level requirements that are inconsistent. This would be additional special behavior, but top level requirements are special, because pip is able to identify them without needing to take any resolution steps. This is effectively their definition, which in practise is all requirements immediately available to pip via CLI or via requirements files.

@meejah
Copy link

meejah commented Oct 4, 2024

Just so I'm clear, "top-level requirement" means all the requirements collected before we enter the resolver? That is, for pip install, everything collected in reqs before here? https://github.com/pypa/pip/blob/main/src/pip/_internal/commands/install.py#L378

@notatallshaw
Copy link
Member

Just so I'm clear, "top-level requirement" means all the requirements collected before we enter the resolver? That is, for pip install, everything collected in reqs before here? https://github.com/pypa/pip/blob/main/src/pip/_internal/commands/install.py#L378

That is what I meant by this, yes.

@pfmoore
Copy link
Member

pfmoore commented Oct 4, 2024

There's an attribute user_supplied on the requirement object. https://github.com/pypa/pip/blob/main/src/pip/_internal/req/req_install.py#L157. That's what I'd assume you should use - we definitely don't want to end up with multiple different concepts of what's a "top level" requirement...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: extras Handling optional dependencies kind: backwards incompatible Would be backward incompatible state: awaiting PR Feature discussed, PR is needed type: enhancement Improvements to functionality
Projects
None yet
Development

No branches or pull requests