
Composer job run timing out #2416

Open
ElijahLynn opened this issue Aug 4, 2020 · 49 comments
Labels
L: php:composer (Issues and code for Composer) · T: bug 🐞 (Something isn't working)

Comments

@ElijahLynn

ElijahLynn commented Aug 4, 2020

We are getting this message at the end of the logs. We are on Dependabot (not the preview) and have seen this on more than one occasion now, so I figured I'd open an issue. Please advise.

updater | time="2020-08-04T00:30:13Z" level=info msg="task complete" container_id=job-44344904-updater exit_code=137 job_id=44344904 step=updater
updater | time="2020-08-04T00:30:13Z" level=warning msg="timeout running job" error="waiting for updater: waiting for container: context deadline exceeded" job_id=44344904

[Screenshot: Dependabot error message with a "learn more" link]

The "learn more" link in the message points to https://docs.github.com/en/github/administering-a-repository/keeping-your-dependencies-updated-automatically.

Package manager/ecosystem

composer

Manifest contents prior to update

Updated dependency

What you expected to see, versus what you actually saw

I expected Dependabot not to time out; instead, the job timed out.

Images of the diff or a link to the PR, issue or logs

@ElijahLynn ElijahLynn added the T: bug 🐞 Something isn't working label Aug 4, 2020
@ElijahLynn ElijahLynn changed the title error="waiting for updater: waiting for container: context deadline exceeded" job_id=44344904 Timeout: error="waiting for updater: waiting for container: context deadline exceeded" job_id=44344904 Aug 4, 2020
@ElijahLynn
Author

I opened a GitHub Enterprise support ticket for this but didn't get a confirmation because I wasn't logged in. Even after logging in, I'm having a tough time getting an email to verify my email address; our mail server must be blocking it.

@jurre
Member

jurre commented Aug 6, 2020

Hi @ElijahLynn, it looks like the updates are legitimately timing out. We currently have a hard limit of 30 minutes for an update job, and Composer is known to sometimes be on the slow side.

Right now I would recommend that you retry the failed jobs from the UI, and I will create an issue on our backlog to see if we can give some ecosystems a longer timeout.

@ElijahLynn
Author

ElijahLynn commented Aug 8, 2020

Thanks @jurre. I have seen Packagist.org (Composer's default source) hang sometimes and wonder if that may be the cause. Can you expose more info on why it is timing out?

The job hangs on a step just after fetching some package metadata. I don't think this needs a longer timeout; I think there is a bug here, and something needs to be caught and bubbled up.

proxy | 2020/08/08 00:30:10 [154] GET http://repo.packagist.org/p/provider-latest%24f2867f0b89850775aba29f04766de27c2ab9d10e9925545e0dbbebdd3185f1c1.json
proxy | 2020/08/08 00:30:10 [154] 200 http://repo.packagist.org/p/provider-latest%24f2867f0b89850775aba29f04766de27c2ab9d10e9925545e0dbbebdd3185f1c1.json
updater | time="2020-08-08T00:30:29Z" level=info msg="task complete" container_id=job-44795731-updater exit_code=137 job_id=44795731 step=updater
updater | time="2020-08-08T00:30:29Z" level=warning msg="timeout running job" error="waiting for updater: waiting for container: context deadline exceeded" job_id=44795731

The last successful GET 200 happens at 00:30:10 here and then the timeout happens 19 seconds later at 00:30:29.

@ElijahLynn
Author

Oh, I see now: it is 30 minutes, as it all started at 00:00:00 in this log: https://github.com/department-of-veterans-affairs/va.gov-cms/network/updates/44795731. I wonder why it takes so long when a full composer update on my local machine takes only 4 minutes.

@ElijahLynn
Author

Hrm, I suspect it is because it is checking whether all sub-dependencies need updating as well. I'll look for a setting to stop that and see how long it takes then.

@ElijahLynn
Author

OK, we removed the indirect dependency resolution and it is still timing out. @jurre, I suspect this is actually timing out for most sites like ours that are Drupal 8/Symfony based. Can you increase the timeout for Composer-based projects, as per the issue you mentioned you would create on your backlog? Thanks.

@jurre
Member

jurre commented Aug 13, 2020

Thanks for investigating this further @ElijahLynn, an issue for this is currently on our backlog and we've already shipped some initial improvements needed to start making this possible. Not sure when the final changes will land, but will be sure to report back here!

@jurre
Member

jurre commented Aug 17, 2020

We've increased the timeout for Composer updates from 30m to 45m. Would love to hear if this solves your issue, @ElijahLynn.

@ElijahLynn
Author

Thanks for updating the timeout; however, we are still seeing timeouts with the 45m setting.

[Screenshot: Dependabot timeout error message]

@jurre
Member

jurre commented Aug 19, 2020

That's a bummer. Just out of curiosity, does it take this long on your local machine to run composer as well? I see the repo is publicly available, so I will try running Dependabot on it locally to see what is taking so much time.

@sakulstra

sakulstra commented Aug 19, 2020

Not sure if I should open another issue or just comment on this one 😅 - I've been seeing the exact same error since switching to GitHub Dependabot. Not with Composer though, but with package-ecosystem: npm. It's a private repo (monorepo with yarn) so I cannot share a public link.

I contacted support about this a few months ago and they "increased the timeout for npm updates from 30m to 45", which didn't help either.

@jurre
Member

jurre commented Aug 19, 2020

Not sure if I should open another issue or just comment on this one

No need to open a separate issue.

It's a private repo (monorepo with yarn)

We've seen this before with large monorepos. I'm curious how long running yarn upgrade takes on your development machine for this monorepo? I want to make sure there's not something in our infrastructure that causes these jobs to take so long.

@boredland

boredland commented Aug 19, 2020

answering for @sakulstra here:

$ rm -rf node_modules **/node_modules && yarn 
...
[4/4] Building fresh packages...
$ lerna run prepare --stream
lerna notice cli v3.22.1
lerna info Executing command in 1 package: "yarn run prepare"
...
Done in 103.68s.

On a T495s with a Ryzen 7, though.

@jurre
Member

jurre commented Aug 19, 2020

Ah, I see you're running Lerna. We don't currently support it in Dependabot; I think that's why it's timing out in your case :(

@boredland

But we have been using Lerna before too; this has only been a problem since we moved to the GitHub-native version.

@jurre
Member

jurre commented Aug 19, 2020

But we have been using Lerna before too; this has only been a problem since we moved to the GitHub-native version.

It's hard to diagnose what's going on without access to the project. Has anything in the setup changed? Lerna kind of works when using path: ../ dependencies, and you have to specify the local package with file:../path for it to work with npm.

I can't think of a reason why it would time out for the GitHub-native version; both use the same version of dependabot-core etc. Have you looked at the logs for any clues?

@ElijahLynn
Author

ElijahLynn commented Aug 19, 2020

That's a bummer. Just out of curiosity, does it take this long on your local machine to run composer as well? I see the repo is publicly available, so I will try running Dependabot on it locally to see what is taking so much time.

does it take this long on your local machine to run composer as well?

No, a full composer update only takes about 4 minutes to run locally. Thanks.

@dbatiste

I seem to be getting timeouts as well in GitHub Dependabot - "Dependabot timed out during its update". I'm not seeing any logs, but it may be due to the same reason. I just set it up yesterday, although Dependabot via dependabot.com had been working fine. Repo is: https://github.com/BrightspaceUI/core/network/updates

@dbatiste

dbatiste commented Aug 28, 2020

Also timing out (after 45min). https://github.com/BrightspaceUI/core/network/updates/49154767 We're not using lerna.

Looks like the log is no longer available. This is the start and end of the log.

proxy | time="2020-08-24T07:05:28Z" level=info msg="proxy starting" commit=3cd1ab64727659a693c3eda78362928002409513
proxy | 2020/08/24 07:05:28 Listening (:1080)
updater | 2020-08-24T07:05:28.424604754 [49154767:WARN:src/devices/src/legacy/serial.rs:319] Detached the serial input due to peer error/close.
updater | time="2020-08-24T07:05:32Z" level=info msg="guest starting" commit=64ceee570eb26e0774b47539f356311691075589
updater | time="2020-08-24T07:05:32Z" level=info msg="starting job..." fetcher_timeout=5m0s job_id=49154767 updater_timeout=45m0s updater_version=0.118.16-7b51a4873b4cc350ad98ad3f250bcd024c42605c
updater | yarn config v1.22.4
updater | success Set "cafile" to "/etc/ssl/certs/ca-certificates.crt".
updater | Done in 0.07s.
updater | INFO <job_49154767> Starting job processing
proxy | 2020/08/24 07:05:43 [002] GET https://api.github.com:443/repos/BrightspaceUI/core
proxy | 2020/08/24 07:05:43 * authenticating github api request
...
...
...
proxy | 2020/08/24 07:50:44 [234] 200 https://registry.yarnpkg.com:443/base
updater | time="2020-08-24T07:50:44Z" level=info msg="task complete" container_id=job-49154767-updater exit_code=137 job_id=49154767 step=updater
updater | time="2020-08-24T07:50:44Z" level=warning msg="timeout running job" error="waiting for updater: waiting for container: context deadline exceeded" job_id=49154767

@driskell
Contributor

driskell commented Jan 6, 2021

We see the same issue with Composer in a Symfony/Drupal 8 project.

Looking at the log, it's almost as if it is running a composer update PER DEPENDENCY, so that although a single composer update takes less than a minute, across 50-60 updates it runs that many times and effectively times out. It also seems to hit the proxy for requests to package repositories on each check - but not just to check that one dependency - it pulls metadata for every single package in the composer.json, even though it then realises that specific dependency has no update. Not sure if these proxy log entries show it is caching or not.

I can't copy and paste any of the logs as it crashes the browser because it's so long. It seems each log line is inside a DIV so there's thousands of DIVs in the page and the browser just chokes.

@driskell
Contributor

driskell commented Jan 6, 2021

Turns out this is due to Composer preferring to check for updates to various metadata on each run. However, when you've got Drupal in there, the package repository works differently somehow and there are also many 404s happening. And it hits all the same files on each run, taking over a minute per dependency.

I've tested a patched version of Composer v2 locally which has a prefer-cache flag. Essentially it will then prefer to use any cache it has rather than downloading files. Assuming Dependabot starts in a clean cache state, it would then only download each file once and only hit a 404 once (I added caching of 404s across runs).

This reduced my time per dependency from over 1 minute to less than 5 seconds, and it reduces network activity significantly. The issue is that Composer is unlikely to accept the above, as prefer-cache is a little weird, and they do have sufficient caching; it just only works within a single run. So they would likely consider this an architectural issue in Dependabot that makes Composer's caching ineffective. The best way may be to have the helper stay running for the duration of the check, with a run loop inside it that takes each request and responds to it, reusing the same runtime instance of Composer and thus allowing all of Composer's caches to come into play. It does mean memory usage is likely to increase as more and more files get loaded into memory, but that seems no different to me from what Composer does during an update request.
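
To make that run-loop idea concrete, here is a minimal sketch assuming a JSON-lines protocol over STDIN/STDOUT; the request format and handleUpdateRequest() are invented for illustration, and this is not how Dependabot's helper currently works:

<?php
// Hypothetical long-lived helper: one PHP process services many update
// checks so that Composer's caches survive between checks.

require __DIR__ . '/vendor/autoload.php';

use Composer\Composer;
use Composer\Factory;
use Composer\IO\NullIO;

// Placeholder: run whatever per-dependency resolution is needed and return a
// serialisable result.
function handleUpdateRequest(Composer $composer, array $request): array
{
    return [
        'dependency' => $request['dependency'] ?? null,
        'status'     => 'not-implemented',
    ];
}

$io = new NullIO();

while (($line = fgets(STDIN)) !== false) {
    $request = json_decode(trim($line), true);
    if (!is_array($request)) {
        continue;
    }

    // A fresh Composer object per request, but the process (and therefore
    // Composer's on-disk and static in-memory caches) stays warm.
    $composer = Factory::create($io, $request['composer_json_path'] ?? './composer.json');

    fwrite(STDOUT, json_encode(handleUpdateRequest($composer, $request)) . "\n");
}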

@jurre
Member

jurre commented Jan 6, 2021

The best way may be to have the helper stay running for the duration of the check, with a run loop inside it that takes each request and responds to it, reusing the same runtime instance of Composer and thus allowing all of Composer's caches to come into play. It does mean memory usage is likely to increase as more and more files get loaded into memory, but that seems no different to me from what Composer does during an update request.

I don't think that's really feasible with how Dependabot currently works.

I do wonder if we can patch our helper to make more effective use of the prefer-cache option you mentioned; we should be able to get fairly low-level control over how Composer is run. What you describe does seem fairly wasteful, and it would be great if we could avoid hitting the same resources over and over again.

@driskell
Contributor

driskell commented Jan 6, 2021

@jurre, the prefer-cache flag is something I wrote whilst testing, so it doesn't exist currently. And I'm unsure whether such an option should even exist.

@driskell
Contributor

driskell commented Jan 13, 2021

@jurre I'm seeing some pretty dramatically improved results with a few smaller changes. One big improvement, especially for the update process that produces the lock file, was to replace the LibraryInstaller with a custom one that is a NOOP like the MetapackageInstaller. It gets rid of the download step. Of course, plugins likely still need downloading so they can affect the installations, but to be honest I'm not even sure that is necessary for this purpose. I can imagine that might save GitHub a huge amount of bandwidth and help save the planet? 😉
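
For illustration, a minimal sketch of the kind of no-op installer being described, assuming Composer 2's installer API (a sketch only, not dependabot-core's actual implementation):

<?php
// Illustrative only: MetapackageInstaller never touches the filesystem, so
// widening supports() to every package type gives a do-nothing installer,
// letting the solver produce a lock file without downloading archives.

require __DIR__ . '/vendor/autoload.php';

use Composer\Composer;
use Composer\Installer\MetapackageInstaller;
use Composer\IO\IOInterface;

class NoopInstaller extends MetapackageInstaller
{
    // The parent only answers to the 'metapackage' type; accepting everything
    // turns it into a no-op installer for the whole dependency tree.
    public function supports($packageType): bool
    {
        return true;
    }
}

// Registering it so it is consulted for every package type (assuming $composer
// and $io were created elsewhere, e.g. via Composer\Factory::create()):
function registerNoopInstaller(Composer $composer, IOInterface $io): void
{
    $composer->getInstallationManager()->addInstaller(new NoopInstaller($io));
}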

To stop the massive network usage for metadata files and try to maintain the Composer cache, I found a less impactful approach than adding options to Composer. It involves serialising the Composer object after each run (see below - not currently possible), then loading it on the next run and just replacing the RootPackage and Locker with new versions based on that run's composer.json/lock. It reduces the network requests from Composer to literally just those needed for the additional run. The only issue is that it feels a bit hacky to replace the RootPackage and Locker, and technically the config source stuff still points to the one from the first run - but it could be worth checking with Composer whether it's safe mid-run to replace the RootPackage and Locker and re-run installations etc.

Regarding not being able to serialise: it's one single object, the CurlDownloader, that stores a resource in its constructor. It just needs a __wakeup to reinitialise it.
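
For readers unfamiliar with the pattern, a generic sketch of that __wakeup fix (illustrative only, not the actual Composer\Util\Http\CurlDownloader source):

<?php
// A class holding an unserialisable curl handle drops it on serialize() and
// recreates it on unserialize(), which is the shape of the fix described above.

class HandleHolder
{
    /** @var \CurlMultiHandle|resource */
    private $multiHandle;

    public function __construct()
    {
        $this->multiHandle = curl_multi_init();
    }

    public function __sleep(): array
    {
        // Serialise nothing; the handle cannot be represented as a string.
        return [];
    }

    public function __wakeup(): void
    {
        // Reinitialise the handle after unserialize().
        $this->multiHandle = curl_multi_init();
    }
}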

I'll try to get back to this again soon and possibly raise a PR for the LibraryInstaller stub, unless someone gets to it first, as that seems like it might speed things up overall in a relatively simple way by avoiding downloading packages.

In my test Drupal 8/Symfony project, with many updates needed, the dry-run drops from over 1 hour to about 30 minutes with caching alone; with the LibraryInstaller NOOP it then drops to about 10 minutes.

@driskell
Contributor

Going to do some testing with just the LibraryInstaller NOOP. I am wondering if the caching is not a huge help, or maybe only helps with Drupal packages, which generate lots of requests, as opposed to non-Drupal packages, which don't.

@driskell
Contributor

I moved the investigation bits I did into #2999, @jurre, and raised #2998 with the LibraryInstaller/InstallationManager bit.
Ultimately I chose to raise some speed-up PRs with Composer that might help; hopefully they're on the right lines for them and not a dud that makes Drupal updates faster by breaking other things 😬

@markdorison

An FYI for others blocked by this issue: while we wait for a Dependabot mitigation/solution, we have implemented an alternative tool as a workaround to keep our Composer packages up to date.

An interesting part of this work was discovering that this other tool is not experiencing the same slowdown that Dependabot is.

@driskell
Contributor

@markdorison The improvements noted here are fairly minor - the main slowdowns are due to problems in Composer's dependency resolution. I'm unsure how another tool would work past that other than implementing its own resolution code.

@markdorison

markdorison commented Aug 25, 2021

@driskell I hear you. Maybe that is what they chose to do? I just wanted to share since it might help others in the short term. We would really like to get Dependabot back up and running on our affected repos.

@driskell
Contributor

@markdorison If it helps, it'll be worth looking at the following issues, which I think would benefit not only Dependabot but also users of various frameworks like Drupal. The latter two specifically target speed issues when scanning Drupal projects (and, I think, any other project with a large dependency tree and lots of conflict rules). Progress seems to have slowed down because, generally speaking, from the Composer side it's a case of "OK, it's slow, but it still works"; it only becomes a major issue when attempting to automate Composer under time constraints.

composer/composer#9261
composer/composer#9619
composer/composer#9620

@markdorison

@driskell Thanks for noting all of those here! We are monitoring those as well as #2999 and PR #2998 to keep abreast of any potential resolution.

@markdorison

composer/composer#9261 has been merged and included in composer 2.2.8. Would it be possible to see that included in a new dependabot release so we can test its effects?

@jurre
Member

jurre commented Mar 15, 2022

composer/composer#9261 has been merged and included in composer 2.2.8. Would it be possible to see that included in a new dependabot release so we can test its effects?

Yep! #4857

Edit: it seems the Dockerfile for Composer 2.2.8 isn't quite ready yet, so we'll need to wait on that. I'll check back in tomorrow.

@markdorison

Thanks @jurre! I'll follow that PR!

@markdorison

Tested again once #4861 was included in a release, but it is still timing out on our test repo.

@grappler

grappler commented Jul 7, 2022

We have been having the same problem, though these are WordPress setups with over 8 custom VCS repositories for private repositories. Locally, the update only takes 35.07s, but on Dependabot it is reaching the 45-minute limit.

@jeffwidman jeffwidman added the L: php:composer Issues and code for Composer label Sep 14, 2022
@jeffwidman
Member

@grappler does it take 35 seconds locally if you start with an empty composer cache?

@grappler

I cleared the cache and ran the update command with the profile flag, and it took 74.2 seconds, which is still faster than the 45-minute timeout.

$ composer clearcache
All caches cleared.
$ composer update --profile
Memory usage: 71.78MiB (peak: 77.02MiB), time: 74.2s

@driskell
Contributor

I cleared the cache and ran the update command with the profile flag, and it took 74.2 seconds, which is still faster than the 45-minute timeout.

$ composer clearcache
All caches cleared.
$ composer update --profile
Memory usage: 71.78MiB (peak: 77.02MiB), time: 74.2s

Worth noting that if Dependabot is checking for updates to many packages, it will do an update per package. So for a composer.json with 10 packages that all have newer versions available, it will take 72 seconds for each, so more than 720 seconds.

@driskell
Contributor

Drupal has now removed the replace declaration in 9.4.4, so I expect it to run much quicker, with no 404 requests clogging the network per package. But I haven't tested it.

We moved to only using Dependabot for Drupal core packages, and use Drupal itself to notify us of module updates. Other Composer packages tend to appear in the Security tab when updates are available. So I've been somewhat distanced from this issue.

@jeffwidman
Member

jeffwidman commented Sep 14, 2022

Worth noting that if Dependabot is checking for updates to many packages, it will do an update per package. So for a composer.json with 10 packages that all have newer versions available, it will take 72 seconds for each, so more than 720 seconds.

I started learning about Composer yesterday, so I've still got a lot to learn... @driskell, I know you're familiar with both Composer and Dependabot's Composer wrapper, so a couple of questions to clarify:

  1. Do we not cache the output of the first composer update call? I would think each subsequent call would be quite quick. Or is the problem that we're using composer as a library and so it doesn't cache anything during its calls?
  2. I haven't read through the relevant source, but I assume we're doing the equivalent of composer update vendor/package and never do a broad-based composer update... but even in that case doesn't composer still under the covers have to run a composer update to ensure that there are no conflicts, and then post-network-fetch it filters the changes to only the specified vendor/package call? If it's somehow able to shortcut things, then we'd probably want to do a composer update at the front to pre-warm the cache...

While it's not a near-term priority for us, it'd be nice to move from using composer as a library with native PHP helper code etc. to instead using it as a CLI with various flags to extract the information we need... it lets us preserve the native package manager abstraction a lot better. We moved toward this in the go ecosystem, resulting in both speed and correctness improvements, not to mention lower ongoing maintenance costs. So understanding the above questions would be helpful for understanding how much potential speedup we might get... for example, it'd be a huge win if this changeover took us from no cached calls to re-using a cache for most calls.

@driskell
Contributor

  1. Do we not cache the output of the first composer update call? I would think each subsequent call would be quite quick. Or is the problem that we're using composer as a library and so it doesn't cache anything during its calls?

There's no caching on the Dependabot side, but I'm not sure what could be cached. Each update is separate and independent and so needs resolving on its own. Everything caching-wise would be Composer's.

  2. I haven't read through the relevant source, but I assume we're doing the equivalent of composer update vendor/package and never do a broad-based composer update... but even in that case doesn't composer still under the covers have to run a composer update to ensure that there are no conflicts, and then post-network-fetch it filters the changes to only the specified vendor/package call? If it's somehow able to shortcut things, then we'd probably want to do a composer update at the front to pre-warm the cache...

Not quite. The library method allows specifying custom installers etc., so Dependabot can prevent any writes or downloads of actual packages. Additionally, using the CLI would immediately break because Composer allows deep integrations like the composer-patches extension and the wikimedia composer-merge extension, which would attempt to access repository data referenced in the composer.json, such as a patches file, custom extensions from the code base itself, or other composer.json files from downloaded and extracted packages. So you'd soon find you need to copy in not only the composer.json/composer.lock for the run but the entire repository, as there's no way to tell what files will be accessed.
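
As a rough sketch of that library-style invocation (assuming Composer 2's Installer API; the real helper sets more options than this, and the path and package name are placeholders):

<?php
// Illustrative only: drive a single-dependency update through Composer's
// library API with a dry run, so nothing is written or installed. Custom
// installers (such as a no-op installer) could be registered on the
// InstallationManager before running.

require __DIR__ . '/vendor/autoload.php';

use Composer\Factory;
use Composer\Installer;
use Composer\IO\NullIO;

$io = new NullIO();
$composer = Factory::create($io, '/path/to/composer.json');

$installer = Installer::create($io, $composer);
$installer
    ->setUpdate(true)
    ->setDevMode(true)
    ->setDryRun(true)
    // Only allow the one dependency being checked to change.
    ->setUpdateAllowList(['vendor/package']);

exit($installer->run());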

While it's not a near-term priority for us, it'd be nice to move from using composer as a library with native PHP helper code etc. to instead using it as a CLI with various flags to extract the information we need... it lets us preserve the native package manager abstraction a lot better. We moved toward this in the go ecosystem, resulting in both speed and correctness improvements, not to mention lower ongoing maintenance costs. So understanding the above questions would be helpful for understanding how much potential speedup we might get... for example, it'd be a huge win if this changeover took us from no cached calls to re-using a cache for most calls.

As noted, the CLI doesn't let you stop downloads or installs. It's not intended, nor do I think it will ever be implemented, for the CLI to run in a way that doesn't actually install or update and only updates the lock file. So using the CLI would make things slower.

The only way I can think of to make things quicker would be to, as you say, update everything at once, as that involves only one round trip to the repository and a single resolution of package dependencies. That's a very different architecture to what Dependabot uses currently.

I think the difference between the Go ecosystem example and Composer is that the Go tooling is extremely simple, from what I remember of using it. Composer, on the other hand, is very advanced in its capabilities, with extensibility built in and various package management features like replaces, provides and conflicts, plus very advanced version constraint resolving, which I think comparable Go tools don't come near (probably because they know how complex, and potentially slow, it can get).

@jeffwidman
Member

jeffwidman commented Sep 15, 2022

Thanks, that's extremely helpful context. I had seen that a dry-run flag is coming soon to a Composer near you, but it makes sense that it would still need to download/install under the covers given the other scenarios you mentioned.

We worked around some of these issues in Go by going the opposite route and stubbing out a fake repo at runtime... we also abused some undocumented behavior so that go get still walks the module tree to grab the go.mod files but skips downloading/installing the complete module.

I'm going to icebox this CLI migration idea for now... I may run some experiments at some point just to see if there's any other way to do it. I can still envision how the Composer caching could be leveraged: at least in other ecosystems, within a single job run we start by checking for all updates, and separate PRs are generated for each update, so if we ran a global composer update first, each individual composer update <vendor/package> call could use the original cache. But the code flow may be different here... I haven't looked at the source very closely.
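
A sketch of that pre-warming idea, reusing the same library API as in the earlier sketch (illustrative only; whether this fits Dependabot's actual code flow is exactly the open question above, and the paths and package names are placeholders):

<?php
// Illustrative only: warm Composer's on-disk metadata cache with one broad
// dry-run update, then run per-dependency dry-run updates that should mostly
// be served from that cache.

require __DIR__ . '/vendor/autoload.php';

use Composer\Factory;
use Composer\Installer;
use Composer\IO\NullIO;

function dryRunUpdate(array $allowList = []): int
{
    $io = new NullIO();
    $composer = Factory::create($io, '/path/to/composer.json');

    $installer = Installer::create($io, $composer)
        ->setUpdate(true)
        ->setDryRun(true);

    if ($allowList !== []) {
        // Restrict the update to the dependencies being checked.
        $installer->setUpdateAllowList($allowList);
    }

    return $installer->run();
}

// 1. Pre-warm the HTTP cache with a broad dry-run update.
dryRunUpdate();

// 2. Subsequent single-package checks reuse the cached metadata.
dryRunUpdate(['vendor/package-a']);
dryRunUpdate(['vendor/package-b']);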

For now though, I'm afraid this won't be a priority given your insights into the difficulties of trying to leverage straight CLI commands.

@driskell
Contributor

driskell commented Sep 16, 2022

The dry run I think wouldn't help - composer bump is just a way of saying "the lock has 1.2.1 but the requirement was ^1.0; increase the requirement to require what's locked in, so the minimum version is increased, i.e. ^1.2.1".

The dry run just outputs what would be changed in the requirements. No actual check for updates is done; it's just a way of telling Composer AFTER an update "OK, now that you've updated, make this version the new minimum". Tbh I've never been able to understand the need, but hey, there are lots of use cases around these things!

@jeffwidman
Member

Thanks! Appreciate you clarifying.

@jeffwidman jeffwidman changed the title Timeout: error="waiting for updater: waiting for container: context deadline exceeded" job_id=44344904 Composer job run timing out Feb 10, 2023
@groened

groened commented Aug 4, 2023

Still getting random timeouts against the 45-minute limit.

@abdulapopoola
Member

Hello there, is this still happening?

We've shipped a lot of perf improvements and want to be sure this has improved.
