Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(download): work around hyper hang issue by adjusting reqwest config #3855

Merged
merged 2 commits into from
Jun 4, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions download/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -345,6 +345,10 @@ pub mod reqwest_be {

fn client_generic() -> ClientBuilder {
Client::builder()
// HACK: set `pool_max_idle_per_host` to `0` to avoid an issue in the underlying
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why this is necessary/why there is no better solution for this?

Copy link
Member Author

@rami3l rami3l Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djc The underlying issue is hyperium/hyper#2312, see hyperium/hyper#2312 (comment) for the first appearance of this hack. It is suspected that there is a hyper pool deadlock underneath, and this workaround has been used by many to successfully mitigate the issue (see all the external references below).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a good reproduction scenario?

Copy link
Member Author

@rami3l rami3l Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djc Unfortunately not yet, and that's the tricky part of this thing I guess... If it really is a concurrency bug, then it'll be rather hard to track and to fix (that's also why that issue has been open for such a long time I guess, a previous project that I collaborated on has also hit it back in 2020, and the same hack was applied 🤔).

As far as our CI is concerned, we can see several failures in the past 5 days caused by this one, and they are all "solved" by relaunching the test task. The particular test(s) that failed varied from time to time, but I'm sure that all those tests are reqwest-related.

Anyway, this PR intentionally won't close #3852, and I'll keep an eye on hyperium/hyper#2312 as well as our CI in the upcoming weeks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djc The workaround seems to be working: our CI hasn't been failing for this reason in the past 5 days.

// `hyper` library that causes the `reqwest` client to hang in some cases.
// See <https://github.com/hyperium/hyper/issues/2312> for more details.
.pool_max_idle_per_host(0)
.gzip(false)
.proxy(Proxy::custom(env_proxy))
.timeout(Duration::from_secs(30))
Expand Down
17 changes: 9 additions & 8 deletions download/tests/read-proxy-env.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,15 +4,16 @@ use std::env::{remove_var, set_var};
use std::error::Error;
use std::net::TcpListener;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

use env_proxy::for_url;
use once_cell::sync::Lazy;
use reqwest::{Client, Proxy};
use tokio::sync::Mutex;
use url::Url;

static SERIALISE_TESTS: Mutex<()> = Mutex::new(());
static SERIALISE_TESTS: Lazy<Mutex<()>> = Lazy::new(|| Mutex::new(()));

fn scrub_env() {
remove_var("http_proxy");
Expand All @@ -29,9 +30,7 @@ fn scrub_env() {
// Tests for correctly retrieving the proxy (host, port) tuple from $https_proxy
#[tokio::test]
async fn read_basic_proxy_params() {
let _guard = SERIALISE_TESTS
.lock()
.expect("Unable to lock the test guard");
let _guard = SERIALISE_TESTS.lock().await;
scrub_env();
set_var("https_proxy", "http://proxy.example.com:8080");
let u = Url::parse("https://www.example.org").ok().unwrap();
Expand All @@ -45,9 +44,7 @@ async fn read_basic_proxy_params() {
#[tokio::test]
async fn socks_proxy_request() {
static CALL_COUNT: AtomicUsize = AtomicUsize::new(0);
let _guard = SERIALISE_TESTS
.lock()
.expect("Unable to lock the test guard");
let _guard = SERIALISE_TESTS.lock().await;

scrub_env();
set_var("all_proxy", "socks5://127.0.0.1:1080");
Expand All @@ -64,6 +61,10 @@ async fn socks_proxy_request() {
let url = Url::parse("http://192.168.0.1/").unwrap();

let client = Client::builder()
// HACK: set `pool_max_idle_per_host` to `0` to avoid an issue in the underlying
// `hyper` library that causes the `reqwest` client to hang in some cases.
// See <https://github.com/hyperium/hyper/issues/2312> for more details.
.pool_max_idle_per_host(0)
.proxy(Proxy::custom(env_proxy))
.timeout(Duration::from_secs(1))
.build()
Expand Down
Loading