net/http: transport MaxIdleConnsPerHost does not work as expected #13801
CC @bradfitz
This has been like this forever, so it's not Go 1.6 material. It's also somewhat involved.
CL https://golang.org/cl/18282 mentions this issue.
…n.readLoop

In debugging the flaky test in #13825, I discovered that my previous change to tighten and simplify the communication protocol between Transport.roundTrip and persistConn.readLoop in https://golang.org/cl/17890 wasn't complete.

This change simplifies it further: the buffered-vs-unbuffered complexity goes away, and we no longer need to re-try channel reads in the select case. It was trying to prioritize channels in the case that two were readable in the select. (It was only failing in the race builder because the race builds randomize select scheduling.)

The problem was that in the bodyless response case we had to return the idle connection before replying to roundTrip. But putIdleConn previously both added it to the free list (which we wanted) and also closed the connection, which made the caller goroutine (Transport.roundTrip) have two readable cases: pc.closech, and the response. We guarded against similar conditions in the caller's select for two readable channels, but such a fix wasn't possible here, and would be overly complicated.

Instead, switch to unbuffered channels. The buffered channels were only there to prevent goroutine leaks, so address that differently: add a "callerGone" channel closed by the caller on exit, and select on that during any unbuffered sends.

As part of the fix, split putIdleConn into two halves: a part that just returns to the free list, and a part that also closes. Update the four callers to the variants each wanted.

Incidentally, the connections were closing on return to the pool due to MaxIdleConnsPerHost (somewhat related: #13801), but this bug could've manifested for plenty of other reasons.

Fixes #13825

Change-Id: I6fa7136e2c52909d57a22ea4b74d0155fdf0e6fa
Reviewed-on: https://go-review.googlesource.com/18282
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
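For readers unfamiliar with the pattern that commit describes, here is a minimal sketch of the "callerGone" idea, not the actual net/http internals: an unbuffered send is guarded by a channel that the caller closes on exit, so the sending goroutine cannot leak if the caller has already gone away. All names below (worker, result, callerGone) are illustrative.

```go
package main

import "fmt"

type result struct{ value string }

// worker tries to hand its result to the caller over an unbuffered channel.
// If the caller has exited (callerGone is closed), it drops the result
// instead of blocking forever.
func worker(out chan<- result, callerGone <-chan struct{}) {
	r := result{value: "response"}
	select {
	case out <- r: // caller is still waiting; deliver the result
	case <-callerGone: // caller exited; give up without leaking this goroutine
	}
}

func main() {
	out := make(chan result)          // unbuffered: delivery is synchronous
	callerGone := make(chan struct{}) // closed when the caller returns
	defer close(callerGone)

	go worker(out, callerGone)
	fmt.Println((<-out).value)
}
```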
Related, probably to be addressed at the same time: #13957 (limiting the number of in-flight dials per host)
Related also: we should close idle conns in some defined order, e.g. the least recently used one gets closed first.
That happened in Go 1.7 at least. The rest of this bug didn't happen, though. |
Also not happening for Go 1.8. |
Otherwise it defaults to 2, and Go will close extra connections as soon as a request is finished. See golang/go#13801
We found that when using Credhub with Concourse, there was a massive spike in CPU usage. We profiled the ATC (the Concourse component that interacts with Credhub) and found that >90% of samples were in TLS handshake methods. We did not profile Credhub, but presumably it was much the same. This commit ensures HTTP response bodies are read to completion, which is required for net/http to reuse the connection and avoid a costly TLS handshake. It also increases the maximum number of idle HTTP connections to 100, from the default of 2, which is required to work around limitations in net/http: golang/go#13801 Signed-off-by: Ife Runsewe <ife.runsewe@engineerbetter.com>
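The fix those commits describe boils down to two things: always drain and close the response body so the connection can go back into the idle pool, and raise MaxIdleConnsPerHost above the default of 2. A minimal sketch; the URL and the limit of 100 are illustrative, not taken from the commits:

```go
package main

import (
	"io"
	"net/http"
)

func main() {
	transport := &http.Transport{
		MaxIdleConns:        100,
		MaxIdleConnsPerHost: 100, // default is DefaultMaxIdleConnsPerHost (2)
	}
	client := &http.Client{Transport: transport}

	resp, err := client.Get("https://example.com/")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Reading the body to EOF lets the Transport reuse the connection
	// instead of closing it and paying for a new TCP + TLS handshake
	// on the next request.
	io.Copy(io.Discard, resp.Body)
}
```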
We will connect many times in parallel to the same DynamoDB server, and with default settings Go will close and re-open connections; see golang/go#13801 Raise MaxIdleConnsPerHost to avoid this. Signed-off-by: Bryan Boreham <bryan@weave.works>
This commit limits the number of concurrent connections to encourage connection reuse. `MaxIdleConnsPerHost` was already set to work around golang/go#13801 ("issue which made Go consume all ephemeral ports"). We observed a similar issue when starting up Loki (using the weaveworks `common/aws` package through Cortex), which resulted in thousands of very small S3 requests until memcached was seeded. This failed because of ephemeral port exhaustion: while the application would in fact have reused connections, it sent them all concurrently at startup, and most of the requests failed before any reuse could happen. `MaxConnsPerHost` can be used to limit the overall number of parallel connections.
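A rough sketch of combining the two knobs that commit mentions: MaxIdleConnsPerHost keeps finished connections around for reuse, while MaxConnsPerHost (available since Go 1.11) puts a hard cap on dialing, active, and idle connections to a host, so a burst of concurrent requests queues instead of opening a new socket per request. The value 100 is only an example.

```go
package main

import "net/http"

// newClient returns a client whose Transport both keeps idle connections
// for reuse and caps the total number of connections per host.
func newClient() *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			MaxIdleConnsPerHost: 100, // keep finished connections for reuse (default is 2)
			MaxConnsPerHost:     100, // cap on dialing + active + idle connections per host
		},
	}
}

func main() {
	client := newClient()
	_ = client // requests made with this client queue rather than exhausting ephemeral ports
}
```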
I recently had an issue where Go consumed all ephemeral ports and started failing to make requests.
The problem was triggered because I was doing 10 concurrent requests with the default DefaultMaxIdleConnsPerHost, so the Transport was only reusing 2 connections and closing the other 8 as soon as each request finished, just to re-establish them again.
This seems like really bad behavior, and it has never happened to me with any other language or HTTP client.
A connection is usually considered idle when it has not been used for some period of inactivity, perhaps a whole keepalive period or so, not immediately after a request has completed successfully.
The behavior one would expect is to have 10 established connections and, after say 30 seconds of not issuing any requests, only 2 connections left established, as per DefaultMaxIdleConnsPerHost.
Here is a gist that triggers this behavior.
I'm running go 1.5.2 on Ubuntu Precise 12.04, kernel: Linux e-0000e768 3.8.0-44-generic #66~precise1-Ubuntu SMP Tue Jul 15 04:01:04 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
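The gist itself isn't reproduced here; the following is only an illustrative approximation of the scenario described above: 10 concurrent requests against one host with a default Transport, after which only DefaultMaxIdleConnsPerHost (2) connections are kept idle and the rest are torn down. The local URL is a placeholder for whatever server you are hammering.

```go
package main

import (
	"io"
	"net/http"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := http.Get("http://localhost:8080/") // placeholder endpoint
			if err != nil {
				return
			}
			defer resp.Body.Close()
			// The body is fully read, yet with the default Transport most of
			// these connections are still closed once the requests finish.
			io.Copy(io.Discard, resp.Body)
		}()
	}
	wg.Wait()
}
```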