Potential race in resty with combination of retries and the buffers pool #225

Closed
hedwigz opened this issue Feb 12, 2019 · 30 comments

@hedwigz

hedwigz commented Feb 12, 2019

We have a golang project that uses resty as the http client.
We started using the golang race detector in our staging environment and encountered the following race errors:

==================
WARNING: DATA RACE
Write at 0x00c00025e610 by goroutine 149:
  bytes.(*Buffer).Read()
      /usr/local/go/src/bytes/buffer.go:299 +0x4a
  io/ioutil.(*nopCloser).Read()
      <autogenerated>:1 +0x87
  net/http.(*http2clientStream).writeRequestBody()
      /usr/local/go/src/net/http/h2_bundle.go:7825 +0x635
  net/http.(*http2Transport).getBodyWriterState.func1()
      /usr/local/go/src/net/http/h2_bundle.go:9040 +0x13d

Previous write at 0x00c00025e610 by goroutine 109:
  bytes.(*Buffer).Read()
      /usr/local/go/src/bytes/buffer.go:299 +0x4a
  io/ioutil.(*nopCloser).Read()
      <autogenerated>:1 +0x87
  net/http.(*http2clientStream).writeRequestBody()
      /usr/local/go/src/net/http/h2_bundle.go:7825 +0x635
  net/http.(*http2Transport).getBodyWriterState.func1()
      /usr/local/go/src/net/http/h2_bundle.go:9040 +0x13d

Goroutine 149 (running) created at:
  net/http.http2bodyWriterState.scheduleBodyWrite()
      /usr/local/go/src/net/http/h2_bundle.go:9087 +0xb8
  net/http.(*http2ClientConn).roundTrip()
      /usr/local/go/src/net/http/h2_bundle.go:7626 +0x81b
  net/http.(*http2Transport).RoundTripOpt()
      /usr/local/go/src/net/http/h2_bundle.go:7037 +0x22d
  net/http.(*http2Transport).RoundTrip()
      /usr/local/go/src/net/http/h2_bundle.go:6999 +0x4b
  net/http.(*Transport).roundTrip()
      /usr/local/go/src/net/http/transport.go:463 +0xa21
  net/http.(*Transport).RoundTrip()
      /usr/local/go/src/net/http/roundtrip.go:17 +0x42
  net/http.send()
      /usr/local/go/src/net/http/client.go:250 +0x304
  net/http.(*Client).send()
      /usr/local/go/src/net/http/client.go:174 +0x1ca
  net/http.(*Client).do()
      /usr/local/go/src/net/http/client.go:641 +0x53a
  net/http.(*Client).Do()
      /usr/local/go/src/net/http/client.go:509 +0x42
  github.com/Rookout/golib/vendor/gopkg.in/resty%2ev1.(*Client).execute()
      /go/src/github.com/Rookout/golib/vendor/gopkg.in/resty.v1/client.go:819 +0x3dd
  github.com/Rookout/golib/vendor/gopkg.in/resty%2ev1.(*Request).Execute.func1()
      /go/src/github.com/Rookout/golib/vendor/gopkg.in/resty.v1/request.go:482 +0x1c4
  github.com/Rookout/golib/vendor/gopkg.in/resty%2ev1.Backoff()
      /go/src/github.com/Rookout/golib/vendor/gopkg.in/resty.v1/retry.go:85 +0x330
  github.com/Rookout/golib/vendor/gopkg.in/resty%2ev1.(*Request).Execute()
      /go/src/github.com/Rookout/golib/vendor/gopkg.in/resty.v1/request.go:476 +0x58c
  github.com/Rookout/golib/pkg/routes.(*CrudSession).executeInternal()
      /go/src/github.com/Rookout/golib/pkg/routes/crud_session.go:168 +0x47d
  github.com/Rookout/golib/pkg/routes.(*CrudSession).Execute()
      /go/src/github.com/Rookout/golib/pkg/routes/crud_session.go:175 +0xb8
  github.com/Rookout/golib/pkg/routes.(*CrudClient).Post()
      /go/src/github.com/Rookout/golib/pkg/routes/crud_client.go:42 +0x132
  github.com/Rookout/golib/pkg/control.(*Notifications).SendRookRuleStatus.func1()
      /go/src/github.com/Rookout/golib/pkg/control/notifications.go:172 +0x13c

Goroutine 109 (finished) created at:
  net/http.http2bodyWriterState.scheduleBodyWrite()
      /usr/local/go/src/net/http/h2_bundle.go:9087 +0xb8
  net/http.(*http2ClientConn).roundTrip()
      /usr/local/go/src/net/http/h2_bundle.go:7626 +0x81b
  net/http.(*http2Transport).RoundTripOpt()
      /usr/local/go/src/net/http/h2_bundle.go:7037 +0x22d
  net/http.(*http2Transport).RoundTrip()
      /usr/local/go/src/net/http/h2_bundle.go:6999 +0x4b
  net/http.http2noDialH2RoundTripper.RoundTrip()
      /usr/local/go/src/net/http/h2_bundle.go:1019 +0x46
  net/http.(*Transport).roundTrip()
      /usr/local/go/src/net/http/transport.go:415 +0x1269
  net/http.(*Transport).RoundTrip()
      /usr/local/go/src/net/http/roundtrip.go:17 +0x42
  net/http.send()
      /usr/local/go/src/net/http/client.go:250 +0x304
  net/http.(*Client).send()
      /usr/local/go/src/net/http/client.go:174 +0x1ca
  net/http.(*Client).do()
      /usr/local/go/src/net/http/client.go:641 +0x53a
  net/http.(*Client).Do()
      /usr/local/go/src/net/http/client.go:509 +0x42
  github.com/Rookout/golib/vendor/gopkg.in/resty%2ev1.(*Client).execute()
      /go/src/github.com/Rookout/golib/vendor/gopkg.in/resty.v1/client.go:819 +0x3dd
  github.com/Rookout/golib/vendor/gopkg.in/resty%2ev1.(*Request).Execute.func1()
      /go/src/github.com/Rookout/golib/vendor/gopkg.in/resty.v1/request.go:482 +0x1c4
  github.com/Rookout/golib/vendor/gopkg.in/resty%2ev1.Backoff()
      /go/src/github.com/Rookout/golib/vendor/gopkg.in/resty.v1/retry.go:85 +0x330
  github.com/Rookout/golib/vendor/gopkg.in/resty%2ev1.(*Request).Execute()
      /go/src/github.com/Rookout/golib/vendor/gopkg.in/resty.v1/request.go:476 +0x58c
  github.com/Rookout/golib/pkg/routes.(*CrudSession).executeInternal()
      /go/src/github.com/Rookout/golib/pkg/routes/crud_session.go:168 +0x47d
  github.com/Rookout/golib/pkg/routes.(*CrudSession).Execute()
      /go/src/github.com/Rookout/golib/pkg/routes/crud_session.go:175 +0xb8
  github.com/Rookout/golib/pkg/routes.(*CrudClient).Post()
      /go/src/github.com/Rookout/golib/pkg/routes/crud_client.go:42 +0x132
  github.com/Rookout/golib/pkg/control.(*Notifications).SendRookRuleStatus.func1()
      /go/src/github.com/Rookout/golib/pkg/control/notifications.go:172 +0x13c
==================

The code is not open source atm so I cannot show you exactly what we're doing, but the main takeaways are:

  1. We share a single client across multiple goroutines
  2. We create a new request struct for every request we execute (client.NewRequest())
  3. We use the retry feature (roughly the pattern sketched below)

Any idea what the problem is here? Is it a false positive or not?
Thanks
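For reference, a minimal sketch of that usage pattern with resty v1 (the URL, body, and retry count are illustrative assumptions, not our actual code):

```go
package main

import (
	"sync"

	resty "gopkg.in/resty.v1"
)

func main() {
	// One client shared across all goroutines, with retries enabled.
	client := resty.New().SetRetryCount(3)

	var wg sync.WaitGroup
	for g := 0; g < 10; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// A fresh Request per call (client.NewRequest()), with a []byte body.
			_, _ = client.NewRequest().
				SetBody([]byte(`{"status":"ok"}`)).
				Post("https://example.com/notify") // illustrative URL
		}()
	}
	wg.Wait()
}
```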
@jeevatkm jeevatkm self-assigned this Feb 12, 2019
@jeevatkm
Copy link
Member

@amits1995 Based on the information you have shared (looking at goroutines 149 & 109 and the stack trace), it does not originate from Resty.

Could you please investigate it further? If you come across a race condition in the resty package, please let me know; I will address it as quickly as possible.

@hedwigz
Author

hedwigz commented Feb 12, 2019

Hey @jeevatkm, thanks for the response.
It looks like the race is on a buffer that the http package is reading from.
It's the buffer the http package reads as the body of the HTTP request, which is created in resty and passed to Go's http package.
Since we pass a []byte as the body to resty, and it is only later wrapped in a buffer inside resty, I believe resty may somehow be reusing the same buffer?
This is my only lead ATM.
WDYT?

@moorereason
Contributor

Are you careful to call response.Body.Close()?
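(A minimal sketch of that pattern with a plain net/http client, for anyone following along; as far as I know, resty normally reads and closes the response body itself unless you opt out of response parsing:)

```go
package main

import (
	"io"
	"io/ioutil"
	"log"
	"net/http"
)

func main() {
	resp, err := http.Get("http://example.com/")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()                   // always close the response body
	_, _ = io.Copy(ioutil.Discard, resp.Body) // drain it so the connection can be reused
}
```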

@jeevatkm
Member

@amits1995 Let me check and get back.

@moorereason the issue seems to be on the request side, not the response side.

@jeevatkm
Member

@amits1995 I had a look. The only scenario in which Resty uses the provided body as-is is io.Reader input. In all other scenarios, it gets a buffer instance from a sync.Pool and releases it back to the pool. Are you using io.Reader?
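A minimal sketch of that pooled-buffer pattern (the helper names mirror releaseBuffer from the patch below; the actual internals may differ):

```go
package main

import (
	"bytes"
	"sync"
)

// One pool of reusable buffers shared by all requests.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

func acquireBuffer() *bytes.Buffer {
	return bufPool.Get().(*bytes.Buffer)
}

func releaseBuffer(buf *bytes.Buffer) {
	if buf != nil {
		buf.Reset()      // drop the old contents
		bufPool.Put(buf) // the buffer may now be handed to another goroutine
	}
}

func main() {
	buf := acquireBuffer()
	buf.WriteString("request body")
	// ... use buf as the request body ...
	releaseBuffer(buf)
}
```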

Also just to try, could you please apply this patch at your end and test it out?
try.patch.zip

@hedwigz
Author

hedwigz commented Feb 13, 2019

We are not using io.Reader, only []byte.
I saw the use of buffers from a sync.Pool and suspected it.
Maybe it's something about the combination of retries and the buffer pool?
Also, I began wondering whether the race detector might actually get confused by the buffer pool. Maybe it detects two goroutines accessing the same buffer at adjacent times, even though that is legitimate because different goroutines are meant to reuse those buffers? I didn't find anything online about that, so I guess we should assume the race detector actually detected something here and it is not a false positive.

As for the patch, it's going to be extremely painful to test at our end, so I'd be happy to know what's in it first :)

@jeevatkm
Member

@amits1995 As I mentioned above, Resty does not share a buffer between requests.

I would like to confirm this -

Maybe something with the combination of retries and the buffers pool?

That's why I shared the patch to test. The patch contains the following (you can extract the zip archive to see it):

diff --git a/client.go b/client.go
index 68d447f..476ae8b 100644
--- a/client.go
+++ b/client.go
@@ -785,7 +785,6 @@ func (c *Client) GetClient() *http.Client {
 
 // executes the given `Request` object and returns response
 func (c *Client) execute(req *Request) (*Response, error) {
-	defer releaseBuffer(req.bodyBuf)
 	// Apply Request middleware
 	var err error
 
diff --git a/request.go b/request.go
index c6adff3..4d02897 100644
--- a/request.go
+++ b/request.go
@@ -468,6 +468,7 @@ func (r *Request) Patch(url string) (*Response, error) {
 // 		resp, err := resty.R().Execute(resty.GET, "http://httpbin.org/get")
 //
 func (r *Request) Execute(method, url string) (*Response, error) {
+	defer releaseBuffer(r.bodyBuf)
 	var addrs []*net.SRV
 	var err error

I could create a branch with the above patch for your testing, if that would make it easier. That way you could test the patch via the go get or go mod mechanism.

Please let me know.

@hedwigz
Author

hedwigz commented Feb 13, 2019

Yes, the branch approach will be great. Thx

jeevatkm added a commit that referenced this issue Feb 13, 2019
@jeevatkm
Member

@amits1995 I have created a branch: https://github.com/go-resty/resty/tree/for-225-try-patch - please give it a try and let me know.

@jeevatkm
Member

@amits1995 Any updates?

@hedwigz
Author

hedwigz commented Feb 15, 2019

It'll take a few days; I'll update as soon as we deploy.

@jeevatkm
Member

Okay, please keep me posted.

@hedwigz
Author

hedwigz commented Feb 19, 2019

Small update - we deployed a version with your patch.
We didn't see any data races coming from Resty but we did see some other races in our code.
We had to revert and we are now investigating the other races.
Once those are fixed we'll deploy the patch again and let it run for a few days.
I also want to make sure that we still get race warnings when we use the non-patched Resty version, just to confirm the race still gets triggered.

@jeevatkm
Member

@amits1995 Thanks for the update. Please keep me posted as you progress.

I'm going to mark this issue as a bug and apply the patch fix in the v1.12.0 release in a day or two.

@jeevatkm jeevatkm changed the title Potential race in resty Potential race in resty with combination of retries and the buffers pool Feb 19, 2019
@jeevatkm jeevatkm added bug and removed support labels Feb 19, 2019
@jeevatkm jeevatkm added this to the v1.12.0 Milestone milestone Feb 19, 2019
@moorereason
Contributor

A test for this bug would be fantastic.

@hedwigz
Author

hedwigz commented Feb 20, 2019

@moorereason I actually tried to reproduce it locally but failed.
What I did was (see the sketch below):

  1. Single resty client
  2. 10 goroutines executing 1000 requests each in a for loop
  3. Ran a local server that returns 500s (to trigger retries)
  4. Also tried running a regular server that returns 200s
  5. Also tried forcing the use of HTTP/2

None of it worked :\
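A minimal sketch of that reproduction attempt, assuming resty v1's retry API (the body, URL, and retry condition are illustrative; the actual code is in the gist linked later in this thread):

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"sync"

	resty "gopkg.in/resty.v1"
)

func main() {
	// Local server that always returns 500 so every request is retried.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusInternalServerError)
	}))
	defer srv.Close()

	// Single shared client, retrying on 500 responses.
	client := resty.New().
		SetRetryCount(2).
		AddRetryCondition(func(r *resty.Response) (bool, error) {
			return r.StatusCode() == http.StatusInternalServerError, nil
		})

	var wg sync.WaitGroup
	for g := 0; g < 10; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				// Fresh Request with a []byte body on every iteration.
				_, _ = client.R().SetBody([]byte(`{"n":1}`)).Post(srv.URL)
			}
		}()
	}
	wg.Wait() // run under `go run -race` to watch for races
}
```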

@jeevatkm
Any chance you could explain the patch?
As far as I understand, within the scope of a single client.execute() there is one buffer release, and a single buffer is acquired during the middleware phase - here
Am I wrong?

@jeevatkm
Member

@moorereason @amits1995 I'm not sure how to reproduce the reported scenario. Technically, the buffer acquire (during middleware execution) and the buffer release both happen within client.execute, so a race condition should not occur. Honestly, I still have doubts about the patch even though I have marked this issue as a possible bug, and I do not have an answer for myself to "how does the patch actually address the race condition?", because moving the buffer release to Request.Execute might make the utilization of sync.Pool less efficient. I will have to think about what to do with the patch I created.

@moorereason can you also review resty request flow and share your thoughts?

@amits1995 your understanding of the patch I did is correct.

@moorereason
Contributor

I asked about a test because it's not clear to me why the proposed patch fixed the reported issue. I'll try to spend some more time reviewing this.

moorereason added a commit to moorereason/resty that referenced this issue Feb 20, 2019
@moorereason
Contributor

I've posted a PR with a failing test case.

@moorereason
Contributor

I apologize. My PR is unsafe for other reasons, with a completely different stack trace from the original report. My PR only proves that resty.Request is not concurrency-safe (a sketch of that pattern follows).

Back to the drawing board...
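For illustration, a minimal sketch of that kind of unsafe sharing, assuming this is roughly the pattern the test exercised (the actual PR test may differ):

```go
package main

import (
	"sync"

	resty "gopkg.in/resty.v1"
)

func main() {
	client := resty.New()
	// One *resty.Request reused across goroutines -- the unsafe pattern.
	req := client.R().SetBody([]byte(`{"k":"v"}`))

	var wg sync.WaitGroup
	for g := 0; g < 10; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// `go run -race` flags concurrent use of the shared Request.
			_, _ = req.Post("http://127.0.0.1:8080/") // illustrative URL
		}()
	}
	wg.Wait()
}
```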

@moorereason
Contributor

@amits1995, what version of Go produced the original stack trace?

@hedwigz
Author

hedwigz commented Feb 20, 2019

@moorereason
It was on Go 1.11
Here's how I tried to reproduce the race, but it didn't work... maybe you can spot what I was missing:
https://gist.github.com/amits1995/501e49e12a83f8340145a72f65bd57f5

@moorereason
Contributor

Thanks for the gist, @amits1995. I've been unable to reproduce the original stack trace after iterating through several different approaches. BTW, I don't believe the race detector would give a false positive. There's a bug somewhere.

I found a very similar data race report at golang/go#25009 (comment). A fix has been committed upstream, but you'll have to build with go1.12rc1 to test it.

@hedwigz
Author

hedwigz commented Feb 21, 2019

@moorereason Interesting! It looks like it might actually be a bug in Go's http2 implementation.
I'll try to reproduce the error on 1.11 and then try 1.12rc.

@jeevatkm
Member

@amits1995 @moorereason Thank you, glad to see the Resty community helping each other.

@hedwigz
Author

hedwigz commented Feb 23, 2019

@jeevatkm Let's close this issue for now and I'll open again if I see any problems after Go's patch.

@jeevatkm
Member

@amits1995 Okay, sure.

@moorereason
Contributor

@amits1995, I know we closed this issue, but I'd like to know if Go 1.12, which was released yesterday, resolves this issue for you, so please report back the results.

@hedwigz
Author

hedwigz commented Feb 26, 2019

@moorereason no worries, I planned on doing so

@maha51ece-tech

How do I resolve the issue of the resty client getting closed for goroutines? My Go version is go1.21.4 darwin/arm64, and I'm using the package "github.com/go-resty/resty/v2".
