Thread leak in transportMonitor / resetAddrConn #1402

Closed · snktagarwal opened this issue Jul 28, 2017 · 7 comments
@snktagarwal commented Jul 28, 2017

Please answer these questions before submitting your issue.

What version of gRPC are you using?

1.5.1

What version of Go are you using (go version)?

go version go1.8.1 darwin/amd64

What operating system (Linux, Windows, …) and version?

Mac

What did you do?

If possible, provide a recipe for reproducing the error.
A server running a duplex (bidirectional streaming) RPC. Simple connection establishment, then tear-down by the client.

Here's the skeleton of the code:

func (s *speechDataServer) SpeechProxyRecognize(stream pb.SpeechProxyService_SpeechProxyRecognizeServer) error {
	// Receive the initial configuration message from the client.
	conf, err := stream.Recv()
	if err != nil {
		return err
	}

	log.Printf("Stream initiated conf: %+v\n", conf)

	return nil
}
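
If it helps, here's a rough sketch of the client side that establishes and tears down the stream (the address, request, and generated client names are placeholders, not our exact code):

conn, err := grpc.Dial("localhost:50051", grpc.WithInsecure())
if err != nil {
	log.Fatal(err)
}
client := pb.NewSpeechProxyServiceClient(conn)

stream, err := client.SpeechProxyRecognize(context.Background())
if err != nil {
	log.Fatal(err)
}
_ = stream.Send(req)   // send the initial config message
_ = stream.CloseSend() // client-side tear-down of the stream
// Note: conn.Close() is never called in this sketch.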

What did you expect to see?

No thread (goroutine) leak.

What did you see instead?

Pprof sees this:

14 @ 0x102c2aa 0x103b504 0x103a16c 0x13975c6 0x13a9489 0x1057fb1
#	0x13975c5	vendor/google.golang.org/grpc.(*addrConn).transportMonitor+0xa15	/go/src/vendor/google.golang.org/grpc/clientconn.go:942
#	0x13a9488	vendor/google.golang.org/grpc.(*ClientConn).resetAddrConn.func1+0x1d8	/go/src/vendor/google.golang.org/grpc/clientconn.go:640

Every RPC call leaves behind one more such leaked goroutine.
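
For completeness, the dump above is from the goroutine profile; a minimal sketch of how to expose it with the standard library (assuming a spare port, nothing gRPC-specific):

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers
)

func init() {
	go func() {
		// Dump all goroutine stacks with:
		//   curl http://localhost:6060/debug/pprof/goroutine?debug=2
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()
}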

@menghanl (Contributor)

The goroutines you see monitor the underlying connection for each addrConn. They are not per-RPC, so it's expected that they are still there when the RPC is done.

Did you see those goroutines still blocked even after the ClientConn was closed?

@snktagarwal (Author)

IIUC, you're suggesting that the RPC tear-down may be happening properly, and that we're just not tearing down the backbone (i.e. the connection sockets) these RPCs run on?

Is this something the server should do, or should the client initiate it? We have a bidirectional RPC.

@menghanl (Contributor)

I'm not sure what you're asking. I think the RPC finishes as expected; the goroutine you saw is not related to the RPC.
Did you see one more goroutine created and blocked for each RPC?

To explain the goroutine you saw in more detail:
We maintain a connection pool on the client side to send RPCs, as part of ClientConn. When sending an RPC, one connection is picked from the pool. When the RPC finishes, we don't close the connection, because it can be reused by other RPCs.
The goroutine you saw is bound to the connection, to make sure we always have a working connection to use.
These goroutines exit when the user decides to close the ClientConn.
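
A minimal sketch of that lifecycle (the address and generated client are placeholders):

// The monitor goroutines live as long as the ClientConn, not the RPC.
conn, err := grpc.Dial("localhost:50051", grpc.WithInsecure())
if err != nil {
	log.Fatal(err)
}
defer conn.Close() // closing the ClientConn is what stops them

client := pb.NewSpeechProxyServiceClient(conn)
// Any number of RPCs can reuse conn; an RPC finishing does not close
// the underlying connection or its monitor goroutine.
_ = client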

@snktagarwal (Author)

Thanks for the detailed explanation.

Two comments:

  1. We see one more such goroutine (on the server) for each RPC that is called. The number keeps growing with every call.
  2. Can you expand on "the user decides to close the ClientConn"? Is this a tear-down the client needs to initiate? When does it happen?

@breznik commented Jul 29, 2017

Does the same apply to the NodeJS gRPC client, i.e. a pool of underlying transport clients handling the individual RPC calls? Is there a best practice for terminating a bidirectional stream between a NodeJS gRPC client and a Go gRPC server?

@menghanl (Contributor)

It's not expected to see those goroutines on the server side; they should be client-side only.
In your service handler, do you create a new ClientConn and send new RPCs? If yes, did you close that ClientConn? (See the sketch below.)

"The user decides to close the ClientConn" means the point at which the user calls ClientConn.Close().

I'm not sure about the behavior of a NodeJS client, but I would expect it to be somewhat similar to Go.

We currently don't have a best-practices doc, but there's nothing special I can think of about terminating a bidirectional stream. The routeguide example we have should already cover everything.
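
To illustrate the pattern I'm asking about, a hypothetical handler that dials a backend per request (all names here are made up):

// Leaky version: the ClientConn is never closed, so its monitor
// goroutines outlive the RPC and accumulate with every request.
func (s *server) handleLeaky(ctx context.Context, req *pb.Request) (*pb.Response, error) {
	conn, err := grpc.Dial("backend:50051", grpc.WithInsecure())
	if err != nil {
		return nil, err
	}
	backend := pb.NewBackendClient(conn)
	return backend.Do(ctx, req) // conn leaks here
}

// Fixed version: close the ClientConn when done. Better yet, dial once
// at startup and share a single conn across all requests.
func (s *server) handleFixed(ctx context.Context, req *pb.Request) (*pb.Response, error) {
	conn, err := grpc.Dial("backend:50051", grpc.WithInsecure())
	if err != nil {
		return nil, err
	}
	defer conn.Close() // monitor goroutines exit when the conn closes
	backend := pb.NewBackendClient(conn)
	return backend.Do(ctx, req)
}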

@dfawley (Member) commented Aug 24, 2017

Please comment if you still believe this is a leak after making sure to close the ClientConns properly.

@dfawley closed this as completed Aug 24, 2017
lock bot locked as resolved and limited conversation to collaborators Sep 26, 2018