cmd/dcrdex: Unable to stop app after shutting down dcrd node. #321
I've seen this too. Probably start looking around dcrdex/client/asset/btc/btc.go lines 346 to 347 (at a83ce99).
@JoeGruffins you want to investigate?
sure!
Part of updating our dcrd/rpcclient dependency will be using the new context.Context input arg. That might be related if it's an RPC that hangs. That won't help with btc though, since I don't think its rpcclient is updated.
I have tracked this down as being a function of our rpcclient. The dcrd backend polls for best blocks: dcrdex/server/asset/dcr/dcr.go line 451 (at 6778508).

When there is no connection and auto-reconnect is on, dcrd's rpcclient does not return an error. It saves all requests to fire them all at once when a reconnect happens, so the poll blocks there when not connected and never makes it past that call. This issue can be solved by turning off auto-reconnect here: dcrdex/server/asset/dcr/dcr.go lines 768 to 774 (at 6778508),
by adding the line:

```go
DisableAutoReconnect: true,
```
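For illustration, a minimal sketch of a client configured this way. The helper name, connection details, and the rpcclient module version are assumptions for the example, not dcrdex's actual code:

```go
import "github.com/decred/dcrd/rpcclient/v5" // version assumed for illustration

// newNodeClient is a hypothetical helper, not dcrdex's actual code.
func newNodeClient(host, user, pass string, certs []byte) (*rpcclient.Client, error) {
	connCfg := &rpcclient.ConnConfig{
		Host:         host,
		Endpoint:     "ws",
		User:         user,
		Pass:         pass,
		Certificates: certs,
		// With auto-reconnect disabled, a call made while disconnected
		// fails immediately with an error instead of being queued until
		// a reconnect happens.
		DisableAutoReconnect: true,
	}
	return rpcclient.New(connCfg, nil)
}
```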
We "solved" a similar issue in dcrstakpool by writing a reconnect function. jrick also has a convenient package https://github.com/jrick/wsrpc that I think simplifies things. However, we would still need to handle reconnects manually. Also, the client does reconnect and shutdown continues with auto-reconnect on when using testnet. For some reason it does not seem to work the same when using the simnet harness.
Or maybe this? Not sure what it is yet, but passing a context and having the client stop waiting on ctx.Done() would also solve the issue.
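That idea could be sketched like this: run the potentially-blocking call in a goroutine and give up waiting when the context is cancelled. bestBlockHash is a hypothetical wrapper for illustration, not existing dcrdex code, and the rpcclient version is assumed:

```go
import (
	"context"

	"github.com/decred/dcrd/chaincfg/chainhash"
	"github.com/decred/dcrd/rpcclient/v5" // version assumed
)

// bestBlockHash runs the potentially-blocking RPC in a goroutine and stops
// waiting when ctx is done, so shutdown can proceed even if the call sits
// queued behind an auto-reconnect. Note the abandoned goroutine lingers
// until the client itself shuts down.
func bestBlockHash(ctx context.Context, c *rpcclient.Client) (*chainhash.Hash, error) {
	type result struct {
		hash *chainhash.Hash
		err  error
	}
	ch := make(chan result, 1)
	go func() {
		hash, err := c.GetBestBlockHash()
		ch <- result{hash, err}
	}()
	select {
	case <-ctx.Done():
		return nil, ctx.Err()
	case res := <-ch:
		return res.hash, res.err
	}
}
```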
PR #325 looks good to fix the hang, but let's open an issue or two for: (1) running with auto-reconnect disabled to prevent RPC calls from simply hanging, and (2) using the newer rpcclient API that takes a context.Context.
@JoeGruffins Assuming we stick with rpcclient, could you investigate the two approaches I named? What are the pros/cons of autoreconnect, and does the Context-enabled rpcclient API effectively fix the issues without requiring manual reconnect code?
Is the rpcclient that uses a context dcrd/rpcclient/v6? If so, I don't see where it has changed to use a context throughout.
There's a

```go
// GetBestBlockHash returns the hash of the best block in the longest block
// chain.
func (c *Client) GetBestBlockHash(ctx context.Context) (*chainhash.Hash, error) {
	return c.GetBestBlockHashAsync(ctx).Receive()
}
```

Currently that's
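Assuming v6 wires that context through to the transport, cancelling it on shutdown would look something like this sketch. Here client is assumed to be an already-connected v6 *rpcclient.Client, and pollOnce is a hypothetical example function:

```go
import (
	"context"
	"log"
	"time"

	"github.com/decred/dcrd/rpcclient/v6" // assumed module path
)

// pollOnce is a hypothetical example using the context-aware v6 API.
// Cancelling ctx (e.g. on shutdown, or via this timeout) should unblock
// the call instead of leaving it queued behind an auto-reconnect.
func pollOnce(client *rpcclient.Client) error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	hash, err := client.GetBestBlockHash(ctx)
	if err != nil {
		return err // ctx cancellation or an RPC failure surfaces here
	}
	log.Printf("best block: %s", hash)
	return nil
}
```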
It looks like that context doesn't concern websocket connections, only HTTP. Will test out v6 anyway to see if canceling that ctx has the desired effect for us.
In https://github.com/decred/dcrd/blob/ce2195fbc3de0ee60ebaebfe7a967f0d8b041498/rpcclient/infrastructure.go#L668 it looks like that ctx is ignored until a reconnect is achieved.
Ohhhh, that's unfortunate. Seems like if the context is cancelled then it should cease reconnection attempts. |
rpcclient/v6 should now be able to handle this better: decred/dcrd#2198. I guess closing this issue will depend on v6 being finalized, and on us using it.
If you shut down your dcrd node/asset before shutting down the server, the server hangs on shutdown. The logs stop here:
Starting the node back up does not seem to help.