
feat: add support for idle connection timeouts and automatic reconnection #16

Closed · wants to merge 13 commits

Conversation

estensen (Contributor) commented Mar 4, 2025

⚠️ Abandoning this in favor of #19

Like http.Client, we don't set timeouts by default, which can lead to connection leaks and silent disconnections in long-running applications.

This PR adds an IdleTimeout option to configure the timeout duration. Broken connections are also reconnected automatically.

To keep things backwards compatible, IdleTimeout defaults to 0, so the feature is disabled unless explicitly enabled.

Added a test to simulate a network interruption; will add it to CI next:

```
FIBER_API_KEY="XXX" RUN_RECONNECTION_TEST=true go test -v -run TestReconnection
=== RUN   TestReconnection
    smoke_test.go:31: ========== CONNECTION SETUP ==========
    smoke_test.go:61: Waiting for initial transactions before testing reconnection...
    smoke_test.go:75: Received initial tx 1: 0xbaf0d73d5c424c7a00796938ae6148adf6fa3fde7e80a617776f8c6b98b3cecf
    smoke_test.go:77: Received sufficient initial transactions
    smoke_test.go:86: Current connection state before disconnect: READY
    smoke_test.go:89: Simulating network interruption...
    smoke_test.go:94: Waiting for reconnection...
Stream error, reconnecting: rpc error: code = Canceled desc = grpc: the client connection is closing
Subscription error, retrying in 1.46767385s: rpc error: code = Canceled desc = grpc: the client connection is closing
Subscription error, retrying in 2.24161844s: rpc error: code = Canceled desc = grpc: the client connection is closing
Subscription error, retrying in 2.640983165s: rpc error: code = Canceled desc = grpc: the client connection is closing
    smoke_test.go:119: RECONNECTION DETECTED after 21.863080708s - Transaction received after disconnect
    smoke_test.go:121: Current connection state after reconnect: READY
    smoke_test.go:125: Post-reconnect tx: 0x7c65daf613dc42da4db0252c19e25baaac1a4fee3bd5a6976446224a5d5e63df
    smoke_test.go:125: Post-reconnect tx: 0x8e26e2bd6796dd4953cc11fc1fc11183082f7ccda4bb61ca0f5efec044947e47
    smoke_test.go:125: Post-reconnect tx: 0xf355b293674e8670eeb736c5a271bbf40064cc21fa47d0f5d609a8a4c230658c
    smoke_test.go:130: Successfully confirmed reconnection with 3 transactions
    smoke_test.go:149: SUCCESS: Client reconnected after 21.863080708s
--- PASS: TestReconnection (22.62s)
Stream error, reconnecting: rpc error: code = Canceled desc = grpc: the client connection is closing
PASS
ok      github.com/chainbound/fiber-go  22.869s
```

@estensen estensen requested a review from mempirate March 4, 2025 15:23
@mempirate (Contributor) commented:

@estensen can we double check that this will be compatible with the Fiber grpc server? https://github.com/grpc/grpc-go/blob/master/Documentation/keepalive.md


estensen commented Mar 5, 2025

> @estensen can we double check that this will be compatible with the Fiber grpc server? https://github.com/grpc/grpc-go/blob/master/Documentation/keepalive.md

To add more context: the concern here is that the server might disconnect clients that send keepalive pings too often:
https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md#server-enforcement

I've reviewed the Fiber server and Tonic and can't see that either will disconnect clients because of keepalives.
I'll write a small program and smoke-test this before merging.

@estensen estensen changed the title feat: add support for idle connection timeouts feat: add support for idle connection timeouts and automatic reconnection Mar 5, 2025
streams.go Outdated

```go
// Use exponential backoff with jitter
sleepTime := backoff + time.Duration(rand.Int63n(int64(backoff/2)))
fmt.Printf("Subscription error, retrying in %v: %v\n", sleepTime, err)
```
estensen (Contributor Author):

We should probably implement a logger instead of this. But can we live with it?

Contributor:

Have gotten feedback that people don't like it, so let's rm it for now and do a logger later

@mempirate left a review:


Nice job, some questions and nits

```go
if err == nil {
	// Successfully created new connection
	c.conn = newConn
	c.client = api.NewAPIClient(newConn)
```
Will this also reconnect any existing streams that are open? Or should we do that manually?


```go
// Use exponential backoff with jitter
sleepTime := *backoff + time.Duration(rand.Int63n(int64(*backoff/2)))
fmt.Printf("%s subscription error, retrying in %v: %v\n", subscriptionName, sleepTime, err)
```
Suggested change (delete this line):
```go
fmt.Printf("%s subscription error, retrying in %v: %v\n", subscriptionName, sleepTime, err)
```

```go
	break
}

fmt.Printf("Stream error, reconnecting: %v\n", err)
```
Suggested change (delete this line):
```go
fmt.Printf("Stream error, reconnecting: %v\n", err)
```

```go
time.Sleep(time.Second * 2)
continue outer
if err == io.EOF {
	fmt.Println("Stream completed")
```
Suggested change (delete this line):
```go
fmt.Println("Stream completed")
```

```go
if err := tx.UnmarshalBinary(proto.RlpTransaction); err != nil {
	continue outer
if err := tx.UnmarshalBinary(msg.RlpTransaction); err != nil {
	fmt.Printf("Error decoding transaction: %v\n", err)
```
Suggested change (delete this line):
```go
fmt.Printf("Error decoding transaction: %v\n", err)
```

```go
time.Sleep(time.Second * 2)
continue outer
if err == io.EOF {
	fmt.Println("Stream completed")
```
Suggested change (delete this line):
```go
fmt.Println("Stream completed")
```

```go
	break
}

fmt.Printf("Block stream error, reconnecting: %v\n", err)
```
Suggested change (delete this line):
```go
fmt.Printf("Block stream error, reconnecting: %v\n", err)
```

```go
}

if decodeErr != nil {
	fmt.Printf("Failed to decode execution payload: %v\n", decodeErr)
```
Suggested change (delete this line):
```go
fmt.Printf("Failed to decode execution payload: %v\n", decodeErr)
```

```go
time.Sleep(time.Second * 2)
continue outer
if err == io.EOF {
	fmt.Println("Stream completed")
```
Suggested change (delete this line):
```go
fmt.Println("Stream completed")
```

```go
	break
}

fmt.Printf("Beacon block stream error, reconnecting: %v\n", err)
```
Suggested change (delete this line):
```go
fmt.Printf("Beacon block stream error, reconnecting: %v\n", err)
```

@estensen estensen mentioned this pull request Mar 5, 2025
@estensen estensen marked this pull request as draft March 5, 2025 20:24
@estensen estensen closed this Mar 31, 2025