Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow all RPC messages on disconnect #5876

Merged
merged 2 commits into from
Jun 7, 2024

Conversation

AgeManning
Copy link
Member

This PR makes it more permissive for outbound RPC messages to reach the application layer.

Previously, RPC requests were not getting stream terminations in race conditions between a peer disconnecting and the RPC handler propagating the message up.

This PR now allows messages to pass up to the application layer in the events that we start disconnecting from a peer. This will even allow slower messages to be sent to the application layer after peers have been disconnected.

This should help resolve further sync lookup bugs.

@AgeManning AgeManning added the v5.2.0 Q2 2024 label Jun 2, 2024
@AgeManning AgeManning requested review from jxs and dapplion June 2, 2024 05:09
};
if !self.peer_manager().is_connected(&peer_id)
// Do not permit Inbound events from peers that are being disconnected
&& matches!(event.event, HandlerEvent::Err(HandlerErr::Inbound { .. }))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you want to suppress Inbound errors? Is there any known scoring issue that can happen here? We should document why if that's the case

@dapplion
Copy link
Collaborator

dapplion commented Jun 2, 2024

@AgeManning we can move the is_connected check to the main match. I am not convinced of having to ignore inbound errors. I've this commit moving the check to the match + only ignoring inbound requests 32c8fce

@AgeManning
Copy link
Member Author

Yeah fair point. I don't recall off the top of my head, I imagine there are few sneaky edge cases, like peers instantly dropping connections or something when we send a goodbye.

Your commit does look cleaner, but I would test it before merging it in here. You can follow this guide: https://github.com/sigp/lighthouse/tree/stable/testing/network_testing

To run a fresh node on mainnet without syncing or anything and see if it has any weird errors. If not, lets do it.

michaelsproul added a commit that referenced this pull request Jun 3, 2024
Squashed commit of the following:

commit 6d42ebc
Author: Age Manning <Age@AgeManning.com>
Date:   Fri May 31 20:21:26 2024 +1000

    Permit rpc messages on disconnect
@michaelsproul michaelsproul mentioned this pull request Jun 3, 2024
jimmygchen added a commit that referenced this pull request Jun 3, 2024
Squashed commit of the following:

commit 6d42ebc
Author: Age Manning <Age@AgeManning.com>
Date:   Fri May 31 20:21:26 2024 +1000

    Permit rpc messages on disconnect
michaelsproul pushed a commit that referenced this pull request Jun 4, 2024
Squashed commit of the following:

commit 6d42ebc
Author: Age Manning <Age@AgeManning.com>
Date:   Fri May 31 20:21:26 2024 +1000

    Permit rpc messages on disconnect
Comment on lines 1383 to 1384
&& matches!(event.event, HandlerEvent::Err(HandlerErr::Inbound { .. }))
&& matches!(event.event, HandlerEvent::Ok(RPCReceived::Request(..)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this condition only hits if event.event is both an Err and Ok at the same time

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing the intent was && should be ||

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed this change

michaelsproul added a commit that referenced this pull request Jun 6, 2024
Squashed commit of the following:

commit de97efc
Author: Michael Sproul <michael@sigmaprime.io>
Date:   Thu Jun 6 12:13:28 2024 +1000

    Use or instead of and

commit 6d42ebc
Author: Age Manning <Age@AgeManning.com>
Date:   Fri May 31 20:21:26 2024 +1000

    Permit rpc messages on disconnect
@michaelsproul michaelsproul force-pushed the permit-rpc-on-disconnect branch from c4f7811 to de97efc Compare June 6, 2024 06:57
@michaelsproul
Copy link
Member

I pushed a commit here and then force pushed it out of existence, because I misread some logs on our infra thinking that the Ignoring rpc message of disconnecting peer was being logged for outbound messages. This was only the case on 3 hosts that weren't actually running this branch 🙃 So all is well

@michaelsproul
Copy link
Member

@Mergifyio queue

Copy link

mergify bot commented Jun 7, 2024

queue

✅ The pull request has been merged automatically

The pull request has been merged automatically at 22fe0a6

mergify bot added a commit that referenced this pull request Jun 7, 2024
@mergify mergify bot merged commit 22fe0a6 into sigp:unstable Jun 7, 2024
53 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
v5.2.0 Q2 2024
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants