feat(swarm)!: Allow NetworkBehaviours to manage incoming connections #3099
The answer also depends on what |
BTW: |
That is right! This is more tricky than I thought. If I wanted to inject a |
My primary worry right now is not the structure of IntoConnectionHandler, so I don’t have good inputs there, but the proposed DialSource enum sounds very useful to me! |
I just had another idea which I think is much better: The reason for passing a Previously, that required really complex tracking of state in the behaviour because of the If we remove this abstraction, it is really easy for That is functionally equivalent to what we have today and makes several APIs a lot simpler. |
swarm/src/lib.rs
Outdated
let supported_protocols = todo!();

// let supported_protocols = self
//     .behaviour
//     .new_handler()
//     .inbound_protocol()
//     .protocol_info()
//     .into_iter()
//     .map(|info| info.protocol_name().to_vec())
//     .collect();
This turns out to be difficult.
@mxinden I remember you saying that PollParameters was always something that didn't appeal to you. Did you ever think about alternative designs? This might be a good time to bring them up :)
I am now initialising these to an empty vector and they are updated every time a new connection is made.
@mxinden I remember you saying that PollParameters was always something that didn't appeal to you.
Yes, very much. I would like to not have PollParameters.
I am now initialising these to an empty vector and they are updated every time a new connection is made.
In my eyes that is the better behavior.
///
/// # Example carrying state in the handler
Should we replace this example? Now that new_handler gives you a PeerId, I think it is quite easy to realize that you can store data for a future connection in the behaviour and just pass it into the handler here.
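For illustration, a minimal sketch of that pattern. It does not use the real libp2p types: PeerId, Handler, Behaviour and prepare_dial below are hypothetical stand-ins, and new_handler is simplified to take only the peer.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for libp2p's `PeerId` (the real type is not a `u64`).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PeerId(pub u64);

/// Hypothetical handler that receives its initial state at construction.
#[derive(Debug, PartialEq, Eq)]
pub struct Handler {
    pub initial_state: Option<String>,
}

/// Hypothetical behaviour that remembers state for connections it asked for.
#[derive(Default)]
pub struct Behaviour {
    /// State prepared for connections that are not established yet.
    pending: HashMap<PeerId, String>,
}

impl Behaviour {
    /// Record state for `peer` before emitting a dial request for it.
    pub fn prepare_dial(&mut self, peer: PeerId, state: String) {
        self.pending.insert(peer, state);
    }

    /// Called once the connection exists (simplified signature, an assumption).
    /// Hands over the stored state, if any, and forgets it.
    pub fn new_handler(&mut self, peer: PeerId) -> Handler {
        Handler {
            initial_state: self.pending.remove(&peer),
        }
    }
}
```

Note that the first connection to the peer consumes the stored state, regardless of which behaviour requested the dial.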
NetworkBehaviours can track state for future connections within themselves (indexed by PeerId) and pass it into the ConnectionHandler upon construction. This removes the need for passing along a handler in NetworkBehaviourAction::Dial.
Say there are two NetworkBehaviour implementations, each requesting a new connection to a new peer. In that case, when NetworkBehaviour::inject_connection_established is called, neither of them knows whether this new connection corresponds to its own dial request.
We could still allow the user to provide user data via DialOpts. For the case of Swarm::dial and NetworkBehaviour::inject_dial_failure we could wrap this in an Option.
I am not sure whether we need to design for the above race condition. Your proposal might be just fine. What do folks think?
In ipfs-embed I validate advertised peer addresses by dialling them, and closing such connections upon success. This is only reliably possible if I can detect whether the connection resulted from my own dialling attempt. OTOH, another cause may have resulted in dialling that same peer with that same address for other reasons, which would then possibly have said “nah, attempt is already underway”. But all of this would happen in addition to existing connections, so it might not be a huge problem.
Do you need to actively close them? Assuming reasonable keep_alive implementations of your ConnectionHandlers, the connection should get closed automatically if it is not in use.
OTOH, another cause may have resulted in dialling that same peer with that same address for other reasons, which would then possibly have said “nah, attempt is already underway”.
Can this be expressed with PeerCondition::NotDialing?
Note that new_handler also gives you access to ConnectedPoint. So in addition to storing data in your behaviour based on PeerId, you could also index it by the dialed address. new_handler being called for an address you wanted to validate IS the validation that a connection was made to this address. If you have a dedicated AddressValidationBehaviour, that behaviour could then straight up deny that connection which instantly closes it again.
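A rough sketch of what such an AddressValidationBehaviour could look like. Everything here is a simplified stand-in, not the real libp2p API: ConnectedPoint carries plain strings instead of Multiaddr, ConnectionDenied and DummyHandler are placeholder types, and validate / is_validated are hypothetical helper names.

```rust
use std::collections::HashSet;

/// Simplified stand-in for libp2p's `ConnectedPoint` (addresses as plain strings).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ConnectedPoint {
    Dialer { address: String },
    Listener { send_back_addr: String },
}

/// Placeholder error returned when the behaviour refuses a connection.
#[derive(Debug, PartialEq, Eq)]
pub struct ConnectionDenied;

/// Placeholder handler for connections the behaviour does not care about.
#[derive(Debug, PartialEq, Eq)]
pub struct DummyHandler;

#[derive(Default)]
pub struct AddressValidationBehaviour {
    /// Addresses we dialed purely to check that they are reachable.
    to_validate: HashSet<String>,
    /// Addresses for which a connection was actually established.
    validated: HashSet<String>,
}

impl AddressValidationBehaviour {
    /// Remember `address` before requesting a dial to it.
    pub fn validate(&mut self, address: &str) {
        self.to_validate.insert(address.to_owned());
    }

    pub fn is_validated(&self, address: &str) -> bool {
        self.validated.contains(address)
    }

    /// Fallible handler construction (assumed shape of the proposed API).
    /// Being called for a probed address is itself the proof of reachability,
    /// so the probe connection is denied, which closes it again.
    pub fn new_handler(
        &mut self,
        endpoint: &ConnectedPoint,
    ) -> Result<DummyHandler, ConnectionDenied> {
        if let ConnectedPoint::Dialer { address } = endpoint {
            if self.to_validate.remove(address) {
                self.validated.insert(address.clone());
                return Err(ConnectionDenied);
            }
        }
        Ok(DummyHandler)
    }
}
```

Only the first connection to a probed address is denied; later connections to the same address, and all inbound connections, get a handler as usual.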
TODO:
|
…nied` reason" This reverts commit 19348e9.
eed846b to 7369cc1
7369cc1 to dcb4f96
Here is a proposal regarding pending connection management:
The The I am reasonably happy with this design. We are only extending the Thoughts @mxinden? |
This pull request has merge conflicts. Could you please resolve them @thomaseizinger? 🙏 |
Very much in favor of this! Did not do a full review yet.
 pub struct ConnectionClosed<'a, Handler> {
     pub peer_id: PeerId,
     pub connection_id: ConnectionId,
     pub endpoint: &'a ConnectedPoint,
-    pub handler: <Handler as IntoConnectionHandler>::Handler,
+    pub handler: Handler,
Out of scope for this PR, but I do wonder if we really need the handler field on ConnectionClosed.
The handler property was added in #2191 together with adding it to Dial, DialFailure and ListenFailure. If I understand it correctly, the main motivation behind it was:
it allows one to attach state to a dial request, thus not having to keep track of it within a NetworkBehaviour implementation
Per #3099 (comment) attaching a state to a dial is not needed anymore. What's the use-case of still returning the handler in ConnectionClosed? As far as I can tell, none of our behaviours ever use it.
Motivation behind the question is that it would be great if we could get rid of the duplication between SwarmEvent and FromSwarm.
Until now the main difference was that FromSwarm also contains the handler, which should not be exposed in the SwarmEvent [1]: #2423 (comment). If we could remove it here, we might be able to do something like:
enum SwarmEvent<TBehaviourOutEvent> {
    Behaviour(TBehaviourOutEvent),
    FromSwarm(FromSwarmEvent),
}
Footnotes
1. There are also some other differences: FromSwarm contains ConnectionIds, and the NewExternalAddr and ExpiredExternalAddr variants. But I don't see a drawback in exposing them to the user.
Related: #3046
Any ideas on how to make this less breaking would be gladly appreciated! :) |
One slight variation on this: |
I think we can limit the breaking changes of this PR to removing the handler from |
@@ -96,6 +97,10 @@

- Update `rust-version` to reflect the actual MSRV: 1.62.0. See [PR 3090].

- Remove `IntoConnectionHandler` abstraction and change the signature of `NetworkBehaviour::new_handler` to accept `PeerId` and `ConnectedPoint`.
What do you think of a small before and after code snippet to help folks upgrade?
-    fn new_handler(&mut self) -> Self::ConnectionHandler;
+    fn new_handler(
+        &mut self,
+        peer: &PeerId,
-        peer: &PeerId,
+        peer: PeerId,
Why provide a reference to a Copy type?
Sounds good to me.
Do I understand correctly that both methods would have the same signature and that we don't foresee the two signatures to diverge any time soon? If so, I suggest not splitting it. While nice for the sake of consistency, I don't think it justifies the additional complexity. |
@@ -96,6 +99,8 @@ pub struct Client {
    /// connection.
    directly_connected_peers: HashMap<PeerId, Vec<ConnectionId>>,

    initial_events: HashMap<PeerId, handler::In>,
-    initial_events: HashMap<PeerId, handler::In>,
+    /// [`handler::In`] event to be provided to a handler in `NetworkBehaviour::new_handler` once the corresponding connection is established.
+    initial_events: HashMap<PeerId, handler::In>,
What do you think?
if let Some(event) = self.initial_events.remove(peer) {
    #[allow(deprecated)]
    handler.inject_event(event)
}
Injecting events in two ways, i.e. (1) at creation and (2) regularly during the lifetime of the handler via Swarm, seems inconsistent to me. That said, I cannot think of a better way. I suggest keeping it as is.
We could hold the event back here and instead listen for the ConnectionEstablished event and then emit an event later?
log::debug!(
    "Established relayed instead of direct connection to {:?}, \
     dropping initial in event {:?}.",
    peer,
    event
How do we know that there is no second direct connection in progress of establishment to the given peer? How do we know that this event was predetermined for this connection and not another connection still being established?
This is rather complex. I rewrote this comment 4 times. I might be missing something.
We can index the HashMap by the dialed address instead of the peer ID. That should fix things I guess?
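A sketch of how indexing by the dialed address could look, under several assumptions: addresses are plain strings rather than Multiaddr, relayed connections are detected with a simple substring check for /p2p-circuit, and Client, In, Handler, queue_initial_event and pending_events are simplified or hypothetical stand-ins for the relay client code in this PR.

```rust
use std::collections::HashMap;

/// Stand-in for `handler::In`; the real event carries more data.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum In {
    Reserve,
}

/// Stand-in for the handler choice: a real relay handler or a dummy one.
#[derive(Debug, PartialEq, Eq)]
pub enum Handler {
    Relay { initial_event: Option<In> },
    Dummy,
}

#[derive(Default)]
pub struct Client {
    /// Initial events keyed by the address we dialed, not by peer ID.
    initial_events: HashMap<String, In>,
}

impl Client {
    /// Store the event for the connection we are about to dial at `dialed_addr`.
    pub fn queue_initial_event(&mut self, dialed_addr: &str, event: In) {
        self.initial_events.insert(dialed_addr.to_owned(), event);
    }

    /// Simplified `new_handler`: relayed connections always get a dummy handler
    /// and leave queued events untouched for the direct connection.
    pub fn new_handler(&mut self, dialed_addr: &str) -> Handler {
        if dialed_addr.contains("/p2p-circuit") {
            return Handler::Dummy;
        }
        Handler::Relay {
            initial_event: self.initial_events.remove(dialed_addr),
        }
    }

    pub fn pending_events(&self) -> usize {
        self.initial_events.len()
    }
}
```

With this keying, a relayed connection to the same peer that completes first no longer consumes or drops the event meant for the direct connection.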
self.initial_events
    .insert(relay_peer_id, handler::In::Reserve { to_listener });

NetworkBehaviourAction::Dial {
    opts: DialOpts::peer_id(relay_peer_id)
        .addresses(vec![relay_addr])
        .extend_addresses_through_behaviour()
Note that some NetworkBehaviour might add a relayed address here.
The relay protocol should never be spoken on top of a relayed connection. We should never do nested relaying, as otherwise one could build circular relayed connections that could be misused for a DoS attack. Thus returning a dummy::ConnectionHandler on relayed connections is critical.
That said, we might be establishing both a relayed and a direct connection to a given peer in parallel. How do we know which one the initial_events event was destined for?
Not sure how to prevent this. The solution you use above, namely dropping the initial_events event in case it turns out to be a relayed connection, covers the case where there is only a single connection attempt in progress. What about the case where both (relayed and direct) are in progress?
I only ported the existing code 1-to-1; as far as I know, no functionality should change in this PR.
If split, I would inline the variants of the rust-libp2p/core/src/connection.rs Line 116 in f8f19ba
So no, the signatures would not be the same. |
Sorry for the delay. As far as lighthouse is concerned, this would work really well in particular having the Also of course, thanks for all these efforts! |
@mxinden: Instead of a new |
I think with that idea, we might be able to ship the removal of |
Turns out we can't but overall the version in #3254 is a lot less breaking which is nice :) |
Description
Previously, ConnectionHandler employed the prototype pattern via the IntoConnectionHandler abstraction. This allowed the Swarm to construct an instance of IntoConnectionHandler before the connection was established. This abstraction is, however, unnecessary. We can model all existing use cases by delaying the call to NetworkBehaviour::new_handler until the connection is established. Not only does this delete a lot of code, it also makes several APIs simpler:
- NetworkBehaviours can track state for future connections within themselves (indexed by PeerId) and pass it into the ConnectionHandler upon construction. This removes the need for passing along a handler in NetworkBehaviourAction::Dial.
- Removing the Handler from NetworkBehaviourAction::Dial also avoids the need to pass the handler back into the behaviour in inject_dial_failure.
- By making NetworkBehaviour::new_handler fallible, NetworkBehaviours can implement almost arbitrary connection management policies by denying the construction of a ConnectionHandler for a newly established connection.

Resolves #2824.