Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Rework network/src/protocol.rs to use libp2p's multiplexing #1517

Closed
tomaka opened this issue Jan 22, 2019 · 7 comments · Fixed by #1796
Closed

Rework network/src/protocol.rs to use libp2p's multiplexing #1517

tomaka opened this issue Jan 22, 2019 · 7 comments · Fixed by #1796
Labels
I7-refactor Code needs refactoring. J1-meta A specific issue for grouping tasks or bugs of a specific category.
Milestone

Comments

@tomaka
Copy link
Contributor

tomaka commented Jan 22, 2019

Right now what happens on the networking layer is:

  • We open a connection to a remote and negotiate an encryption and multiplexing protocol.
  • We open a single substream dedicated to the Substrate-specific protocol.
  • Whenever a message is received by this substream, we propagate its raw bytes to the handler in the network crate. The network crate can send back messages using a method on the NetworkService.

The ping, identify and Kademlia protocols of libp2p use an HTTP3-like scheme, where each request opens a new substream. More precisely: we open a substream, send the request, optionally wait the response, then close the substream.

However, the Substrate-specific protocol is an HTTP1-like scheme, where all the requests get pipelined into a single substream.

Moving this to a more HTTP3-like scheme would simplify the code, as libp2p's multiplexing would handle many details (like timeouts, or interleaving messages).

@tomaka tomaka added this to the 1.0 (final) milestone Jan 22, 2019
@gavofyork gavofyork added the I7-refactor Code needs refactoring. label Jan 22, 2019
@gterzian gterzian added the J1-meta A specific issue for grouping tasks or bugs of a specific category. label Jan 24, 2019
@gterzian
Copy link
Contributor

Tentatively turned this into a "meta" issue, since it sounds like it could be broken-up into smaller issues, and made an attempt a formulating one at #1541.

On the broader issue, I just wanted to add that having protocol use a "HTTP1-like scheme" might be a good thing, in the sense that it just handles one event from libp2p at the time(and communicates back one message at a time) and perhaps doesn't need to be aware of the underlying streaming structure.

On the other hand, it sounds like the way to take advantage of of libp2p's "HTTP3-like" capabilities is to move logic out of protocol entirely, the peer timeout being perhaps a good example.

@gterzian
Copy link
Contributor

gterzian commented Jan 24, 2019

Also, I think it's important to be clear about the difference between the "network" and "network-libp2p" crate, and when @tomaka you mentioned the "Substrate-specific protocol" I assume you mean "network-libp2p" as opposed to Protocol that is inside "network". Above I meant Protocol when I wrote "protocol" :)

For clarity, I meant:

  1. It might be good for Protocol to really just handle one "event" from "network-libp2p" at a time, to keep the simple "actor-like" setup where the internal state is updated sequentially one event at a time. This is unrelated to the networking implementation, but rather about the interface between networking and the rest, with this interface basically being the ServiceEvent handled inside the "network" crate.
  2. It might be good to move stuff out of Protocol into "network-libp2p", which could then be updated to use the latest features of libp2p(the peer timeout mechanism being a prime example).

@tomaka
Copy link
Contributor Author

tomaka commented Jan 24, 2019

It might be good for Protocol to really just handle one "event" from "network-libp2p" at a time, to keep the simple "actor-like" setup where the internal state is updated sequentially one event at a time.

There's no plan in changing that. This is orthogonal to what this issue is about.

@tomaka
Copy link
Contributor Author

tomaka commented Jan 24, 2019

We can continue talking about this change in details, but I find that a bit ridiculous because it's not a very large change in the end. I'm just waiting for #1340 to finally be merged in order to do it.

@tomaka
Copy link
Contributor Author

tomaka commented Jan 24, 2019

Although this change would probably remove half of #1340 unfortunately, as the channel would be handled by libp2p.

@tomaka tomaka mentioned this issue Feb 5, 2019
12 tasks
@tomaka
Copy link
Contributor Author

tomaka commented Feb 7, 2019

There needs to be a transition period so that we don't break the networking entirely.

My plan for this is to introduce a new protocol name.
Instead of proposing only /substrate/dot/0 or /substrate/bbq/0, we would also propose /substrate/multi/dot/0 or /substrate/multi/bbq/0. The version with /multi would have priority when it comes to negotiations.

If we notice that the remote accepts the version with /multi/, then every time we perform a request, we open new substream. If instead the remote doesn't accept it, we send the request on the first available substream.

@tomaka
Copy link
Contributor Author

tomaka commented Feb 7, 2019

Behaviour right now

When two nodes connect, the dialing side opens a Substrate substream. Once done, we report at the API layer that we are connected. The substream is then used by both sides for all communications. If the substream or the connection is closed by either side, we report to the API layer that we are disconnected.

If either node starts denying the other node, it does so by closing the unique substream. If the node is again accepting the other node, it reopens the substream.

New behaviour (backwards-compatible)

When two nodes connect, the dialing side opens a substream. Once done, we report at the API layer that we are connected. If it uses the old protocol name, then operate as previously.

If it uses the new protocol name, then both sides are now free to open as many substreams as they want. If we ever fail to open a substream, we report to the API layer that we are disconnected and immediately shut down all the other substreams without taking their response into account (this is to ensure state consistency, otherwise could report messages after we've reported disconnection).

If a node starts denying the other node, it does so by refusing new incoming substreams. If a node is again accepting the other node again, it forcefully opens a substream. This last sentence is a bit of a hack, but is necessary before #1702 is done.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
I7-refactor Code needs refactoring. J1-meta A specific issue for grouping tasks or bugs of a specific category.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants