From 1043853b6fe5535c8c3c0b1badbd217058bf1b10 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Mon, 8 Oct 2018 18:22:21 -0700 Subject: [PATCH 01/15] [WIP] [RFC] Multistream-2.0 Here's a draft of multistream-2.0 (+ a retrospective that you can skip). --- multistream-2.0/spec.md | 291 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 291 insertions(+) create mode 100644 multistream-2.0/spec.md diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md new file mode 100644 index 000000000..613270acc --- /dev/null +++ b/multistream-2.0/spec.md @@ -0,0 +1,291 @@ +# Efficient and Sound Protocol Negotiation + +This proposal lays out issues that we've encountered with multistream-select and +proposes a replacement protocol. It also lays out an upgrade path from +multistream-select to the new protocol. + +## Retrospective + +There are 5 concrete issues with multistream select. + +multistream-select: + +1. requires at least one round trip to be sound. +2. negotiates protocols in series instead of in parallel. +3. doesn't provide any way to determine which side (possibly both) initiated the + connection/negotiation. +4. is bandwidth inefficient. +5. punishes long, descriptive, protocol names. + +We ignore 1 and just accept that the protocol has some soundness issues as +actually *waiting* for a response for a protocol negotiation we know will almost +certainly succeed would kill performance. + +As for 2, we make sure to remember protocols known to be spoken by the remote +endpoint so we can try to negotiate a known-good protocol first. However, this +is still inefficient. + +Issue 3 gets us in trouble with TCP simultaneous connect. Basically, we need a +protocol where both sides can propose a set of protocols to speak and then +deterministically select the *same* protocol. Ideally, we'd also *expose* the +fact that both sides are initiating to the user. + +By 4, I mean that we repeatedly send long strings (the protocol names) back and +forth. 
While long strings *are* more user friendly than, e.g., port numbers, +they're, well, long. This can introduce bandwidth overheads over 30%. + +Issue 5 is a corollary of issue 4. Because we send these protocol names *every* +time we negotiate, we don't, e.g., send longer, better protocol names like: + +* /ai/protocol/p2p/bitswap/1.0 +* /ipfs/QmId.../bitswap/1.0 + +However, multistream-select was *explicitly designed* with this use-case in +mind. + +## Protocols + +This document proposes 5 new, micro-protocols with two guiding principles: + +1. Composition over complexity. +2. Every byte and round-trip counts. + +The protocols are: + +1. `multistream/use`: declares the protocol being used using a multicodec. +2. `multistream/dynamic`: declares the protocol being used using a string. +3. `multistream/contextual`: declares the protocol being used using an ephemeral + protocol ID defined by the *receiver* for the duration of some session (e.g., + an underlying connection). +4. `multistream/choose`: allows an initiator to optimistically initiate multiple + streams, discarding all but one. +5. `multistream/hello`: inform the remote end about (a) which protocols we speak + and (b) which services we are running. This should replace both identify and + our current "try every protocol" service discovery system. + +This document also proposes an auxiliary protocols that we'll need to complete +the picture. + +1. `serial-multiplex`: a simple stream "multiplexer" that can multiplex multiple + streams *in serial* over the same connection. That is, it allows us to return + to the stream to multistream once we're done with it. This allows us to *try* + a protocol, fail, and fallback on a slow protocol negotiation. + +All peers *must* implement `multistream/use` and *should* implement +`serial-multiplex`. This combination will allow us to apply a series of quick +connection upgrades (e.g., to multistream 3.0) with no round trips and no funny +business (learn from past mistakes). 
+ +These protocols were *also* designed to eventually support: + +1. Hardware. While we *do* use varints, we avoid using them for lengths in the + fast-path protocols (the non-negotiating ones). +2. Packet protocols. All protocols described here are actually unidirectional + (at the protocol level, at least) and can work over packet protocols (where + the end of the packet is an "EOF"). + +Notes: + +1. The "ls" feature of multistream has been removed. While useful, this really + should be a *protocol*. Given the `serial-multiplex` protocol, this shouldn't be + an issue. +2. To reduce RTTs, all protocols are unidirectional. Even the negotiation + protocol, `multistream/choose` (see below for details). + +### Multistream Use + +The multistream/use protocol is simply two varint multicodecs: the +multistream-use multicodec followed by the multicodec for the protocol to be +used. This protocol is *unidirectional*. If the stream is bidirectional, the +receiver will, by convention, respond the same way. + +Every stream starts with multistream-use. Every other protocol defined here will +be assigned a multicodec and selected with multistream/use. + +This protocol should *also* be trivial to optimize in hardware simply by prefix +matching (i.e., matching on the first N (usually 16-32) bits of the +stream/message). + +* [ ] Q: Technically, the first multicodec is redundant. However, it acts as a + magic byte that allows us to figure out what's going on. Should we keep + it? We could just start all streams with a single multicodec representing + the protocol +* [ ] Q: Should we somehow distinguish between initiator and receiver? Should we + distinguish between bidirectional and unidirectional? We could even bit + pack these options into a single byte and use this instead of the leading + multicodec... Note: distinguishing between bidirectional and + unidirectional may actually be necessary to be able to eagerly send a + unidirectional `multistream/hello` message. 
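To make the two-varint header of `multistream/use` concrete, here is a minimal Go sketch. The multicodec values are placeholders invented for illustration (the proposal does not assign codes here), and `EncodeUse` and `DecodeUse` are hypothetical names. It assumes multicodecs use the unsigned LEB128 varint that Go's `encoding/binary` implements.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Placeholder multicodec values, assumed for illustration only.
const (
	codecMultistreamUse uint64 = 0x3f01 // assumed code for multistream/use
	codecExampleProto   uint64 = 0x12   // assumed code for the selected protocol
)

// putUvarint appends v to buf as an unsigned varint.
func putUvarint(buf []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(buf, tmp[:n]...)
}

// EncodeUse builds the header: the multistream-use multicodec followed by
// the multicodec of the protocol being selected, both as varints.
func EncodeUse(proto uint64) []byte {
	buf := make([]byte, 0, 2*binary.MaxVarintLen64)
	buf = putUvarint(buf, codecMultistreamUse)
	return putUvarint(buf, proto)
}

// DecodeUse checks the leading multicodec and returns the selected protocol
// code together with the bytes that follow the header.
func DecodeUse(b []byte) (uint64, []byte, error) {
	magic, n := binary.Uvarint(b)
	if n <= 0 {
		return 0, nil, errors.New("short or malformed leading multicodec")
	}
	if magic != codecMultistreamUse {
		return 0, nil, fmt.Errorf("unexpected leading multicodec 0x%x", magic)
	}
	proto, m := binary.Uvarint(b[n:])
	if m <= 0 {
		return 0, nil, errors.New("short or malformed protocol multicodec")
	}
	return proto, b[n+m:], nil
}

func main() {
	hdr := EncodeUse(codecExampleProto)
	fmt.Printf("% x\n", hdr) // 81 7e 12

	proto, rest, err := DecodeUse(append(hdr, "payload"...))
	fmt.Println(proto, string(rest), err)
}
```

Because the leading multicodec is a constant, a receiver could compare the first bytes of the stream against it before decoding anything else, which is the prefix-matching optimization described above.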
+ +### Multistream Dynamic + +The multistream/dynamic protocol is like the multistream/use protocol *except* +that it uses a string to identify the protocol. To do so, the initiator simply +sends a fixed-size 16bit length followed by the name of the protocol. + +Including the multistream/use portion, the initiator would send: + +``` + +``` + +Design Note: We *could* use a varint and save a byte in many cases however: + +1. We'd either need to buffer the connection or read the varint byte-by-byte. + Neither of those are really optimal. +2. The length of the name will be dwarf this extra byte. +3. If anyone needs a 64byte name, they're using the *wrong protocol*. Really, + a single byte length should be sufficient for all reasonable protocol names + but we're being stupidly conservative here. + +### Multistream Contextual + +The multistream/contextual protocol is used to select a protocol using a +*receiver specified*, session-ephemeral protocol ID. These IDs are analogues of +ephemeral ports. + +In this protocol, the stream initiator sends a 16 bit ID specified by the +*receiver* to the receiver. This is a *unidirectional* protocol. + +Format: + +``` + +``` + +The ID 0 is reserved for saying "same protocol" on a bidirectional stream. The +receiver of a bidirectional stream can't reuse the same contextual ID that the +initiator used as this contextual ID is relative *to* the receiver. Really, this +last rule *primarily* exists to side-step the TCP simultaneous connect issue. + +This protocol has *also* been designed to be hardware friendly: + +1. Hardware can compare the first 16 bits of the message against + ``. +2. It can then route the message based on the contextual ID. The fact that these + IDs are chosen by the *receiver* means that the receiver can reuse the same + IDs for all connected peers (reusing the same hardware routing table). + +* [ ] TODO: Just use a varint? 
Hardware can still do prefix matching and/or only +support IDs that are at most two bytes long. + +### Multistream Choose + +**WARNING:** this may be too complex/magical. However, it has some really nice +properties. We could also go with a more standard I send you a list of protocols +and you pick one approach but I'd like to consider this one. + +The multistream/choose protocol allows an initiator to start multiple streams in +parallel while telling the receiver to only *act* on one of them. This: + +1. Allows us to "negotiate" each stream using the other multistream protocols. + That is, each message/sub-stream recursively uses multistream. +2. Pack data into the initial packet to shave off a RTT in many cases. +3. Support packet transports out of the box. + +Each message in this protocol consists of: + +``` + + +``` + +The initiator can transition to a single one of these streams by sending: + +``` + +0 +``` + +This effectively aborts all the other streams, allowing the chosen stream to +completely take over the channel. + +To actually *select* a protocol on a bidirectional channel, the receiver simply +uses one of the other multistream protocols to pick a protocol. + +Note: A *simple* implementation of this protocol would simply send a sequence of +protocols as `......` and then wait for the other side to select the +appropriate protocol. + + +* [ ] Q: The current framing system is dead simple but inefficient in some + cases. Specifically, one can't just (a) read a *single* header and then + (b) jump to the desired sub-stream. Alternatives include: + * Have a single header that maps stream numbers to offsets and lengths. This + way, one could jump to the correct section immediately. + * Have a single list of "sections", no stream numbers. Stream numbers would be + inferred by index. This is slightly smaller but not very flexible. +* [ ] Q: Avoid varints? +* [ ] Q: Just do something simpler? + +### Multistream Hello + +Unspeced (for now). 
Really, we just need to send a mapping of protocol +names/codecs to contextual IDs (and may be some service discovery information). +Basically, identify. + +### Serial Multiplex + +The `serial-multiplex` protocol is the simplest possible stream multiplexer. +Unlike other stream multiplexers, `serial-multiplex` can only multiplex streams +in *serial*. That is, it has to close the current stream to open a new one. Also +unlike most multiplexers, this multiplexer is *unidirectional*. + +The protocol is: + +``` +
+ +``` + +Where the header is: + +* -2 - Send a reset and return to multistream. All queued data (remote and +* -1 - Close: Send an EOF and return to multistream. +* 0 - Rest: Ends the reuse protocol, transitioning to a direct stream. + local) should be discarded. +* >0 - Data: The header indicates the length of the data. + +We could also use a varint but it's not really worth it. The 16 bit integer +makes implementing this protocol trivial, even in hardware. + +Why: This allows us to: + +1. Try protocols and fall back on others (we can also use `multistream/choose` + for this). +2. More importantly, it allows us to speak a bunch of protocols before setting + up a stream multiplexer. Specifically, we can use this for + `multistream/hello` to send a hello as early as possible. + +## Upgrade Path + +#### Short term + +The short-term plan is to first negotiate multistream 1.0 and *then* negotiate +an upgrade. That is, on connect, the *initiator* will send: + +``` +/multistream/1.0.0\n +/multistream/2.0.0\n +``` + +As a batch. It will then wait for the other side to respond with either: + +``` +/multistream/1.0.0\n +na\n +``` + +in which case it'll continue using multistream 1.0, or: + +``` +/multistream/1.0.0\n +/multistream/2.0.0\n +``` + +in which case it'll switch to multistream 2.0. + +Importantly: When we switch to multistream 2.0, we'll tag the connection (and +any sub connections) with the multistream version. This way, we never have to do +this again. From c92c58cabe970d624f7845f773a1a63a9338d3b2 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Wed, 10 Oct 2018 15:52:58 +0100 Subject: [PATCH 02/15] second iteration of the multistream-2.0 spec 1. Split retro into a separate file. 2. Add an example. 3. Rename hello to advertise. I'd like to encourage using many tiny protocols instead of adding more and more features to bloated protocols. That means separate protocol advertisement, hello, etc. protocols. 
Maybe this is going too far and we should just call this "identify". 4. Rename serial-multiplex to serial-stream and multistream/choose to speculative-stream (and move it out of the multistream protocol family). 5. Use more varints. Really, we can probably go a step further and make serial-multiplex use a varint. --- multistream-2.0/retrospective.md | 41 ++++ multistream-2.0/spec.md | 350 ++++++++++++++++++++----------- 2 files changed, 272 insertions(+), 119 deletions(-) create mode 100644 multistream-2.0/retrospective.md diff --git a/multistream-2.0/retrospective.md b/multistream-2.0/retrospective.md new file mode 100644 index 000000000..780fbcb3b --- /dev/null +++ b/multistream-2.0/retrospective.md @@ -0,0 +1,41 @@ +# Multistream-Select 1.0.0 Retrospective + +This short document aims to motivate the need for a new stream negotiation +protocol. + +There are 5 concrete issues with multistream select. + +multistream-select: + +1. requires at least one round trip to be sound. +2. negotiates protocols in series instead of in parallel. +3. doesn't provide any way to determine which side (possibly both) initiated the + connection/negotiation. +4. is bandwidth inefficient. +5. punishes long, descriptive, protocol names. + +We ignore 1 and just accept that the protocol has some soundness issues as +actually *waiting* for a response for a protocol negotiation we know will almost +certainly succeed would kill performance. + +As for 2, we make sure to remember protocols known to be spoken by the remote +endpoint so we can try to negotiate a known-good protocol first. However, this +is still inefficient. + +Issue 3 gets us in trouble with TCP simultaneous connect. Basically, we need a +protocol where both sides can propose a set of protocols to speak and then +deterministically select the *same* protocol. Ideally, we'd also *expose* the +fact that both sides are initiating to the user. + +By 4, I mean that we repeatedly send long strings (the protocol names) back and +forth. 
While long strings *are* more user friendly than, e.g., port numbers, +they're, well, long. This can introduce bandwidth overheads over 30%. + +Issue 5 is a corollary of issue 4. Because we send these protocol names *every* +time we negotiate, we don't, e.g., send longer, better protocol names like: + +* /ai/protocol/p2p/bitswap/1.0 +* /ipfs/QmId.../bitswap/1.0 + +However, multistream-select was *explicitly designed* with this use-case in +mind. diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index 613270acc..36f513bb6 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -1,106 +1,78 @@ -# Efficient and Sound Protocol Negotiation +# Multistream 2.0 -This proposal lays out issues that we've encountered with multistream-select and -proposes a replacement protocol. It also lays out an upgrade path from -multistream-select to the new protocol. - -## Retrospective - -There are 5 concrete issues with multistream select. - -multistream-select: - -1. requires at least one round trip to be sound. -2. negotiates protocols in series instead of in parallel. -3. doesn't provide any way to determine which side (possibly both) initiated the - connection/negotiation. -4. is bandwidth inefficient. -5. punishes long, descriptive, protocol names. - -We ignore 1 and just accept that the protocol has some soundness issues as -actually *waiting* for a response for a protocol negotiation we know will almost -certainly succeed would kill performance. - -As for 2, we make sure to remember protocols known to be spoken by the remote -endpoint so we can try to negotiate a known-good protocol first. However, this -is still inefficient. - -Issue 3 gets us in trouble with TCP simultaneous connect. Basically, we need a -protocol where both sides can propose a set of protocols to speak and then -deterministically select the *same* protocol. Ideally, we'd also *expose* the -fact that both sides are initiating to the user. 
- -By 4, I mean that we repeatedly send long strings (the protocol names) back and -forth. While long strings *are* more user friendly than, e.g., port numbers, -they're, well, long. This can introduce bandwidth overheads over 30%. - -Issue 5 is a corollary of issue 4. Because we send these protocol names *every* -time we negotiate, we don't, e.g., send longer, better protocol names like: - -* /ai/protocol/p2p/bitswap/1.0 -* /ipfs/QmId.../bitswap/1.0 - -However, multistream-select was *explicitly designed* with this use-case in -mind. +This proposal describes a replacement protocol for multistream-select. ## Protocols -This document proposes 5 new, micro-protocols with two guiding principles: +This document proposes 6 new, micro-protocols with two guiding principles: 1. Composition over complexity. 2. Every byte and round-trip counts. -The protocols are: - -1. `multistream/use`: declares the protocol being used using a multicodec. -2. `multistream/dynamic`: declares the protocol being used using a string. -3. `multistream/contextual`: declares the protocol being used using an ephemeral - protocol ID defined by the *receiver* for the duration of some session (e.g., - an underlying connection). -4. `multistream/choose`: allows an initiator to optimistically initiate multiple - streams, discarding all but one. -5. `multistream/hello`: inform the remote end about (a) which protocols we speak - and (b) which services we are running. This should replace both identify and - our current "try every protocol" service discovery system. - -This document also proposes an auxiliary protocols that we'll need to complete -the picture. - -1. `serial-multiplex`: a simple stream "multiplexer" that can multiplex multiple - streams *in serial* over the same connection. That is, it allows us to return - to the stream to multistream once we're done with it. This allows us to *try* - a protocol, fail, and fallback on a slow protocol negotiation. 
+This document *does not*, in fact, propose a protocol *negotiation* protocol.
+Instead, it proposes a set of stream/protocol management protocols that can be
+composed to flexibly negotiate protocols.
+
+First, this document proposes 4 protocol "negotiation" protocols. "Negotiation"
+is in quotes because none of these protocols actually involve negotiating
+anything.
+
+1. `multistream/advertise`: Inform the remote end about which protocols we speak
+   and. This should partially replace the current identify protocol.
+2. `multistream/use`: Selects the stream's protocol using a multicodec.
+3. `multistream/dynamic`: Selects the stream's protocol using a string protocol name.
+4. `multistream/contextual`: Selects the stream's protocol using a protocol ID
+   defined by the *receiver*, valid for the duration of the "session"
+   (underlying connection). To use this, the *receiver* must have used
+   `multistream/advertise` to inform the initiator of *its* mapping between
+   protocols and contextual IDs.
+
+Second, this document proposes 2 auxiliary protocols that can be used with the 4
+multistream protocols to actually negotiate protocols. These are *primarily*
+useful (a) in packet-based protocols (without sessions) and (b) when initially
+negotiating a transport session (before protocols have been advertised and the
+stream multiplexer has been configured).
+
+1. `serial-stream`: A simple stream "multiplexer" that can multiplex multiple
+   streams *in serial* over the same connection. That is, it allows us to
+   negotiate a protocol, use it, and then return to multistream. It also allows
+   us to speculatively choose a single protocol and then drop back down to
+   multistream if that doesn't work.
+2. `speculative-stream`: A speculative stream "multiplexer" where the initiator
+   can speculatively initiate multiple streams and the receiver must select at
+   most one and discard the others.
All peers *must* implement `multistream/use` and *should* implement -`serial-multiplex`. This combination will allow us to apply a series of quick +`serial-stream`. This combination will allow us to apply a series of quick connection upgrades (e.g., to multistream 3.0) with no round trips and no funny business (learn from past mistakes). -These protocols were *also* designed to eventually support: - -1. Hardware. While we *do* use varints, we avoid using them for lengths in the - fast-path protocols (the non-negotiating ones). -2. Packet protocols. All protocols described here are actually unidirectional - (at the protocol level, at least) and can work over packet protocols (where - the end of the packet is an "EOF"). - Notes: 1. The "ls" feature of multistream has been removed. While useful, this really - should be a *protocol*. Given the `serial-multiplex` protocol, this shouldn't be - an issue. -2. To reduce RTTs, all protocols are unidirectional. Even the negotiation - protocol, `multistream/choose` (see below for details). + should be a *protocol*. Given the `serial-stream` protocol, this shouldn't be + an issue as we can run as many sub-protocols over the same stream as we want. +2. To reduce RTTs, all protocols are unidirectional. +3. These protocols were *also* designed to eventually support packet protocols + (the other reason to be unidirectional and a strong motivator for the + `speculative-stream` and `serial-stream` protocols). + +### Multistream Advertise + +Unspeced (for now). Really, we just need to send a mapping of protocol +names/codecs to contextual IDs (and may be some service discovery information). +Basically, identify. ### Multistream Use -The multistream/use protocol is simply two varint multicodecs: the +The `multistream/use` protocol is simply two varint multicodecs: the multistream-use multicodec followed by the multicodec for the protocol to be used. This protocol is *unidirectional*. 
If the stream is bidirectional, the -receiver will, by convention, respond the same way. +receiver must acknowledge a successful protocol negotiation by responding with +the same multistream-use protocol sequence. Every stream starts with multistream-use. Every other protocol defined here will -be assigned a multicodec and selected with multistream/use. +be assigned a multicodec and selected with `multistream/use.` This protocol should *also* be trivial to optimize in hardware simply by prefix matching (i.e., matching on the first N (usually 16-32) bits of the @@ -113,44 +85,38 @@ stream/message). * [ ] Q: Should we somehow distinguish between initiator and receiver? Should we distinguish between bidirectional and unidirectional? We could even bit pack these options into a single byte and use this instead of the leading - multicodec... Note: distinguishing between bidirectional and - unidirectional may actually be necessary to be able to eagerly send a - unidirectional `multistream/hello` message. + multicodec... ### Multistream Dynamic -The multistream/dynamic protocol is like the multistream/use protocol *except* -that it uses a string to identify the protocol. To do so, the initiator simply -sends a fixed-size 16bit length followed by the name of the protocol. +The `multistream/dynamic` protocol is like the `multistream/use` protocol +*except* that it uses a string to identify the protocol. To do so, the initiator +simply sends a varint length followed by the name of the protocol. -Including the multistream/use portion, the initiator would send: +Including the `multistream/use` portion, the initiator would send: ``` - + ``` -Design Note: We *could* use a varint and save a byte in many cases however: - -1. We'd either need to buffer the connection or read the varint byte-by-byte. - Neither of those are really optimal. -2. The length of the name will be dwarf this extra byte. -3. If anyone needs a 64byte name, they're using the *wrong protocol*. 
Really, - a single byte length should be sufficient for all reasonable protocol names - but we're being stupidly conservative here. +Note: This used to use a fixed-width 16 bit number for a length. However, a +varint *really* isn't going to cost us much, if anything, in terms of +performance as most protocol names will be <= 128 bytes long. On the other hand, +using different number formats everywhere *will* cost us in terms of complexity. ### Multistream Contextual -The multistream/contextual protocol is used to select a protocol using a +The `multistream/contextual` protocol is used to select a protocol using a *receiver specified*, session-ephemeral protocol ID. These IDs are analogues of ephemeral ports. -In this protocol, the stream initiator sends a 16 bit ID specified by the +In this protocol, the stream initiator sends a varint ID specified by the *receiver* to the receiver. This is a *unidirectional* protocol. Format: ``` - + ``` The ID 0 is reserved for saying "same protocol" on a bidirectional stream. The @@ -166,16 +132,16 @@ This protocol has *also* been designed to be hardware friendly: IDs are chosen by the *receiver* means that the receiver can reuse the same IDs for all connected peers (reusing the same hardware routing table). -* [ ] TODO: Just use a varint? Hardware can still do prefix matching and/or only -support IDs that are at most two bytes long. +Note: This *also* used to use 16 bit numbers. However, again, most peers will +have <= 128 protocols. Worse, peers may want to use multistream as a more +general-purpose stream router and may need to repeatedly allocate and then +deallocate contextual IDs. At the end of the day, it's probably better to just +be flexible. -### Multistream Choose +### Speculative Stream -**WARNING:** this may be too complex/magical. However, it has some really nice -properties. We could also go with a more standard I send you a list of protocols -and you pick one approach but I'd like to consider this one. 
-The multistream/choose protocol allows an initiator to start multiple streams in +The `speculative-stream` protocol allows an initiator to start multiple streams in parallel while telling the receiver to only *act* on one of them. This: 1. Allows us to "negotiate" each stream using the other multistream protocols. @@ -216,19 +182,12 @@ appropriate protocol. way, one could jump to the correct section immediately. * Have a single list of "sections", no stream numbers. Stream numbers would be inferred by index. This is slightly smaller but not very flexible. -* [ ] Q: Avoid varints? * [ ] Q: Just do something simpler? -### Multistream Hello +### Serial Stream -Unspeced (for now). Really, we just need to send a mapping of protocol -names/codecs to contextual IDs (and may be some service discovery information). -Basically, identify. - -### Serial Multiplex - -The `serial-multiplex` protocol is the simplest possible stream multiplexer. -Unlike other stream multiplexers, `serial-multiplex` can only multiplex streams +The `serial-stream` protocol is the simplest possible stream multiplexer. +Unlike other stream multiplexers, `serial-stream` can only multiplex streams in *serial*. That is, it has to close the current stream to open a new one. Also unlike most multiplexers, this multiplexer is *unidirectional*. @@ -242,9 +201,9 @@ The protocol is: Where the header is: * -2 - Send a reset and return to multistream. All queued data (remote and + local) should be discarded. * -1 - Close: Send an EOF and return to multistream. * 0 - Rest: Ends the reuse protocol, transitioning to a direct stream. - local) should be discarded. * >0 - Data: The header indicates the length of the data. We could also use a varint but it's not really worth it. The 16 bit integer @@ -252,11 +211,11 @@ makes implementing this protocol trivial, even in hardware. Why: This allows us to: -1. Try protocols and fall back on others (we can also use `multistream/choose` +1. 
Try protocols and fall back on others (we can also use `speculative-stream` for this). 2. More importantly, it allows us to speak a bunch of protocols before setting up a stream multiplexer. Specifically, we can use this for - `multistream/hello` to send a hello as early as possible. + `multistream/advertise` to send an advertisement as early as possible. ## Upgrade Path @@ -289,3 +248,156 @@ in which case it'll switch to multistream 2.0. Importantly: When we switch to multistream 2.0, we'll tag the connection (and any sub connections) with the multistream version. This way, we never have to do this again. + +## Example + +So, that was way too much how and not enough why or WTF? Let's try an example +where, + +1. The initiator supports TLS1.3 and SECIO. +2. The receiver only supports TLS1.3. +3. They both support yamux. +4. They both support DHT. +5. secio and tls have multicodecs but yamux and dht don't. + +If we're still in the transition period, the initiator would start off by sending: + +``` +/multistream/1.0\n +/multistream/2.0\n +``` + +If the receiver DOES NOT support multistream 2.0, it will reply with: + +``` +/multistream/1.0\n +na\n +``` + +At this point, the client will fall back on multistream 1.0. + +Otherwise, the receiver will send back... + +``` +/multistream/1.0\n +/multistream/2.0\n +``` + +...to complete the upgrade. + +We're now in multistream 2.0 land. Once we're done with the transition period, +we'll start here to skip a round-trip. 
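The transition-period handshake above comes down to a two-line reply that the initiator has to classify. A rough Go sketch follows, assuming the reply has already been read in full as a string and that any framing multistream-select 1.0 adds around each line has been stripped. `ParseUpgradeReply` and `Version` are hypothetical names, and the protocol strings are copied from the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// Protocol strings as written in the example above.
const (
	ms1 = "/multistream/1.0"
	ms2 = "/multistream/2.0"
)

// Version says which negotiation protocol the connection ended up using.
type Version int

const (
	V1 Version = 1
	V2 Version = 2
)

// InitiatorHello is the batch the initiator sends on connect.
func InitiatorHello() string {
	return ms1 + "\n" + ms2 + "\n"
}

// ParseUpgradeReply classifies the receiver's reply: "na" on the second
// line means "stay on 1.0", an echo of the 2.0 line means "upgrade".
func ParseUpgradeReply(reply string) (Version, error) {
	lines := strings.Split(strings.TrimSuffix(reply, "\n"), "\n")
	if len(lines) != 2 || lines[0] != ms1 {
		return 0, fmt.Errorf("unexpected reply %q", reply)
	}
	switch lines[1] {
	case ms2:
		return V2, nil
	case "na":
		return V1, nil
	default:
		return 0, fmt.Errorf("unexpected second line %q", lines[1])
	}
}

func main() {
	fmt.Printf("%q\n", InitiatorHello())
	for _, reply := range []string{ms1 + "\n" + ms2 + "\n", ms1 + "\nna\n"} {
		v, err := ParseUpgradeReply(reply)
		fmt.Println(v, err)
	}
}
```

Once the result is `V2`, the connection can be tagged with that version so the upgrade is not repeated, as the Upgrade Path section describes.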
+
+Now that we're using multistream 2.0, the initiator will send, in a single
+packet:
+
+```
+ // use multistream/use to select speculative-stream
+ <0 (stream number, varint)> // in alt stream 0
+ // select SECIO
+ // initiate SECIO
+ <1 (stream number, varint)> // in alt stream 1
+ // select TLS
+ // initiate TLS
+```
+
+The code to do this will likely look roughly like:
+
+```go
+// Stream 0 carries SECIO and stream 1 carries TLS, matching the packet above.
+streams := multistream.XOR(stream, ProtocolSecIO, ProtocolTLS)
+var wg sync.WaitGroup
+wg.Add(2)
+var (
+	secioConn, tlsConn net.Conn
+	secioErr, tlsErr   error
+)
+go func() {
+	defer wg.Done()
+	secioConn, secioErr = secio.Upgrade(streams[0])
+	...
+}()
+
+go func() {
+	defer wg.Done()
+	tlsConn, tlsErr = tls.Upgrade(streams[1])
+	...
+}()
+
+wg.Wait()
+
+// Return whichever upgrade succeeded, checking TLS first.
+switch {
+case tlsErr == nil:
+	return tlsConn
+case secioErr == nil:
+	return secioConn
+default:
+	return (some error)
+}
+```
+
+
+The receiver will respond with:
+
+```
+ // use multistream/use to select speculative-stream
+ <1 (stream number, varint)>0 // choose stream 1
+
+ // respond to the "use tls" protocol
+ // speak tls
+```
+
+The speculative stream handler will likely just try each stream in-order,
+selecting the first stream that ends up negotiating a known protocol. More
+advanced implementations may allow for speculative stream *handlers* to select
+from within multiple known protocols. However, this is unlikely to be necessary
+for a while.
+
+Finally, the initiator will finish the TLS negotiation, send an advertise packet,
+*optimistically* negotiate yamux (it could also use speculative-stream to
+negotiate both at the same time but let's not), and send the DHT request.
+
+```
+ <1 (stream number)>0 // choose stream 1
+
+ // finish TLS
+
+ // use serial-stream to make the stream recoverable
+ // serial-stream message framing
+ // select advertise protocol
+ // complete advertise information (protocols, etc.)
+ -1 // return to multistream (EOF)
+
+ // open a new serial-stream
+
+ // select multistream/dynamic
+ /yamux/1.0.0 // select yamux
+ // create the stream
+ // select multistream/dynamic
+ /ipfs/kad/1.0.0 // select kad dht 1.0
+ // send the DHT request
+```
+
+And the receiver will send:
+
+```
+ // use serial-stream to make the stream recoverable
+ // serial-stream message framing
+ // select advertise protocol
+ // complete advertise information (protocols, etc.)
+ -1 // return to multistream (EOF)
+
+ // open a new serial-stream
+ -1 // transition to that stream (we speak yamux)
+
+ // select multistream/dynamic
+ /yamux/1.0.0 // select yamux
+ // respond to the new yamux stream
+ // select multistream/dynamic
+
+ /ipfs/kad/1.0.0 // select kad dht
+ // send the DHT response
+```
+
+Note: Ideally, we'd be able to avoid the optimistic yamux negotiation. However,
+to do that, some protocol information will have to be embedded in the TLS
+negotiation and exposed through a connection-level `Stat` method.
From dbb3ec2a8f8c8ed8765a68948ca5583269a04a3a Mon Sep 17 00:00:00 2001
From: Steven Allen
Date: Thu, 8 Nov 2018 09:18:19 -0800
Subject: [PATCH 03/15] fix incomplete sentence

---
 multistream-2.0/spec.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md
index 36f513bb6..3bbeba6a4 100644
--- a/multistream-2.0/spec.md
+++ b/multistream-2.0/spec.md
@@ -17,8 +17,8 @@ First, this document proposes 4 protocol "negotiation" protocols. "Negotiation"
 is in quotes because none of these protocols actually involve negotiating
 anything.

-1. `multistream/advertise`: Inform the remote end about which protocols we speak
-   and. This should partially replace the current identify protocol.
+1. `multistream/advertise`: Inform the remote end about which protocols we
+   speak. This should partially replace the current identify protocol.
 2. `multistream/use`: Selects the stream's protocol using a multicodec.
 3.
`multistream/dynamic`: Selects the stream's protocol using a string protocol name. 4. `multistream/contextual`: Selects the stream's protocol using a protocol ID From 6a16dc002415c0a7225b794c167dea03818a183a Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Thu, 8 Nov 2018 09:25:26 -0800 Subject: [PATCH 04/15] clarify some things --- multistream-2.0/spec.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index 3bbeba6a4..479996f53 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -40,7 +40,8 @@ stream multiplexer has been configured). multistream if that doesn't work. 2. `speculative-stream`: A speculative stream "multiplexer" where the initiator can speculatively initiate multiple streams and the receiver must select at - most one and discard the others. + most one and discard the others. On a bidirectional stream, the receiver will + inform the initiator of the selected sub-stream, collapsing the state. All peers *must* implement `multistream/use` and *should* implement `serial-stream`. This combination will allow us to apply a series of quick @@ -61,7 +62,7 @@ Notes: Unspeced (for now). Really, we just need to send a mapping of protocol names/codecs to contextual IDs (and maybe some service discovery information). -Basically, identify. +This is the subset of identify needed for protocol negotiation. ### Multistream Use From d8cfdc7e37d8806ea6b41d96da3eaffa33349238 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Wed, 14 Nov 2018 11:59:56 -0800 Subject: [PATCH 05/15] clarifications from CR --- multistream-2.0/spec.md | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index 479996f53..3c7d2f4e2 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -68,9 +68,9 @@ This is the subset of identify needed for protocol negotiation.
The `multistream/use` protocol is simply two varint multicodecs: the multistream-use multicodec followed by the multicodec for the protocol to be -used. This protocol is *unidirectional*. If the stream is bidirectional, the -receiver must acknowledge a successful protocol negotiation by responding with -the same multistream-use protocol sequence. +used. This protocol supports unidirectional streams. If the stream is +bidirectional, the receiver must acknowledge a successful protocol negotiation +by responding with the same multistream-use protocol sequence. Every stream starts with multistream-use. Every other protocol defined here will be assigned a multicodec and selected with `multistream/use.` @@ -112,7 +112,7 @@ The `multistream/contextual` protocol is used to select a protocol using a ephemeral ports. In this protocol, the stream initiator sends a varint ID specified by the -*receiver* to the receiver. This is a *unidirectional* protocol. +*receiver* to the receiver. Format: @@ -148,7 +148,8 @@ parallel while telling the receiver to only *act* on one of them. This: 1. Allows us to "negotiate" each stream using the other multistream protocols. That is, each message/sub-stream recursively uses multistream. 2. Pack data into the initial packet to shave off a RTT in many cases. -3. Support packet transports out of the box. +3. Support packet transports out of the box where round-trips may not be + possible. Each message in this protocol consists of: @@ -157,25 +158,24 @@ Each message in this protocol consists of: ``` -The initiator can transition to a single one of these streams by sending: +The receiver can transition to a single one of these streams by +sending: ``` 0 ``` -This effectively aborts all the other streams, allowing the chosen stream to -completely take over the channel. +And the initiator responds the same way to finish off the transition.
-To actually *select* a protocol on a bidirectional channel, the receiver simply -uses one of the other multistream protocols to pick a protocol. +This aborts all the other streams, allowing the chosen stream to completely take +over the channel. Note: A *simple* implementation of this protocol would simply send a sequence of protocols as `......` and then wait for the other side to select the appropriate protocol. - * [ ] Q: The current framing system is dead simple but inefficient in some cases. Specifically, one can't just (a) read a *single* header and then (b) jump to the desired sub-stream. Alternatives include: @@ -189,8 +189,7 @@ appropriate protocol. The `serial-stream` protocol is the simplest possible stream multiplexer. Unlike other stream multiplexers, `serial-stream` can only multiplex streams -in *serial*. That is, it has to close the current stream to open a new one. Also -unlike most multiplexers, this multiplexer is *unidirectional*. +in *serial*. That is, it has to close the current stream to open a new one. The protocol is: From 1278c7fce0488e67255b03322a382eb749c6ca62 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Tue, 15 Jan 2019 10:47:04 +0000 Subject: [PATCH 06/15] multistream: resolve some questions/discussions --- multistream-2.0/spec.md | 15 --------------- 1 file changed, 15 deletions(-) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index 3c7d2f4e2..9976d7609 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -79,15 +79,6 @@ This protocol should *also* be trivial to optimize in hardware simply by prefix matching (i.e., matching on the first N (usually 16-32) bits of the stream/message). -* [ ] Q: Technically, the first multicodec is redundant. However, it acts as a - magic byte that allows us to figure out what's going on. Should we keep - it? We could just start all streams with a single multicodec representing - the protocol -* [ ] Q: Should we somehow distinguish between initiator and receiver? 
Should we - distinguish between bidirectional and unidirectional? We could even bit - pack these options into a single byte and use this instead of the leading - multicodec... - ### Multistream Dynamic The `multistream/dynamic` protocol is like the `multistream/use` protocol @@ -133,12 +124,6 @@ This protocol has *also* been designed to be hardware friendly: IDs are chosen by the *receiver* means that the receiver can reuse the same IDs for all connected peers (reusing the same hardware routing table). -Note: This *also* used to use 16 bit numbers. However, again, most peers will -have <= 128 protocols. Worse, peers may want to use multistream as a more -general-purpose stream router and may need to repeatedly allocate and then -deallocate contextual IDs. At the end of the day, it's probably better to just -be flexible. - ### Speculative Stream From 75ca7ca67bdf49553259ddb282e17c0ae88672c6 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Tue, 15 Jan 2019 11:11:29 +0000 Subject: [PATCH 07/15] drop speculative stream --- multistream-2.0/spec.md | 151 +++++++++++----------------------------- 1 file changed, 41 insertions(+), 110 deletions(-) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index 9976d7609..424c6c9bc 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -27,8 +27,8 @@ anything. `multistream/advertise` to inform the initiator of *its* mapping between protocols and contextual IDs. -Second, this document proposes 2 auxiliary protocols that can be used with the 4 -multistream protocols to actually negotiate protocols. These are *primarily* +Second, this document proposes an auxiliary protocol that can be used with the 4 +multistream protocols to actually negotiate protocols. This is *primarily* useful (a) in packet-based protocols (without sessions) and (b) when initially negotiating a transport session (before protocols have been advertised and the stream multiplexer has been configured).
@@ -38,10 +38,6 @@ stream multiplexer has been configured). negotiate a protocol, use it, and then return to multistream. It also allows us to speculatively choose a single protocol and then drop back down to multistream if that doesn't work. -2. `speculative-stream`: A speculative stream "multiplexer" where the initiator - can speculatively initiate multiple streams and the receiver must select at - most one and discard the others. On a bidirectional stream, the receiver will - inform the initiator of the selected sub-stream, collapsing the state. All peers *must* implement `multistream/use` and *should* implement `serial-stream`. This combination will allow us to apply a series of quick @@ -54,9 +50,11 @@ Notes: should be a *protocol*. Given the `serial-stream` protocol, this shouldn't be an issue as we can run as many sub-protocols over the same stream as we want. 2. To reduce RTTs, all protocols are unidirectional. -3. These protocols were *also* designed to eventually support packet protocols - (the other reason to be unidirectional and a strong motivator for the - `speculative-stream` and `serial-stream` protocols). +3. These protocols were *also* designed to eventually support packet protocols. +4. We considered a `speculative-stream` protocol where the initiator + speculatively starts multiple streams and the receiver acts on at most one. + This would have allowed for 0-RTT worst-case protocol negotiation but was + deemed too complicated for inclusion in the core spec. ### Multistream Advertise @@ -124,52 +122,6 @@ This protocol has *also* been designed to be hardware friendly: IDs are chosen by the *receiver* means that the receiver can reuse the same IDs for all connected peers (reusing the same hardware routing table). -### Speculative Stream - - -The `speculative-stream` protocol allows an initiator to start multiple streams in -parallel while telling the receiver to only *act* on one of them. This: - -1. 
Allows us to "negotiate" each stream using the other multistream protocols. - That is, each message/sub-stream recursively uses multistream. -2. Pack data into the initial packet to shave off a RTT in many cases. -3. Support packet transports out of the box where round-trips may not be - possible. - -Each message in this protocol consists of: - -``` - - -``` - -The receiver can transition to a single one of these streams by -sending: - -``` - -0 -``` - -And the initiator responds the same way to finish off the transition. - -This aborts all the other streams, allowing the chosen stream to completely take -over the channel. - -Note: A *simple* implementation of this protocol would simply send a sequence of -protocols as `......` and then wait for the other side to select the -appropriate protocol. - -* [ ] Q: The current framing system is dead simple but inefficient in some - cases. Specifically, one can't just (a) read a *single* header and then - (b) jump to the desired sub-stream. Alternatives include: - * Have a single header that maps stream numbers to offsets and lengths. This - way, one could jump to the correct section immediately. - * Have a single list of "sections", no stream numbers. Stream numbers would be - inferred by index. This is slightly smaller but not very flexible. -* [ ] Q: Just do something simpler? - ### Serial Stream The `serial-stream` protocol is the simplest possible stream multiplexer. @@ -196,8 +148,7 @@ makes implementing this protocol trivial, even in hardware. Why: This allows us to: -1. Try protocols and fall back on others (we can also use `speculative-stream` - for this). +1. Try protocols and fall back on others. 2. More importantly, it allows us to speak a bunch of protocols before setting up a stream multiplexer. Specifically, we can use this for `multistream/advertise` to send an advertisement as early as possible.
@@ -277,72 +228,48 @@ Now that we're using multistream 2.0, the initiator will send, in a single packet: ``` - // use multistream/use to select speculative-stream - <0 (stream number, varint)> // in alt stream 0 - // select SECIO - // initiate SECIO - <1 (stream number, varint)> // in alt stream 1 - // select TLS - // initiate TLS -``` + // use serial-stream to make the stream recoverable + // serial-stream message framing + // select advertise protocol + supported security protocols... // + -1 // return to multistream (EOF) -The code to do this will likely look roughly like: - -```go -streams := multistream.XOR(stream, ProtocolSecIO, ProtocolTLS) -var wg sync.WaitGroup -wg.Add(2) -var ( - secioConn, tlsConn net.Conn - secioErr, tlsErr error -) -go func() { - defer wg.Done() - secioConn, secioErr = secio.Upgrade(streams[0]) - ... -}() - -go func() { - defer wg.Done() - tlsConn, tlsErr = tls.Upgrade(streams[1]) - ... -}() - -wg.Wait() - -switch { -case tlsErr == nil: - return tlsConn -case secioErr == nil: - return secioConn -default: - return (some error) -} + // open a new serial-stream + + // select TLS + // initiate TLS ``` - The receiver will respond with: ``` - // use multistream/use to select speculative-stream - <1 (stream number, varint)>0 // choose stream 1 + // respond to serial stream + + // select advertise protocol + security protocols... + -1 // return to multistream (EOF) - // respond to the "use tls" protocol - // speak tls + // respond to second serial stream + 0 // transition to a normal stream. + // select TLS + // complete TLS handshake ``` -The speculative stream handler will likely just try each stream in-order, -selecting the first stream that ends up negotiating a known protocol. More -advanced implementations may allow for speculative stream *handlers* to select -from within multiple known protocols. However, this is unlikely to be necessary -for a while. +This: + +1.
Responds to the advertisement, also advertising available security protocols. +2. Accepts the TLS stream. +3. Finishes the TLS handshake. + +If the receiver had *not* supported TLS, it would have reset the serial-stream. +In that case, the initiator would have used the protocols advertised by the +receiver to select an appropriate security protocol. Finally, the initiator will finish the TLS negotiation, send an advertise packet, -*optimistically* negotiate yamux (it could also use speculative-stream to -negotiate both at the same time but let's not), and send the DHT request. +*optimistically* negotiate yamux, and send the DHT request. ``` - <1 (stream number)>0 // choose stream 1 + 0 // transition to a normal stream. // finish TLS @@ -386,3 +313,7 @@ And the receiver will send: Note: Ideally, we'd be able to avoid the optimistic yamux negotiation. However, to do that, some protocol information will have to be embedded in the TLS negotiation and exposed through a connection-level `Stat` method. + +Alternatively, we could choose to include this information in the advertisement +sent *before* the security transport. However, that has some security +implications. From 8ea2f406382c0808268ed9deac7c41804485bf89 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Tue, 15 Jan 2019 18:31:37 +0000 Subject: [PATCH 08/15] multistream: nit --- multistream-2.0/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index 424c6c9bc..dc8db00b5 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -4,7 +4,7 @@ This proposal describes a replacement protocol for multistream-select. ## Protocols -This document proposes 6 new, micro-protocols with two guiding principles: +This document proposes 5 new, micro-protocols with two guiding principles: 1. Composition over complexity. 2. Every byte and round-trip counts.
From e273037c06361d15d58a6ef0085644f689f9408f Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Thu, 17 Jan 2019 10:59:41 +0000 Subject: [PATCH 09/15] fix formatting --- multistream-2.0/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index dc8db00b5..8ee2638a4 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -141,7 +141,7 @@ Where the header is: local) should be discarded. * -1 - Close: Send an EOF and return to multistream. * 0 - Rest: Ends the reuse protocol, transitioning to a direct stream. -* >0 - Data: The header indicates the length of the data. +* >0 - Data: The header indicates the length of the data. We could also use a varint but it's not really worth it. The 16 bit integer makes implementing this protocol trivial, even in hardware. From fcc5ac3a42ff59b2872fd186288dcf75791e8d76 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Thu, 17 Jan 2019 11:05:57 +0000 Subject: [PATCH 10/15] multistream: improve naming * use -> multicodec * dynamic -> string * contextual -> dynamic --- multistream-2.0/spec.md | 128 ++++++++++++++++++++-------------------- 1 file changed, 64 insertions(+), 64 deletions(-) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index 8ee2638a4..e8925bae6 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -19,13 +19,13 @@ anything. 1. `multistream/advertise`: Inform the remote end about which protocols we speak. This should partially replace the current identify protocol. -2. `multistream/use`: Selects the stream's protocol using a multicodec. -3. `multistream/dynamic`: Selects the stream's protocol using a string protocol name. -4. `multistream/contextual`: Selects the stream's protocol using a protocol ID +2. `multistream/multicodec`: Selects the stream's protocol using a multicodec. +3. `multistream/string`: Selects the stream's protocol using a string protocol name. +4. 
`multistream/dynamic`: Selects the stream's protocol using a protocol ID defined by the *receiver*, valid for the duration of the "session" (underlying connection). To use this, the *receiver* must have used the `multistream/advertise` to inform the initiator of *its* mapping between - protocols and contextual IDs. + protocols and dynamic IDs. Second, this document proposes an auxiliary protocol that can be used with the 4 multistream protocols to actually negotiate protocols. This is *primarily* @@ -39,7 +39,7 @@ stream multiplexer has been configured). us to speculatively choose a single protocol and then drop back down to multistream if that doesn't work. -All peers *must* implement `multistream/use` and *should* implement +All peers *must* implement `multistream/multicodec` and *should* implement `serial-stream`. This combination will allow us to apply a series of quick connection upgrades (e.g., to multistream 3.0) with no round trips and no funny business (learn from past mistakes). @@ -59,34 +59,34 @@ Notes: ### Multistream Advertise Unspeced (for now). Really, we just need to send a mapping of protocol -names/codecs to contextual IDs (and maybe some service discovery information). +names/codecs to dynamic IDs (and maybe some service discovery information). This is the subset of identify needed for protocol negotiation. ### Multistream Use -The `multistream/use` protocol is simply two varint multicodecs: the +The `multistream/multicodec` protocol is simply two varint multicodecs: the multistream-use multicodec followed by the multicodec for the protocol to be used. This protocol supports unidirectional streams. If the stream is bidirectional, the receiver must acknowledge a successful protocol negotiation by responding with the same multistream-use protocol sequence. Every stream starts with multistream-use.
Every other protocol defined here will -be assigned a multicodec and selected with `multistream/use.` +be assigned a multicodec and selected with `multistream/multicodec.` This protocol should *also* be trivial to optimize in hardware simply by prefix matching (i.e., matching on the first N (usually 16-32) bits of the stream/message). -### Multistream Dynamic +### Multistream String -The `multistream/dynamic` protocol is like the `multistream/use` protocol +The `multistream/string` protocol is like the `multistream/multicodec` protocol *except* that it uses a string to identify the protocol. To do so, the initiator simply sends a varint length followed by the name of the protocol. -Including the `multistream/use` portion, the initiator would send: +Including the `multistream/multicodec` portion, the initiator would send: ``` - + ``` Note: This used to use a fixed-width 16 bit number for a length. However, a @@ -94,9 +94,9 @@ varint *really* isn't going to cost us much, if anything, in terms of performance as most protocol names will be <= 128 bytes long. On the other hand, using different number formats everywhere *will* cost us in terms of complexity. -### Multistream Contextual +### Multistream Dynamic -The `multistream/contextual` protocol is used to select a protocol using a +The `multistream/dynamic` protocol is used to select a protocol using a *receiver specified*, session-ephemeral protocol ID. These IDs are analogues of ephemeral ports. @@ -106,19 +106,19 @@ In this protocol, the stream initiator sends a varint ID specified by the Format: ``` - + ``` The ID 0 is reserved for saying "same protocol" on a bidirectional stream. The -receiver of a bidirectional stream can't reuse the same contextual ID that the -initiator used as this contextual ID is relative *to* the receiver. Really, this +receiver of a bidirectional stream can't reuse the same dynamic ID that the +initiator used as this dynamic ID is relative *to* the receiver. 
Really, this last rule *primarily* exists to side-step the TCP simultaneous connect issue. This protocol has *also* been designed to be hardware friendly: 1. Hardware can compare the first 16 bits of the message against - ``. -2. It can then route the message based on the contextual ID. The fact that these + ``. +2. It can then route the message based on the dynamic ID. The fact that these IDs are chosen by the *receiver* means that the receiver can reuse the same IDs for all connected peers (reusing the same hardware routing table). @@ -228,31 +228,31 @@ Now that we're using multistream 2.0, the initiator will send, in a single packet: ``` - // use serial-stream to make the stream recoverable - // serial-stream message framing - // select advertise protocol - supported security protocols... // - -1 // return to multistream (EOF) + // use serial-stream to make the stream recoverable + // serial-stream message framing + // select advertise protocol + supported security protocols... // + -1 // return to multistream (EOF) - // open a new serial-stream + // open a new serial-stream - // select TLS - // initiate TLS + // select TLS + // initiate TLS ``` The receiver will respond with: ``` - // respond to serial stream + // respond to serial stream - // select advertise protocol + // select advertise protocol security protocols... - -1 // return to multistream (EOF) + -1 // return to multistream (EOF) - // respond to second serial stream - 0 // transition to a normal stream. - // select TLS - // complete TLS handshake + // respond to second serial stream + 0 // transition to a normal stream. + // select TLS + // complete TLS handshake ``` This: @@ -269,45 +269,45 @@ Finally, the initiator will finish the TLS negotiation, send an advertise packet, *optimistically* negotiate yamux, and send the DHT request. ``` - 0 // transition to a normal stream. + 0 // transition to a normal stream.
- // finish TLS + // finish TLS - // use serial-stream to make the stream recoverable - // serial-stream message framing - // select advertise protocol - // complete advertise information (protocols, etc.) - -1 // return to multistream (EOF) + // use serial-stream to make the stream recoverable + // serial-stream message framing + // select advertise protocol + // complete advertise information (protocols, etc.) + -1 // return to multistream (EOF) - // open a new serial-stream + // open a new serial-stream - // select multistream/dynamic - /yamux/1.0.0 // select yamux - // create the stream - // select multistream/dynamic - /ipfs/kad/1.0.0 // select kad dht 1.0 - // send the DHT request + // select multistream/string + /yamux/1.0.0 // select yamux + // create the stream + // select multistream/string + /ipfs/kad/1.0.0 // select kad dht 1.0 + // send the DHT request ``` And the receiver will send: ``` - // use serial-stream to make the stream recoverable - // serial-stream message framing - // select advertise protocol - // complete advertise information (protocols, etc.) - -1 // return to multistream (EOF) - - // open a new serial-stream - -1 // transition to that stream (we speak yamux) - - // select multistream/dynamic - /yamux/1.0.0 // select yamux - // respond to the new yamux stream - // select multistream/dynamic - - /ipfs/kad/1.0.0 // select kad dht - // send the DHT response + // use serial-stream to make the stream recoverable + // serial-stream message framing + // select advertise protocol + // complete advertise information (protocols, etc.) + -1 // return to multistream (EOF) + + // open a new serial-stream + -1 // transition to that stream (we speak yamux) + + // select multistream/string + /yamux/1.0.0 // select yamux + // respond to the new yamux stream + // select multistream/string + + /ipfs/kad/1.0.0 // select kad dht + // send the DHT response ``` Note: Ideally, we'd be able to avoid the optimistic yamux negotiation.
However, From 1ccfa7b88c798a40a2db8144d579609884c70a27 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Thu, 17 Jan 2019 11:13:23 +0000 Subject: [PATCH 11/15] multistream: update unidirectional comment --- multistream-2.0/spec.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index e8925bae6..97e61e622 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -49,7 +49,9 @@ Notes: 1. The "ls" feature of multistream has been removed. While useful, this really should be a *protocol*. Given the `serial-stream` protocol, this shouldn't be an issue as we can run as many sub-protocols over the same stream as we want. -2. To reduce RTTs, all protocols are unidirectional. +2. All multistream-2 protocols are unidirectional. On a bidirectional stream, + these protocols are run once in each direction with the receiver mirroring + the initiator. 3. These protocols were *also* designed to eventually support packet protocols. 4. We considered a `speculative-stream` protocol where the initiator speculatively starts multiple streams and the receiver acts on at most one. From 398355761bb611106c6de98d35e61ad4ba6ec80e Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Thu, 17 Jan 2019 11:15:21 +0000 Subject: [PATCH 12/15] multistream: clarify serial stream reset --- multistream-2.0/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index 97e61e622..8fb4cd2f4 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -139,7 +139,7 @@ The protocol is: Where the header is: -* -2 - Send a reset and return to multistream. All queued data (remote and +* -2 - Abnormal End: Send a reset and return to multistream. All queued data (remote and local) should be discarded. * -1 - Close: Send an EOF and return to multistream. * 0 - Rest: Ends the reuse protocol, transitioning to a direct stream. 
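As a rough illustration of the serial-stream header values quoted in this series (-2 abnormal end, -1 close, 0 rest, >0 data length), the following sketch decodes a single header. It assumes the 16 bit integer is signed and big-endian on the wire; the draft states neither, and the names `FrameKind` and `ParseHeader` are hypothetical rather than part of any libp2p API.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// FrameKind classifies a serial-stream header. The names are illustrative
// only; the draft does not define them.
type FrameKind int

const (
	FrameReset FrameKind = iota // -2: abnormal end, discard queued data
	FrameClose                  // -1: EOF, return to multistream
	FrameRest                   //  0: leave serial-stream, rest of the stream is direct
	FrameData                   // >0: header value is the payload length
)

var errUnknownHeader = errors.New("serial-stream: undefined header value")

// ParseHeader decodes a 2-byte header. Big-endian byte order and the signed
// interpretation are assumptions made for this sketch.
func ParseHeader(b [2]byte) (kind FrameKind, length int, err error) {
	h := int16(binary.BigEndian.Uint16(b[:]))
	switch {
	case h == -2:
		return FrameReset, 0, nil
	case h == -1:
		return FrameClose, 0, nil
	case h == 0:
		return FrameRest, 0, nil
	case h > 0:
		return FrameData, int(h), nil
	default:
		// Values below -2 are not defined by the draft.
		return 0, 0, errUnknownHeader
	}
}

func main() {
	headers := [][2]byte{{0xFF, 0xFE}, {0xFF, 0xFF}, {0x00, 0x00}, {0x00, 0x05}}
	for _, b := range headers {
		kind, n, err := ParseHeader(b)
		fmt.Println(kind, n, err)
	}
}
```

Because the header is fixed-width, a reader never has to buffer more than two bytes before deciding how to route the rest of the frame, which is presumably what makes the format easy to handle in hardware.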
From eeaea23027e9c3b9d0b1b47b8338c2f1ad9fe7a8 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Tue, 22 Jan 2019 16:33:43 -0800 Subject: [PATCH 13/15] multistream: add a protocol for handling simultanious open TODO: Move this elsewhere. It's not a part of multistream and is only relevant because it came up in the retro. --- multistream-2.0/spec.md | 69 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index 8fb4cd2f4..551b52084 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -187,6 +187,75 @@ Importantly: When we switch to multistream 2.0, we'll tag the connection (and any sub connections) with the multistream version. This way, we never have to do this again. +## TCP Simultaneous Open + +As noted in the [retrospective](retrospective.md), multistream 1.0 doesn't +provide any way to distinguish between the initiator and the receiver of a +stream. Multistream 2.0 doesn't either but it *does* allow us to handle the TCP +Simultaneous Open case without needing any additional round-trips in the fast +path. + +To make this work, we need a new protocol: "duplex-stream" (or whatever we want +to call it). This protocol allows one to bind two unidirectional streams +together into a single bidirectional stream. + +### Protocol + +The protocol is: + +1. The side that wants to be the "initiator" of a duplex stream sends an + "initiate stream ID" message (where ID is a randomly generated 256-bit number). +2. The receiver sends back "receive stream ID" on a different unidirectional stream. +3. The two streams are now joined. + +More specifically, + +1. The initiator sends: `0<32 bytes of randomness>` +1. The receiver sends: `1` + +If we end up in a situation where both peers want to be the initiator of a +single pair of unidirectional streams, the peer that picks the *lower* random ID +should back off and act as the receiver.
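The back-off rule above could be sketched as follows. This is only an illustration under stated assumptions: the IDs are compared as big-endian byte strings, and a tie (not covered by the draft, and vanishingly unlikely with 256 random bits) makes both sides back off. `StreamID`, `NewStreamID`, and `KeepInitiatorRole` are hypothetical names.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"fmt"
)

// StreamID is the 32-byte random ID carried in a duplex-stream "initiate"
// message. The type name is illustrative.
type StreamID [32]byte

// NewStreamID draws a fresh random ID from the system CSPRNG.
func NewStreamID() (StreamID, error) {
	var id StreamID
	_, err := rand.Read(id[:])
	return id, err
}

// KeepInitiatorRole reports whether the local peer stays the initiator when
// both sides sent an "initiate" message. The peer with the lower ID backs
// off. Byte-wise comparison and the handling of equal IDs are assumptions.
func KeepInitiatorRole(local, remote StreamID) bool {
	return bytes.Compare(local[:], remote[:]) > 0
}

func main() {
	local, err := NewStreamID()
	if err != nil {
		panic(err)
	}
	remote, err := NewStreamID()
	if err != nil {
		panic(err)
	}
	fmt.Println("local keeps initiator role:", KeepInitiatorRole(local, remote))
}
```

Both peers can evaluate this locally once they have seen each other's "initiate" message, so resolving the conflict needs no extra message beyond the retry itself.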
+ +### Usage + +We treat each new TCP connection as a pair of unidirectional streams and use +this protocol to bind them together. + +On connect, the initiator(s) will: + +1. Use `serial-stream` to make the stream recoverable. +2. Inside that "serial stream", it'll do the initiator half of the stream + handshake. +3. It'll then start the security negotiation as usual. + +If there is a receiver, it will: + +1. Handle the serial stream. +2. See the "duplex stream initiate". +3. Send a "duplex stream receive" on the other stream. +4. Handle the security negotiation. + +If there are two initiators, they will both. + +1. Handle the serial stream. +2. See the "duplex stream initiate" message. +3. Reset their outbound streams, dropping out of serial stream. +4. The side with the *larger* `RANDOM ID` will try again as the initiator. The + side with the smaller will switch to the receiver role. + +In practice, both sides should actually be quite a bit more flexible here. That +is, they should handle protocols as they're negotiated by the other peer instead +of simply *assuming* that the other peer will negotiate a specific protocol. + +For example, peers may want to send a bunch of unidirectional protocol +advertisements before switching to duplex mode. One or both sides may decide to +*not* use serial-stream to make the underlying connection recoverable (or they +may use it multiple times recursively). + +In other words, both sides should actually treat the read half of the TCP stream +as if it were an inbound unidirectional stream until it's not. + ## Example So, that was way too much how and not enough why or WTF? 
Let's try an example From 5e3aa0dbb1247963c490536c09ac7c8cd7511db6 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Wed, 23 Jan 2019 04:13:02 -0800 Subject: [PATCH 14/15] multistream: add some examples to the TCP simultanious open stuff --- multistream-2.0/spec.md | 32 +++++++++++++++++++++++++++----- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md index 551b52084..c3c63c211 100644 --- a/multistream-2.0/spec.md +++ b/multistream-2.0/spec.md @@ -210,8 +210,9 @@ The protocol is: More specifically, -1. The initiator sends: `0<32 bytes of randomness>` -1. The receiver sends: `1` +1. The initiator generates a 32 byte random ID (`ID`). +2. The initiator negotiates the `duplex-stream` protocol and then sends `0` (`0` is a single 0 byte). +3. The receiver negotiates the `duplex-stream` protocol and then sends `1` (`1` is a single 1 byte). If we end up in a situation where both peers want to be the initiator of a single pair of unidirectional streams, the peer that picks the *lower* random ID @@ -229,6 +230,17 @@ On connect, the initiator(s) will: handshake. 3. It'll then start the security negotiation as usual. +Data sent: + +``` + + + + 0 + + ... (security negotiation and stuff) ... +``` + If there is a receiver, it will: 1. Handle the serial stream. @@ -236,13 +248,23 @@ If there is a receiver, it will: 3. Send a "duplex stream receive" on the other stream. 4. Handle the security negotiation. -If there are two initiators, they will both. +Data sent: + +``` + + 1 + + ... (security negotiation and stuff) ... +``` + +If there are two initiators, they will both: 1. Handle the serial stream. 2. See the "duplex stream initiate" message. 3. Reset their outbound streams, dropping out of serial stream. -4. The side with the *larger* `RANDOM ID` will try again as the initiator. The - side with the smaller will switch to the receiver role. +4. 
The side with the *larger* `RANDOM ID` will try again as the initiator
+   (starting over from the top). The side with the smaller will switch to the
+   receiver role.
 
 In practice, both sides should actually be quite a bit more flexible here. That
 is, they should handle protocols as they're negotiated by the other peer instead

From 3c6128243f83344d8ba49ce4006483409697b00a Mon Sep 17 00:00:00 2001
From: Cole Brown
Date: Thu, 17 Oct 2019 19:17:19 -0400
Subject: [PATCH 15/15] Add proposal for packet oriented extensions

---
 multistream-2.0/packet-oriented.md | 40 ++++++++++++++++++++++++++++++
 multistream-2.0/spec.md            |  3 ++-
 2 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 multistream-2.0/packet-oriented.md

diff --git a/multistream-2.0/packet-oriented.md b/multistream-2.0/packet-oriented.md
new file mode 100644
index 000000000..9ba3cb90b
--- /dev/null
+++ b/multistream-2.0/packet-oriented.md
@@ -0,0 +1,40 @@
+# Packet-oriented Multiselect 2.0
+
+This proposal defines the packet-oriented extension of multiselect 2.0.
+
+## Protocols
+
+The needs of packet-oriented libp2p differ from those of stream-oriented libp2p
+in the following ways:
+
+- The underlying transports are message based, thus the requirements of a
+  multiplexer are significantly diminished. Messages across different protocols
+  are natively interleaved over a connectionless socket, so a multiplexer need
+  only tag messages with their protocol. There are no ordering or transmission
+  guarantees in packet-oriented transports such as UDP.
+- Packet-oriented communication is inherently unidirectional. Since there is no
+  notion of a connection or handshake in the underlying transports,
+  packet-oriented protocols may simply speculatively send along packets to their
+  peers until they receive a message instructing them otherwise.
+
+As a result, since the underlying transports (e.g.
UDP) delimit individual
+messages received by a socket, it stands that it would be reasonable to extend
+multiselect 2.0 for the packet-oriented case, adding or augmenting the following
+messages:
+
+- `multiselect/dynamic-inline`: Similar to `multiselect/dynamic`, but doesn't
+  depend on a previous `multiselect/advertise`. Message is of the format:
+
+  ```
+
+  ```
+
+  In this case, a dynamic identifier is established alongside the protocol's
+  string name.
+- `multiselect/dynamic`: An extended version of `multiselect/dynamic` that
+  supports the optional addition of a payload. This will be used to send
+  messages on a protocol for which a dynamic ID has been established via an
+  `advertise` or `dynamic-inline` message.
+- `multiselect/na`: Reject a protocol, referring to its dynamic identifier. This
+  always is in reference to an identifier established by the remote peer.
+- A version of `multiselect/multicodec` with optional payload
\ No newline at end of file
diff --git a/multistream-2.0/spec.md b/multistream-2.0/spec.md
index c3c63c211..52f5ca28b 100644
--- a/multistream-2.0/spec.md
+++ b/multistream-2.0/spec.md
@@ -20,7 +20,8 @@ anything.
 1. `multistream/advertise`: Inform the remote end about which protocols we
    speak. This should partially replace the current identify protocol.
 2. `multistream/multicodec`: Selects the stream's protocol using a multicodec.
-3. `multistream/string`: Selects the stream's protocol using a string protocol name.
+3. `multistream/string`: Selects the stream's protocol using a string protocol
+   name.
 4. `multistream/dynamic`: Selects the stream's protocol using a protocol ID
    defined by the *receiver*, valid for the duration of the "session"
    (underlying connection). To use this, the *receiver* must have used the