Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Circuit Relay Specification #22

Merged
merged 5 commits into from
Jul 14, 2017
Merged

The Circuit Relay Specification #22

merged 5 commits into from
Jul 14, 2017

Conversation

daviddias
Copy link
Member

@daviddias daviddias commented Jul 8, 2017

As promised on Friday, July 8, I've cleaned up, coalesced and added what was missing to the Relay spec, taking the bits from PRs #18 and #15 and all the comments.

Hope you enjoy :)

Copy link
Member

@dryajov dryajov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a few comments that I think we should address before merging, otherwise looks good!

source and the circuit followed to establish communication between the two.
This provides the destination side with full knowledge of the circuit which,
if needed, could be rebuilt in the opposite direction.
Unlike a transparent **tunnel**, where a libp2p peer would just proxy a communication stream to a destination (the destination being unaware of the original source), a circuit relay makes the destination aware of the original source and the circuit followed to establish communication between the two. This provides the destination side with full knowledge of the circuit which, if needed, could be rebuilt in the opposite direction.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this? It seems like there could be a lot of complexity involved in making this work properly, as opposed to leaving the job of re-establishing a dropped connection to the source. So far I have left this out in the js implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if needed, could be rebuilt

It is not handled by the module. It is "if needed, the dst has the information to rebuild the conn back"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

| +------- multiaddr of the listening node
|
+------------ multiaddr of the dialing node
message CircuitRelay {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but should we have a version as part of the message as well? I know its communicated as part of the multistream-select rpc end point, but having a version in the message might help preventing stupid mistakes like we made changes to the protobuff msg, but forgot to bump the multistream version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • protobufs have a very clean way to be updated already
  • "we made changes to the protobuf msg, but forgot to bump the multistream version" - this is not a valid technical reason, the truth is that adding a version won't stop you from making any of the mistakes you suggest, in fact, it opens the door to even more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid you're correct, reading a bit more into protobufs clarified this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

relay/README.md Outdated
enum Type { // RPC identifier, either HOP, STOP or STATUS
HOP = 1;
STOP = 2;
STATUS = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're returning the status as part of the message, we should probably make the return codes protobuf enums as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dryajov wanna PR that addition?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid sure thing.

Copy link
Member

@dryajov dryajov Jul 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid - here it is #23

Copy link

@JustinDrake JustinDrake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice to have a spec for this critical piece of infrastructure 👍

**Events:**
- phase I: Open a request for a relayed stream (A to R).
- A dials a new stream `sAR` to R using protocol `/libp2p/circuit/relay/0.1.0`.
- A sends a CircuitRelay message with `{ type: 'HOP', srcPeer: '/p2p/QmA', dstPeer: '/p2p/QmB' }` to R through `sAR`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this HOP message be immediately acknowledged by R with a message? Opening a new stream sRB can take some time, and it would be good for A to be able to distinguish timeouts where R is slow, or where B is slow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's acknowledged in the sense that:

  • a connection is open
  • a multistream + secio + spdy + multistream again to /libp2p/circuit/relay is done

The next ack is when R finishes sRB

relay/README.md Outdated
## Overview

The circuit relay is a means of establishing connectivity between
libp2p nodes (such as IPFS) that wouldn't otherwise be able to connect to each other.
The circuit relay is a mean to establish connectivity between libp2p nodes (e.g IPFS) that wouldn't otherwise be able to establish a direct connection to each other.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo "mean" should be "means". (The singular of "means" is "means".)

relay/README.md Outdated
| 251 | "failed to parse dst addr: no such protocol ipfs" | The `<dst>` multiaddr in the header was invalid |
| 260 | "passive relay has no connection to dst" | |
| 261 | "active relay could'nt dial to dst: conn refused" | relay could not form new connection to target peer |
| 262 | "could'nd dial to dst: BAD ERROR" | relay has conn to dst, but failed to open a stream |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo "could'nd" -> "couldn't"

- R opens a new stream `sRB` to B using protocol `/libp2p/circuit/relay/0.1.0`.
- R sends a CircuitRelay message with `{ type: 'STOP', srcPeer: '/p2p/QmA', dstPeer: '/p2p/QmB' }` on `sRB`.
- R sends a CircuitRelay message with `{ type: 'STATUS', code: 'OK' }` on `sAR`.
- phase III: Streams are piped together, establishing a circuit
Copy link

@JustinDrake JustinDrake Jul 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where exactly does the piping happen? Does it happen after R sends { type: 'STATUS', code: 'OK' } on sAR? If so, is there a race condition here? Should it happen before R sends that message?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's after R sends that { type: 'STATUS', code: 'OK' }. The order in which the items are listed is relevant.

Where is the race condition here?

Copy link

@JustinDrake JustinDrake Jul 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The race condition is as follows. A may try to send a message for B on sAR after receiving { type: 'STATUS', code: 'OK' } from R, but before piping has happened, i.e. before a circuit to B is properly formed. In practice, network latency will likely be high enough that piping will happen in time. Still, piping should maybe happen before sending { type: 'STATUS', code: 'OK' } to avoid the (theoretical) race condition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid agree with @JustinDrake in the current js implementation OK is sent after the pipe has been established otherwise there is no way to tell if the pipe is ready without an additional message that would indicate that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If A writes on sAR and sAR hasn't been piped to sRB then the message is buffered (backpressure).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If A writes on sAR and sAR hasn't been piped to sRB then the message is buffered (backpressure).

Backpressure may not completely address the race condition. The reason is that the recipient of messages sent to the sAR stream could be either R or B. As it stands the protocol is unambiguous because after A sends the initial HOP message to R it is not expected to send further messages to R. But in theory the protocol could be extended to allow for A to send further messages to R before the circuit is established. Doing things the other way round may be more futureproof.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this closer, I see another issue that might come up with multihop dialing. In shallow/matreshka dialing mode, there is no way for the R to tell weather the circuit has been completed or not until the last portion of the multihop address is dialed, for that reason I have the STOP handler signaling the OK rather than HOP.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lgierth @whyrusleeping @diasdavid wanna chime in on this one? Seems like the only contention point right now? I agree with the points that @JustinDrake is bringing up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure i see an issue here, If R sends the message to A, then pipes the connection, any message sent from A to R will just be buffered (or A will not be able to finish the write) until things are piped. It is assumed we are operating over a reliable ordered stream (not something like UDP where packets could get dropped if we don't read them)

Copy link
Member

@dryajov dryajov Jul 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@whyrusleeping I guess you're right. I remember dealing with some issues initially, which made me experiment a bit with where the ok is sent, but most likely it was unrelated to the ordering of the messages. I'll give it another go sinse I'll have to rewrite message handling anyways.

relay/README.md Outdated
| 251 | "failed to parse dst addr: no such protocol ipfs" | The `<dst>` multiaddr in the header was invalid |
| 260 | "passive relay has no connection to dst" | |
| 261 | "active relay could'nt dial to dst: conn refused" | relay could not form new connection to target peer |
| 262 | "could'nd dial to dst: BAD ERROR" | relay has conn to dst, but failed to open a stream |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a BAD ERROR? The name "BAD" isn't very descriptive.

relay/README.md Outdated
@@ -1,69 +1,48 @@
# Circuit Relay

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add the version number (0.1.0)

relay/README.md Outdated
Examples:

- `/p2p-circuit/p2p/QmVT6GYwjeeAF5TR485Yc58S3xRF5EFsZ5YAF4VcP3URHt` - Arbitrary relay node
- `/ip4/127.0.0.1/tcp/5002/ipfs/QmdPU7PfRyKehdrP5A3WqmjyD6bhVpU1mLGKppa2FjGDjZ/p2p-circuit/p2p/QmVT6GYwjeeAF5TR485Yc58S3xRF5EFsZ5YAF4VcP3URHt` - Specific relay node

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ipfs should be /p2p. Same with /ipfs/QmTwo => /p2p/QmTwo.


| Code | Message | Meaning |
| ----- |:--------------------------------------------------|:----------:|
| 100 | OK | Relay was setup correctly |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be "One leg of the relay was setup correctly", instead of the full relay?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very good question, I need it needs to be clarified.

Dialing from A to R self-confirms because of all the libp2p dance (multistream, secio, protocol muxing), we really just need to signal that we managed to reach the other end (B).

This brings be back to => #22 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -73,106 +52,156 @@ Cast:
- QmRelay, a node which speaks the circuit relay protocol (go-ipfs or js-ipfs).

Scene 1:
- QmOne wants to connect to QmTwo,
and through peer routing has acquired a set of addresses of QmTwo.
- QmOne wants to connect to QmTwo, and through peer routing has acquired a set of addresses of QmTwo.
- QmTwo doesn't support any of the transports used by QmOne.
- Awkward silence.

Scene 2:
- All three nodes have learned to speak the `/ipfs/relay/circuit` protocol.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ipfs/relay/circuit should be /libp2p/relay/circuit.

relay/README.md Outdated
## Overview

The circuit relay is a means of establishing connectivity between
libp2p nodes (such as IPFS) that wouldn't otherwise be able to connect to each other.
The circuit relay is a means to establish connectivity between libp2p nodes (e.g IPFS) that wouldn't otherwise be able to establish a direct connection to each other.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"(e.g IPFS)" => "(e.g. IPFS nodes)"

(add missing . and clarify)

relay/README.md Outdated
(instead of e.g. TCP.): One node asks a relay node to connect to another node
on its behalf. The relay node shortcircuits its streams to the two nodes,
and they are then connected through the relay.
Apart from that, this relayed connection behaves just like a regular connection would, but over an existing swarm stream with another peer (instead of e.g. TCP.). A node asks a relay node to connect to another node on its behalf. The relay node short-circuits streams between the two nodes, enabling them to reach each other.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"(instead of e.g. TCP.)" => remove extraneous dot

`/p2p-circuit` multiaddrs don't carry any meaning of their own.
They need to encapsulate a `/p2p` address, or
be encapsulated in a `/p2p` address, or both.
`/p2p-circuit` multiaddrs don't carry any meaning of their own. They need to encapsulate a `/p2p` address, or be encapsulated in a `/p2p` address, or both.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can a /p2p-circuit multiaddr be encapsulated in a /p2p address without encapsulating a /p2p address? In other words, shouldn't the destination /p2p address be mandatory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wording might not be perfect. Could you illustrate the case that doesn't make sense to you?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case that is not allowed is <relay peer multiaddr>/p2p-circuit/ (destination omitted). What about the following wording?

A /p2p-circuit multiaddress needs to encapsulate a /p2p destination address, and optionally be encapsulated by a /p2p relay address.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be added as additional bullet points to the main sentense:

/p2p-circuit multiaddrs don't carry any meaning of their own. They need to encapsulate a /p2p address, or be encapsulated in a /p2p address, or both.

  • /p2p-circuit multiaddress encapsulates a /p2p destination address - e.g. /p2p-circuit/p2p/QmHash
  • can optionally be encapsulated by a /p2p relay address - e.g. /p2p/ipfs/QmRelay/p2p-circuit/p2p/QmDest


We have considered more features but won't be adding them on the first iteration of Circuit Relay, the features are:

- Multihop relay - With this specification, we are only enabling single hop relays to exist. Multihop relay will come at a later stage as Packet Switching.
Copy link
Member

@dryajov dryajov Jul 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The js does implement multihop dialing in the form of shallot/matreshka dialing. This was based on the conversations we had at the time of me starting the implementation, what has changed that we now have scrapped this part?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is nothing wrong with experimentation. It was part of Future Work spec wise from the beginning.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. :)

@daviddias
Copy link
Member Author

daviddias commented Jul 12, 2017

From IRC:

06:06 <@daviddias> Seems that other than some miner typos, there are no outstanding issues in the relay spec
06:07 <@daviddias> Shall we merge it today?
06:07 <@whyrusleeping> sounds good to me

📣 If no blockers appear, I shall merge the spec today 📯

@JustinDrake
Copy link

@diasdavid Are you going to address #22 (comment) before merging?

@daviddias
Copy link
Member Author

@JustinDrake addressed last Wednesday

@daviddias
Copy link
Member Author

All right, this is it, no one has raised any other blockers, we can say that this spec is COMPLETE

@daviddias daviddias merged commit 596c4f8 into master Jul 14, 2017
@daviddias daviddias deleted the spec/circuit-relay branch July 14, 2017 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants