error codes: provide error codes on stream reset and connection close #623
Seems reasonable, let's do it!
Mostly cosmetic nits. Thanks for putting this together.
You probably want to create a registry, so application protocols can register ranges of values. Otherwise, it's likely that you'll end up with collisions.
error-codes/README.md
Outdated
In the event that a node detects violation of a protocol or is unable to
complete the necessary steps required for the protocol, it's useful to provide a
reason for disconnection to the other end. This error code can be sent on both
It's broader than this. This doesn't only apply to protocol violations (which should be rare), but also to common events like running into resource limits, connections being pruned by the connection manager, etc.
I've rephrased this and removed the specific reference to protocol errors.
3b23cde to 070f090
Do we need to? Application protocols can have conflicting error codes. From the application context, it's clear which is the relevant error. I am planning to reserve a space for libp2p to use for its errors and let applications do whatever they like with the application error codes space.
They're not conflicting, since the application protocol is negotiated during the QUIC handshake.
This is not correct, thanks to multistream.
d0db0b2 to 3060cd0
I see. All connections are just libp2p connections and they can speak multiple application protocols, any of which may close the underlying connection on error.
196f49e to 168a610
### Multistream Select
Multistream-Select is used to negotiate Security protocol for TCP connections before a stream muxer has been selected. There's only one error code defined for such cases, `PROTOCOL_NEGOTIATION_FAILED`. To encode this error, send the string `101` prefixed with the length and close the TCP connection.
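To make the proposed encoding concrete, here is a minimal sketch of "the string `101` prefixed with the length". It assumes the length prefix is an unsigned varint, as multistream-select uses for its messages, and it does not append a trailing newline because the quoted text does not mention one. The function names are hypothetical and not taken from any libp2p implementation.

```python
def encode_uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            # More bytes follow: set the continuation bit.
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_error_code(code: int) -> bytes:
    """Encode an error code as its decimal string, prefixed with its length.

    Assumption: the prefix is a uvarint and no newline is appended.
    """
    payload = str(code).encode("ascii")
    return encode_uvarint(len(payload)) + payload


# PROTOCOL_NEGOTIATION_FAILED as described in the quoted text.
frame = encode_error_code(101)
```

Under these assumptions the sender would write `frame` to the TCP connection and then close it.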
This is not too useful. The most common case here is a server rejecting a connection because handling it would exceed some resource limit. In such cases, writing on the connection and then closing it will cause the write to be dropped, because there will almost certainly be unread data in the read buffer.
## Error Codes Registry
Libp2p connections are shared by multiple applications. The same connection used in the dht may be used for gossip sub, or for any other application. Any of these applications can close the underlying connection on an error, resetting streams used by the other applications. To correctly distinguish which application closed the connection, Connection Close error codes are allocated to applications from a central registry.

For simplicity, we manage both Connection Close and Stream Reset error codes from a central registry. The libp2p error codes registry is at: https://github.com/libp2p/error-codes/
Is there much value in having this be a separate repo? Why not have it in the specs repo in a sub folder of error-codes?
Connection Close error code delivery to the other end depends on the OS TCP implementation and the TCP options used for the socket. In particular, when `SO_LINGER` TCP option is set to 0 and the implementation closes the connection immediately after writing the error code containing frame, the error code may not be delivered.
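For readers unfamiliar with the option, the following sketch shows what "`SO_LINGER` set to 0" looks like at the socket level. It assumes a POSIX-style `struct linger` made of two C ints (Windows uses a different layout), and the helper names are hypothetical. With linger enabled and a timeout of 0, `close()` on common TCP stacks aborts the connection and discards unsent data, which is why a just-written error code frame may never reach the peer.

```python
import socket
import struct

# Assumption: POSIX layout of struct linger { int l_onoff; int l_linger; }.
LINGER_FORMAT = "ii"


def set_linger_zero(sock: socket.socket) -> None:
    """Enable SO_LINGER with a zero timeout on the given socket."""
    sock.setsockopt(
        socket.SOL_SOCKET, socket.SO_LINGER, struct.pack(LINGER_FORMAT, 1, 0)
    )


def get_linger(sock: socket.socket) -> tuple:
    """Return (l_onoff, l_linger) as currently configured on the socket."""
    raw = sock.getsockopt(
        socket.SOL_SOCKET, socket.SO_LINGER, struct.calcsize(LINGER_FORMAT)
    )
    return struct.unpack(LINGER_FORMAT, raw)
```

An implementation that wants the error code delivered would need to avoid this configuration, or avoid closing immediately after the write.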
### WebRTC
There is no way to provide any information on closing a peer connection in
The wording here is confusing.
Libp2p streams are reset unilaterally, calling `Reset` on a stream resets both the read and write end of a stream. For transports, like QUIC, which support cancelling the read and write ends of the stream separately, implementations MAY provide the ability to signal error codes separately on resetting either end.
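The distinction between a full reset and per-direction resets can be illustrated with a small model. This is a sketch only: `StreamResetState` and its methods are hypothetical names, not an API of any libp2p implementation, and the model simply records which error code was signalled for each direction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamResetState:
    """Tracks the error code signalled when each end of a stream is reset."""

    read_reset_code: Optional[int] = None
    write_reset_code: Optional[int] = None

    def reset_read(self, code: int) -> None:
        # Cancel only the read end (QUIC exposes this via STOP_SENDING).
        self.read_reset_code = code

    def reset_write(self, code: int) -> None:
        # Cancel only the write end (QUIC exposes this via RESET_STREAM).
        self.write_reset_code = code

    def reset(self, code: int) -> None:
        # A plain reset is unilateral: both ends are reset with the same code.
        self.reset_read(code)
        self.reset_write(code)
```

A transport without separate cancellation would expose only `reset`, while a QUIC-like transport could additionally expose the two per-direction methods.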
## Error Codes Registry
Libp2p connections are shared by multiple applications. The same connection used in the dht may be used for gossip sub, or for any other application. Any of these applications can close the underlying connection on an error, resetting streams used by the other applications. To correctly distinguish which application closed the connection, Connection Close error codes are allocated to applications from a central registry.
We should outline a process to allocate these error codes, e.g. something as simple as:
- A user requests a certain number of error codes.
- We ensure it's a reasonable request.
  - Verify the number is reasonable (e.g. not 16k errors).
  - Verify there is no conflict.
- Allocate that in a high number space.
The process exists so that we can make it easier for future reviewers.
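The allocation steps suggested above could be sketched as follows. Everything concrete here is an assumption made for illustration: the cap `MAX_REQUEST`, the start of the "high number space" `APP_SPACE_START`, and the `Registry` class itself are hypothetical and do not come from the PR.

```python
from dataclasses import dataclass, field
from typing import Optional

MAX_REQUEST = 1024          # hypothetical cap on codes per request
APP_SPACE_START = 0x10000   # hypothetical start of the application space


@dataclass
class Registry:
    """Maps application names to the error code ranges allocated to them."""

    allocations: dict = field(default_factory=dict)

    def allocate(self, app: str, start: int, count: int) -> range:
        # Verify the number of requested codes is reasonable.
        if not 0 < count <= MAX_REQUEST:
            raise ValueError(f"unreasonable request size: {count}")
        # Only allocate inside the high number space.
        if start < APP_SPACE_START:
            raise ValueError("range must lie in the application space")
        requested = range(start, start + count)
        # Verify there is no conflict with an existing allocation.
        for owner, existing in self.allocations.items():
            if requested.start < existing.stop and existing.start < requested.stop:
                raise ValueError(f"range conflicts with {owner}")
        self.allocations[app] = requested
        return requested

    def owner_of(self, code: int) -> Optional[str]:
        """Return the application that owns an error code, if any."""
        for app, block in self.allocations.items():
            if code in block:
                return app
        return None
```

With such a registry, a peer receiving a Connection Close error code could call `owner_of` to tell which application closed the connection.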