error codes: provide error codes on stream reset and connection close #623

Open
wants to merge 4 commits into
base: master

Conversation

sukunrt
Member

@sukunrt sukunrt commented Jul 16, 2024

No description provided.

@sukunrt sukunrt requested review from MarcoPolo and achingbrain July 19, 2024 06:48
Contributor

@MarcoPolo MarcoPolo left a comment

Seems reasonable, let's do it!

Contributor

@yiannisbot yiannisbot left a comment

Mostly cosmetic nits. Thanks for putting this together.

Contributor

@marten-seemann marten-seemann left a comment

You probably want to create a registry, so application protocols can register ranges of values. Otherwise, it's likely that you'll end up with collisions.

Comment on lines 4 to 6
In the event that a node detects violation of a protocol or is unable to
complete the necessary steps required for the protocol, it's useful to provide a
reason for disconnection to the other end. This error code can be sent on both
Contributor

It's broader than this. This doesn't only apply to protocol violations (which should be rare), but also to common events like running into resource limits, connections being pruned by the connection manager, etc.

Member Author

I've rephrased this and removed the specific reference to protocol errors.

Member Author

sukunrt commented Sep 11, 2024

@marten-seemann

You probably want to create a registry, so application protocols can register ranges of values. Otherwise, it's likely that you'll end up with collisions.

Do we need to? Application protocols can have conflicting error codes. From the application context, it's clear which is the relevant error. I am planning to reserve a space for libp2p to use for its errors and let applications do whatever they like with the application error codes space.
This is similar to how QUIC does it. For example,
DNS over QUIC: https://www.rfc-editor.org/rfc/rfc9250.html
RTP over QUIC: https://www.ietf.org/archive/id/draft-ietf-avtcore-rtp-over-quic-11.html
have conflicting error codes.

@marten-seemann
Contributor

This is similar to how QUIC does it. For example,
DNS over QUIC: https://www.rfc-editor.org/rfc/rfc9250.html
RTP over QUIC: https://www.ietf.org/archive/id/draft-ietf-avtcore-rtp-over-quic-11.html
have conflicting error codes.

They're not conflicting, since the application protocol is negotiated during the QUIC handshake.

From the application context, it's clear which is the relevant error.

This is not correct, thanks to multistream.

@sukunrt
Member Author

sukunrt commented Sep 11, 2024

I see. All connections are just libp2p connections, and they can speak multiple application protocols, any of which may close the underlying connection on error.
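
To illustrate the collision concern discussed above, here is a minimal Go sketch. The protocol IDs, code values, and error names are all hypothetical and are not taken from this PR or from any libp2p implementation. It shows that when two protocols sharing one connection define the same numeric code independently, the receiver of a connection-level close code cannot tell which meaning was intended.

```go
package main

import (
	"fmt"
	"sort"
)

// protocolCodes maps hypothetical protocol IDs to error code tables that
// each protocol defined on its own. Both happen to use code 1.
var protocolCodes = map[string]map[uint32]string{
	"/example/dht/1.0.0":    {1: "QUERY_TOO_LARGE"},
	"/example/pubsub/1.0.0": {1: "RATE_LIMITED"},
}

// possibleMeanings returns every interpretation of a connection close code,
// given that any protocol running on the connection may have sent it.
func possibleMeanings(code uint32) []string {
	var out []string
	for proto, codes := range protocolCodes {
		if name, ok := codes[code]; ok {
			out = append(out, proto+": "+name)
		}
	}
	sort.Strings(out) // map iteration order is random, so sort for stable output
	return out
}

func main() {
	// More than one entry means the code alone is ambiguous.
	for _, m := range possibleMeanings(1) {
		fmt.Println(m)
	}
}
```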

@sukunrt sukunrt self-assigned this Nov 18, 2024
@sukunrt sukunrt force-pushed the error-codes branch 3 times, most recently from 196f49e to 168a610 Compare November 19, 2024 10:09
@sukunrt sukunrt marked this pull request as ready for review November 19, 2024 10:15
```

### Multistream Select
Multistream-Select is used to negotiate the security protocol for TCP connections before a stream muxer has been selected. Only one error code is defined for such cases: `PROTOCOL_NEGOTIATION_FAILED`. To encode this error, send the string `101` prefixed with its length, then close the TCP connection.
Member Author

This is not too useful. The most common case here is a server rejecting a connection because handling it would exceed some resource limit. In such cases, writing to the connection and then closing it will cause the write to be dropped, because there will almost certainly be unread data in the read buffer.
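
For reference, the encoding described in the quoted hunk can be sketched in Go as follows. This assumes the length prefix is an unsigned varint, as multistream-select uses for its messages. The quoted text does not say whether a trailing newline is included, so the sketch omits it. The function and constant names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// protocolNegotiationFailed is the code from the quoted text, sent as a
// decimal string rather than as a binary integer.
const protocolNegotiationFailed = "101"

// encodeLengthPrefixed prepends an unsigned-varint length to msg.
// Assumption: no trailing newline is added, since the quoted text does not
// mention one.
func encodeLengthPrefixed(msg string) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, uint64(len(msg)))
	return append(buf[:n], msg...)
}

func main() {
	// Print the bytes that would be written before closing the connection.
	fmt.Printf("%q\n", encodeLengthPrefixed(protocolNegotiationFailed))
}
```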

## Error Codes Registry
Libp2p connections are shared by multiple applications. The same connection used by the DHT may also be used by GossipSub or any other application. Any of these applications can close the underlying connection on an error, resetting streams used by the other applications. To correctly distinguish which application closed the connection, Connection Close error codes are allocated to applications from a central registry.

For simplicity, we manage both Connection Close and Stream Reset error codes from a central registry. The libp2p error codes registry is at: https://github.com/libp2p/error-codes/
Contributor

Is there much value in having this be a separate repo? Why not have it in the specs repo in a sub folder of error-codes?

Delivery of the Connection Close error code to the other end depends on the OS TCP implementation and the socket options in use. In particular, when the `SO_LINGER` socket option is set to 0 and the implementation closes the connection immediately after writing the frame containing the error code, the error code may not be delivered.
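
One way to make delivery of a final frame more likely on TCP is a lingering close: write the frame, half-close the write side, drain whatever the peer has already sent, and only then close. The Go sketch below uses only the standard library (`net.TCPConn.CloseWrite`, read deadlines). The helper names, the one-second wait, and the loopback demo are assumptions made for illustration, and the approach improves the odds of delivery without guaranteeing it.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

// closeWithFrame writes frame, sends FIN via CloseWrite so the peer sees the
// data followed by EOF, drains unread inbound bytes for up to wait, and then
// closes the socket. Closing with unread inbound data or with a zero linger
// timeout can make the OS send a reset instead, which may discard the frame.
func closeWithFrame(conn *net.TCPConn, frame []byte, wait time.Duration) error {
	if _, err := conn.Write(frame); err != nil {
		conn.Close()
		return err
	}
	if err := conn.CloseWrite(); err != nil {
		conn.Close()
		return err
	}
	conn.SetReadDeadline(time.Now().Add(wait))
	io.Copy(io.Discard, conn) // returns on peer EOF or when the deadline expires
	return conn.Close()
}

// receiveAfterGracefulClose runs a loopback demo and returns what the dialing
// side received before EOF.
func receiveAfterGracefulClose() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	done := make(chan error, 1)
	go func() {
		c, err := ln.Accept()
		if err != nil {
			done <- err
			return
		}
		done <- closeWithFrame(c.(*net.TCPConn), []byte("\x03101"), time.Second)
	}()

	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	// Send data that the server never parses, to mimic a busy read buffer.
	if _, err := c.Write([]byte("hello")); err != nil {
		c.Close()
		return "", err
	}
	got, err := io.ReadAll(c)
	c.Close() // lets the server's drain loop see EOF promptly
	if err != nil {
		return "", err
	}
	if err := <-done; err != nil {
		return "", err
	}
	return string(got), nil
}

func main() {
	got, err := receiveAfterGracefulClose()
	fmt.Printf("%q %v\n", got, err)
}
```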

### WebRTC
There is no way to provide any information on closing a peer connection in
Contributor

The wording here is confusing.

Libp2p streams are reset unilaterally: calling `Reset` on a stream resets both the read and write ends of the stream. For transports like QUIC, which support cancelling the read and write ends of a stream separately, implementations MAY provide the ability to signal error codes separately when resetting either end.
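
The reset semantics in the quoted text can be modelled with a small Go sketch. This is not the API of any libp2p implementation: the type, the method names `CancelRead` and `CancelWrite`, the `uint32` code type, and the rule that the first code recorded for an end wins are all assumptions made for illustration.

```go
package main

import "fmt"

// streamModel records the error code each end of a stream was reset with.
// A nil pointer means that end has not been reset.
type streamModel struct {
	readCode  *uint32
	writeCode *uint32
}

// CancelRead resets only the read end. Assumption: an end that is already
// reset keeps its first code.
func (s *streamModel) CancelRead(code uint32) {
	if s.readCode == nil {
		s.readCode = &code
	}
}

// CancelWrite resets only the write end, with the same first-code-wins rule.
func (s *streamModel) CancelWrite(code uint32) {
	if s.writeCode == nil {
		s.writeCode = &code
	}
}

// Reset resets both ends with one code, matching the unilateral reset
// described in the quoted text.
func (s *streamModel) Reset(code uint32) {
	s.CancelRead(code)
	s.CancelWrite(code)
}

func main() {
	s := &streamModel{}
	s.CancelWrite(7) // a transport such as QUIC could signal this separately
	s.Reset(3)       // the read end takes 3, the write end keeps 7
	fmt.Println(*s.readCode, *s.writeCode)
}
```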

## Error Codes Registry
Libp2p connections are shared by multiple applications. The same connection used by the DHT may also be used by GossipSub or any other application. Any of these applications can close the underlying connection on an error, resetting streams used by the other applications. To correctly distinguish which application closed the connection, Connection Close error codes are allocated to applications from a central registry.
Contributor

We should outline a process to allocate these error codes. e.g. something as simple as:

  • A user requests a certain number of error codes.
  • We ensure it's a reasonable request.
    • Verify the number is reasonable (e.g. not 16k errors)
    • Verify there is no conflict.
  • Allocate that in a high number space.

The process exists so that we can make it easier for future reviewers.
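
The allocation steps suggested above could be sketched in Go roughly as follows. The base of the application space (`0x10000`), the per-request limit of 1024 codes, and all type and function names are hypothetical. The PR does not define any of these values.

```go
package main

import (
	"errors"
	"fmt"
)

// codeRange is an inclusive range of error codes assigned to one owner.
type codeRange struct {
	Owner string
	First uint32
	Last  uint32
}

// registry hands out consecutive ranges starting at base.
type registry struct {
	base      uint32 // start of the high number space (assumed value)
	maxPerReq uint32 // largest request considered reasonable (assumed value)
	ranges    []codeRange
}

// overlaps reports whether [first, last] intersects an existing allocation.
func (r *registry) overlaps(first, last uint32) bool {
	for _, cr := range r.ranges {
		if first <= cr.Last && cr.First <= last {
			return true
		}
	}
	return false
}

// Allocate checks that the request size is reasonable, picks the next free
// range in the high number space, and verifies that it conflicts with nothing.
func (r *registry) Allocate(owner string, count uint32) (codeRange, error) {
	if count == 0 || count > r.maxPerReq {
		return codeRange{}, fmt.Errorf("unreasonable request for %d codes", count)
	}
	first := r.base
	if n := len(r.ranges); n > 0 {
		first = r.ranges[n-1].Last + 1
	}
	last := first + count - 1
	if first < r.base || last < first { // uint32 wrap-around
		return codeRange{}, errors.New("error code space exhausted")
	}
	if r.overlaps(first, last) {
		return codeRange{}, errors.New("range conflicts with an existing allocation")
	}
	cr := codeRange{Owner: owner, First: first, Last: last}
	r.ranges = append(r.ranges, cr)
	return cr, nil
}

func main() {
	r := &registry{base: 0x10000, maxPerReq: 1024}
	a, _ := r.Allocate("/example/a/1.0.0", 16)
	b, _ := r.Allocate("/example/b/1.0.0", 32)
	_, err := r.Allocate("/example/c/1.0.0", 16384)
	fmt.Printf("%#x-%#x %#x-%#x %v\n", a.First, a.Last, b.First, b.Last, err)
}
```

Having the registry pick consecutive ranges itself keeps the conflict check trivial. A registry that accepts caller-chosen ranges would rely on the overlap check instead.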

Labels
None yet
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

Proposal: provide error codes when closing connections and resetting streams
5 participants