
relay/dcutr/quic: Alternate host sending garbage UDP packet #487

Open
mxinden opened this issue Nov 29, 2022 · 14 comments

@mxinden
Member

mxinden commented Nov 29, 2022

Today, during direct connection upgrade over QUIC, A sends a client hello and B sends random bytes. A's client hello makes it through B's firewall and/or NAT via the hole punched by B's random bytes. B's subsequent server hello makes it through A's firewall and/or NAT via the hole punched by A's client hello.

  • For a QUIC address:
    • Upon receiving the Sync, A immediately dials the address to B.
    • Upon expiry of the timer, B starts to send UDP packets filled with
      random bytes to A's address. Packets should be sent repeatedly in
      random intervals between 10 and 200 ms.
    • This will result in a QUIC connection where A is the client and B is
      the server.

https://github.com/libp2p/specs/blob/master/relay/DCUtR.md#the-protocol
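
To make the quoted step concrete, here is a rough sketch in Go of B sending garbage UDP packets at random 10-200 ms intervals toward A's advertised address. This is only an illustration of the spec's description, not the go-libp2p (or rust-libp2p) implementation; the helper name and the address are made up.

```go
// Minimal sketch of the quoted hole-punch step: B repeatedly fires garbage
// UDP packets at A's advertised address at random 10-200 ms intervals so
// that a hole is punched in B's own firewall/NAT.
// Illustrative only; not the actual libp2p implementation.
package main

import (
	"crypto/rand"
	"log"
	"math/big"
	"net"
	"time"
)

func punchWithGarbage(remote *net.UDPAddr, rounds int) error {
	conn, err := net.DialUDP("udp", nil, remote)
	if err != nil {
		return err
	}
	defer conn.Close()

	buf := make([]byte, 64)
	for i := 0; i < rounds; i++ {
		if _, err := rand.Read(buf); err != nil {
			return err
		}
		if _, err := conn.Write(buf); err != nil {
			return err
		}
		// Sleep for a random interval between 10 and 200 ms.
		n, _ := rand.Int(rand.Reader, big.NewInt(191))
		time.Sleep(time.Duration(10+n.Int64()) * time.Millisecond)
	}
	return nil
}

func main() {
	// A's advertised address (example value).
	addr, err := net.ResolveUDPAddr("udp", "203.0.113.7:4001")
	if err != nil {
		log.Fatal(err)
	}
	if err := punchWithGarbage(addr, 20); err != nil {
		log.Fatal(err)
	}
}
```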

Now assume that B is behind a symmetric NAT but A is not. A's client hello will not make it through B's NAT, given that it (most likely) does not have the same destination port as the translated source port of B's random bytes.

If we alternated the roles on retries in DCUtR, e.g. had B be the one to send random bytes in round 1 and A be the one to send random bytes in round 2, the above scenario would succeed on the second try, given that A is not behind a symmetric NAT.

Note that we could just as well have both A and B send client hellos in the first round. The downside is that, in contrast to TCP simultaneous open, we might end up with two QUIC connections: one from A to B and one from B to A.
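
For illustration, the alternation proposed above could look something like the following hypothetical helper, which swaps who dials and who sends garbage on every retry. This is only a sketch of the idea, not part of the DCUtR spec or any implementation.

```go
// Sketch of the alternation idea: on even attempts keep the roles from the
// current spec (B sends random bytes, A dials); on odd attempts swap them.
// Hypothetical helper, not part of the DCUtR spec or any implementation.
package dcutr

// Role describes what a peer does during one hole-punch attempt.
type Role int

const (
	DialQUIC        Role = iota // send the QUIC client hello
	SendRandomBytes             // send garbage UDP packets to punch a hole
)

// roleForAttempt returns this peer's role for a given attempt (0-indexed).
// inbound is true for B, the peer that received the Connect over the relay.
func roleForAttempt(inbound bool, attempt int) Role {
	swap := attempt%2 == 1
	if inbound != swap {
		return SendRandomBytes
	}
	return DialQUIC
}
```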

//CC @elenaf9 and @dennis-tra as discussed today.

mxinden changed the title from "relay/dcutr/quic: Consider alternating host sending garbage UDP packet" to "relay/dcutr/quic: Alternate host sending garbage UDP packet" on Nov 29, 2022
@marten-seemann
Contributor

Now assume that B is behind a symmetric NAT but B is not. A's client hello will not make it through B's NAT, given that it (most likely) does not have the same destination port as the translated source port of B's random bytes.

If we alternated the roles on retries in DCUtR, e.g. had B be the one to send random bytes in round 1 and A be the one to send random bytes in round 2, the above scenario would succeed on the second try, given that A is not behind a symmetric NAT.

I'm not sure I understand how this is supposed to work.
First of all, there are no rounds. A and B send their packets at the same time.
In both scenarios, A and B send UDP packets to one specific destination address, respectively. Why does the payload of the UDP packets (random bytes or ClientHello) matter?

@mxinden
Member Author

mxinden commented Nov 29, 2022

First of all, there are no rounds. A and B send their packets at the same time.

A and B send their packets at the same time. On failure B retries the DCUtR flow.

On failure of all connection attempts go back to step (1). Inbound peers (here B) SHOULD retry twice (thus a total of 3 attempts) before considering the upgrade as failed.

https://github.com/libp2p/specs/blob/master/relay/DCUtR.md#the-protocol

In both scenarios, A and B send UDP packets to one specific destination address, respectively. Why does the payload of the UDP packets (random bytes or ClientHello) matter?

Say that B is behind a symmetric NAT and A isn't. A sends a client hello to B which is dropped at B's NAT given that A does not know B's NATed port. B's random bytes make it through A's NAT through the hole punched by A's client hello. A discards B's random bytes. End result is no connection.

Say that B is still behind a symmetric NAT and A isn't, BUT B sends a client hello to A and A sends random bytes to B. B's client hello will make it through A's NAT through the hole punched by A's random bytes. A's random bytes are dropped at B's NAT, given that A does not know B's NATed port. The latter doesn't matter: A receives B's client hello and responds with a server hello. End result is an established connection.

By alternating on retry who sends the random bytes, we would succeed on the second try.

Does the above make sense @marten-seemann?

@marten-seemann
Contributor

That makes sense, thank you for the clarification!

Is the proposal to reduce the number of retries to 2 (given that @dennis-tra's measurements show that there's no point in trying more than once), and to alternate the roles after the first attempt?

@mxinden
Member Author

mxinden commented Nov 29, 2022

I am not yet sure what the best strategy would be. Alternating across (re-)tries was the first that came to mind.

Unfortunately, having both endpoints send client hellos from the start might result in two connections. If that weren't the case, this would be my favorite strategy, given that it is the fastest.

Is the proposal to reduce the number of retries to 2

In my eyes, that is an orthogonal change.

@MarcoPolo
Contributor

Now assume that B is behind a symmetric NAT but B is not.

Typo? A is not?

@mxinden
Member Author

mxinden commented Nov 29, 2022

Now assume that B is behind a symmetric NAT but B is not.

Typo? A is not?

Thanks for the catch @MarcoPolo. Fixed.

@dennis-tra

dennis-tra commented Nov 30, 2022

As you said, Marten, the measurement results suggest that if it doesn't work on the first attempt, it likely won't work on any subsequent one.

So, I think the optimizations here would be to either

  1. decrease the number of attempts or
  2. change the strategy for subsequent attempts.

What Max suggests here is option 2: changing something about the way we try to hole punch on the second attempt, which I also find to be the better option.

Switching the client/server roles makes sense to me for the reasons that Max explained. However, one thing to consider: if B is behind a symmetric NAT, I'd assume that B won't be able to determine its OwnObservedAddrs, because the identify protocol would report inconsistent address/port combinations. This would (at least in the current implementation) prevent a hole punch.
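
As an illustration of the concern above, a node could flag a likely symmetric NAT when identify reports inconsistent external ports for the same listener. The helper below is a hypothetical sketch, not how go-libp2p's NAT-type or observed-address logic is actually implemented.

```go
// Sketch of the symmetric-NAT heuristic described above: if identify
// reports several different external ports for the same local listen
// address, the NAT is remapping per destination and hole punching with an
// advertised address is unlikely to work. Hypothetical helper only.
package nattype

import "net/netip"

// looksSymmetric reports whether the externally observed addresses for a
// single local listener disagree on the port.
func looksSymmetric(observed []netip.AddrPort) bool {
	if len(observed) < 2 {
		return false
	}
	first := observed[0].Port()
	for _, ap := range observed[1:] {
		if ap.Port() != first {
			return true
		}
	}
	return false
}
```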

@vyzo
Contributor

vyzo commented Nov 30, 2022

My initial measurements had suggested that some conns do go through in the second retry.

A more conservative approach is to do 2 retries, and then try switching the hello to punch through cone-symmetric scenarios, with 1 retry.

@dennis-tra

@vyzo This is the data that we are referring to: https://www.notion.so/pl-strflt/NAT-Hole-punching-Success-Rate-2022-09-29-Data-Analysis-8e72705ca3cc49ab983bc5e8792e3e98#c76c6d5e25844bff8c7508b67f236827

This suggests that if we were not successful with the first attempt, there's only a ~3% chance that it'll work with a subsequent attempt.

@vyzo
Contributor

vyzo commented Nov 30, 2022

3% is not negligible, please be more conservative in your assessments!

@dennis-tra

dennis-tra commented Nov 30, 2022

3% is indeed not negligible. My assumption is that the proposal we are discussing here wouldn't have a negative effect on the 3% for whom it worked on the second attempt, but would just increase the chances for the ones that weren't lucky with any subsequent attempt. Curious what the others think.

@marten-seemann
Contributor

However, one thing to consider: if B is behind a symmetric NAT, I'd assume that B won't be able to determine its OwnObservedAddrs, because the identify protocol would report inconsistent address/port combinations. This would (at least in the current implementation) prevent a hole punch.

I think that’s correct. We also have some logic to determine the NAT type; maybe there’s some way to make use of that information?

  2. change the strategy for subsequent attempts.

We need to be careful how we do this in a backwards-compatible way. Legacy nodes will still want to punch multiple times without switching roles. Maybe that’s fine, but maybe we can find some clever way around that.

@sukunrt
Member

sukunrt commented Oct 18, 2023

@mxinden:

Say that B is behind a symmetric NAT and A isn't. A sends a client hello to B which is dropped at B's NAT given that A does not know B's NATed port. B's random bytes make it through A's NAT through the hole punched by A's client hello. A discards B's random bytes. End result is no connection.

Say that B is still behind a symmetric NAT and A isn't, BUT B sends a client hello to A and A sends random bytes to B. B's client hello will make it through A's NAT through the hole punched by A's random bytes. A's random bytes are dropped at B's NAT, given that A does not know B's NATed port. The latter doesn't matter: A receives B's client hello and responds with a server hello. End result is an established connection.

This doesn't work. For A to hole punch through its firewall, it needs to send a packet to B's symmetric-NATed address, and that is not what happens: A sends a packet to a port on B which is not the port that B actually sends packets out of.

Consider the case:

B tells A its port is Y, but the port it will actually send packets out of is X.
A tells B its port is P, and it will send packets out of P.

A sends: P -> Y
Now A's firewall will allow incoming packets from Y but not from X, so when B sends
B: X -> P, the packet will be dropped by A's firewall.

The packet B: X -> P can only get through if A's firewall allows all incoming packets from B's IP address irrespective of the port. This firewall behaviour is probably in the minority, because AutoNAT heavily relies on this behaviour and the metrics on bootstrappers show them reporting a lot of nodes as private.

@sukunrt
Member

sukunrt commented Feb 24, 2024

I think there is a way to make this work in case the firewall on the non-symmetric (nice) side is permissive.

Let's assume the previous case:
B (symmetric NAT) tells A its port is Y, but the port it will actually send packets out of is X.
A (non-symmetric NAT) tells B its port is P, and it will send packets out of P.

A sends: P -> Y
If A's firewall is somehow permissive,
B: X -> P will be allowed through by A's firewall.

If B sends a non-QUIC packet, A can read this packet and learn B's actual outgoing port X. Now A can dial B at port X.

Having said that, I don't know what "If A's firewall is permissive somehow" would actually mean in practice; I think A behind such a firewall would just be a public node.
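
For illustration, the port-learning step described above could look roughly like this: A reads one incoming packet on its UDP socket and treats the packet's source address as B's real NATed endpoint (port X), which it can then dial. This is only a sketch under the "permissive firewall" assumption discussed above; the helper name, port, and timeout are made up.

```go
// Sketch of the port-learning step: read one incoming (non-QUIC) packet and
// take its source address as the peer's actual translated endpoint.
// Illustrative only; not an actual libp2p API.
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

func learnPeerEndpoint(localPort int) (*net.UDPAddr, error) {
	conn, err := net.ListenUDP("udp", &net.UDPAddr{Port: localPort})
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	// Wait a bounded amount of time for B's packet to arrive.
	if err := conn.SetReadDeadline(time.Now().Add(5 * time.Second)); err != nil {
		return nil, err
	}

	buf := make([]byte, 1500)
	_, from, err := conn.ReadFromUDP(buf)
	if err != nil {
		return nil, err
	}
	// `from` carries B's actual translated source port (X in the example
	// above), not the port B advertised (Y).
	return from, nil
}

func main() {
	peer, err := learnPeerEndpoint(4001)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("dial the QUIC handshake at", peer)
}
```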

dhuseby moved this to Triage in libp2p Specs on May 7, 2024