go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux #12869

Merged
merged 1 commit into main from jwhited/gVisor-offloads
Jul 31, 2024

Conversation

@jwhited commented Jul 19, 2024

This commit implements TCP GSO for packets being read from gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported GSO logic from its tun
package.

A new gVisor stack.LinkEndpoint implementation has been established
(linkEndpoint) that is loosely modeled after its predecessor
(channel.Endpoint). This new implementation supports GSO of monster TCP
segments up to 64K in size, whereas channel.Endpoint only supports up to
32K. linkEndpoint will also be required for GRO, which will be
implemented in a follow-on commit.
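
For illustration only, here is the essential shape of GSO as a minimal, self-contained sketch. This is not the actual tstun or gVisor code; the type, the field names, and the 40-byte header figure are invented for the example.

package main

import "fmt"

// gsoSegment is a hypothetical stand-in for the metadata a GSO-capable
// TUN path carries alongside a coalesced TCP segment. It is modeled
// loosely on virtio_net_hdr, not on the actual wireguard-go or gVisor
// types.
type gsoSegment struct {
	hdrLen  int    // bytes of L3+L4 headers prepended to the payload
	gsoSize int    // MSS: maximum payload bytes per wire segment
	payload []byte // coalesced TCP payload, up to 64K
}

// split counts the wire-sized segments one oversized segment fans out
// into. A real implementation would also clone the headers and fix up
// lengths, sequence numbers, and checksums for each segment.
func (g gsoSegment) split() int {
	n := 0
	for off := 0; off < len(g.payload); off += g.gsoSize {
		n++ // emit g.hdrLen header bytes + g.payload[off:off+g.gsoSize]
	}
	return n
}

func main() {
	g := gsoSegment{hdrLen: 40, gsoSize: 1448, payload: make([]byte, 64<<10)}
	fmt.Printf("1 coalesced segment -> %d wire segments\n", g.split())
}

The win is amortization: every layer that handles the coalesced segment does its per-packet work once instead of roughly 45 times.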

TCP throughput from gVisor, i.e. TUN read direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement through a wide range of RTT and loss conditions, sometimes
as high as 5x.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly 13µs of round-trip
latency between them.

The first result is from commit 57856fc without TCP GSO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.51 GBytes  2.15 Gbits/sec  154 sender
[  5]   0.00-10.00  sec  2.49 GBytes  2.14 Gbits/sec      receiver

The second result is from this commit with TCP GSO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  12.6 GBytes  10.8 Gbits/sec    6 sender
[  5]   0.00-10.00  sec  12.6 GBytes  10.8 Gbits/sec      receiver

Updates #6816

@jwhited force-pushed the jwhited/gVisor-offloads branch 4 times, most recently from 1aeaeee to 08921d5, on July 24, 2024 02:48
@jwhited changed the title from "net/tstun,wgengine/netstack: implement gVisor GSO" to "go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux" on Jul 24, 2024
@jwhited jwhited marked this pull request as ready for review July 24, 2024 02:52
@jwhited jwhited requested review from kevboh, raggi and sailorfrag July 24, 2024 02:52
@jwhited commented Jul 24, 2024

TestNATPing IPv6 integration tests are failing, but all tailscale ping variations and iperf3 over IPv6 seem fine. I'll look closer tomorrow.

@jwhited commented Jul 24, 2024

SNAT incremental checksum update results in an invalid (partial/pseudoheader) checksum when gVisor feeds us a single segment with NeedsCsum=true and a partial TCP checksum. SNAT could move post-GSO, though that is not ideal from a performance perspective.

In theory, incremental update should still work fine against a pseudoheader checksum, since the additive property of the checksum still applies. SNAT changes the source L3 address, which is part of the pseudoheader checksum.

36b6a73 demonstrates the problem via a unit test.
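
For context, here is the arithmetic in question as a minimal, self-contained sketch. The helper names are hypothetical, but the update rule is RFC 1624's HC' = ~(~HC + ~m + m'), which assumes the checksum field holds the inverted (finalized) sum.

package main

import "fmt"

// onesAdd adds two 16-bit words in ones' complement arithmetic (with
// end-around carry), the operation the Internet checksum is built on.
func onesAdd(a, b uint16) uint16 {
	s := uint32(a) + uint32(b)
	return uint16(s) + uint16(s>>16)
}

// incrementalUpdate applies RFC 1624 eqn. 3, HC' = ~(~HC + ~m + m'),
// where hc is a finalized checksum field (the inverted sum), old is the
// original 16-bit word, and new is its replacement.
func incrementalUpdate(hc, old, new uint16) uint16 {
	return ^onesAdd(onesAdd(^hc, ^old), new)
}

func main() {
	// Toy "packet": two words plus a checksum computed over them.
	w1, w2 := uint16(0x1234), uint16(0xabcd)
	hc := ^onesAdd(w1, w2) // finalized checksum

	// Rewrite w1, e.g. one 16-bit word of the source address under SNAT.
	w1new := uint16(0x5678)

	// The incremental update matches a full recompute: the additive
	// property holds for any word covered by the sum, pseudoheader
	// words included.
	fmt.Printf("incremental=%#04x recomputed=%#04x\n",
		incrementalUpdate(hc, w1, w1new), ^onesAdd(w1new, w2))
}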

@jwhited commented Jul 24, 2024

22ee88a simplifies the test. I don't yet understand why the additive property of the checksum doesn't hold the same way here.

@jwhited force-pushed the jwhited/gVisor-offloads branch 2 times, most recently from 70ea9fa to e192d63, on July 24, 2024 22:22
@jwhited commented Jul 24, 2024

When gVisor hands us a partial transport checksum, it must be inverted (ones' complement) prior to the SNAT incremental checksum update. The NATPing integration tests pass after accounting for this, and I'm hands off.
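
A sketch of that fix, reusing onesAdd and incrementalUpdate from the earlier snippet (helper names are hypothetical): gVisor's partial pseudoheader checksum is the raw, non-inverted running sum, while the RFC 1624 rule assumes the inverted form, so the value is inverted into finalized form, updated, and inverted back.

// updatePartialChecksum adjusts a raw (non-inverted) partial
// pseudoheader checksum for one rewritten 16-bit word: invert into
// finalized-checksum form, apply the RFC 1624 update, invert back.
func updatePartialChecksum(partial, old, new uint16) uint16 {
	return ^incrementalUpdate(^partial, old, new)
}

// Equivalently, on the raw sum the update is a plain ones' complement
// subtract-and-add, since ^old acts as -old in this arithmetic.
func updatePartialChecksumDirect(partial, old, new uint16) uint16 {
	return onesAdd(onesAdd(partial, ^old), new)
}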

@jwhited commented Jul 29, 2024

@raggi @sailorfrag this is ready for review; the GRO PR will follow shortly after this one.

@jwhited force-pushed the jwhited/gVisor-offloads branch from b9f812c to f0016cc on July 31, 2024 16:31
@jwhited jwhited merged commit 7bc2dda into main Jul 31, 2024
49 checks passed
@jwhited jwhited deleted the jwhited/gVisor-offloads branch July 31, 2024 16:42
jwhited added a commit that referenced this pull request Aug 15, 2024
…for Linux (#12869)"

This reverts commit 7bc2dda.

Updates tailscale/corp#22348

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Asutorufa pushed a commit to Asutorufa/tailscale that referenced this pull request Aug 23, 2024
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (tailscale#12869)
