Properly calculate compressed message lengths #833

tmthrgd · 2018-11-27T23:28:28Z

This entirely replaces the rather hacky and very confusing dance done in compressionLenSlice, with a new approach that is far more like how packing is performed. It fixes a number of different bugs (some listed at the end).
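For readers unfamiliar with DNS name compression, the length calculation at issue can be sketched roughly as below. This is a minimal illustration and not the code in this PR: `compressedNameLen` is a hypothetical helper. It assumes a fully qualified name with no escaped dots, and it ignores the 14-bit limit on pointer offsets. It searches the compression map and inserts into it in a single pass over the name.

```go
package main

import (
	"fmt"
	"strings"
)

// compressedNameLen is a hypothetical helper (not the library's function).
// It returns the wire length of the fully qualified name s, given the set
// of name suffixes already present in the message, and records the new
// suffixes of s in comp as it goes.
func compressedNameLen(s string, comp map[string]struct{}) int {
	if s == "." {
		return 1 // the root name is a single zero octet and is never compressed
	}
	for off := 0; off < len(s); {
		if _, ok := comp[s[off:]]; ok {
			// The labels before off take off octets on the wire;
			// the rest is replaced by a 2-octet compression pointer.
			return off + 2
		}
		comp[s[off:]] = struct{}{}
		i := strings.IndexByte(s[off:], '.')
		if i < 0 {
			break // not fully qualified; outside this sketch's assumptions
		}
		off += i + 1
	}
	// No suffix matched: one length octet per label plus the final zero octet.
	return len(s) + 1
}

func main() {
	comp := make(map[string]struct{})
	fmt.Println(compressedNameLen("www.example.com.", comp))  // 17
	fmt.Println(compressedNameLen("mail.example.com.", comp)) // 7
	fmt.Println(compressedNameLen("example.com.", comp))      // 2
}
```

The second and third names are shorter on the wire because their `example.com.` suffix was recorded while measuring the first name.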

The new code has two allocations that aren't actually needed but are hard to avoid (MsgLength and MsgLengthMassive should have 0 and 9 allocations respectively). The problem is that compression map[string]struct{} is passed to RR.len, where RR is an interface. This interacts badly with Go's escape analysis (see golang/go#19361 and https://npat-efault.github.io/programming/2016/10/10/escape-analysis-and-interfaces.html), forcing the map to be heap allocated instead of stack allocated. It is possible to force these calls to be devirtualised by using a massive type switch statement, but this worsens performance overall for MsgLengthMassive while eliminating the loss for MsgLength.
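The escape-analysis effect can be illustrated with a small sketch. Nothing here is taken from the PR: the `lener` interface, the `aRecord` type and both functions are hypothetical stand-ins for RR and its implementations. Building it with `go build -gcflags=-m` prints the compiler's escape decisions, which may differ between Go versions.

```go
package main

import "fmt"

// lener is a hypothetical stand-in for an interface with a len-style method.
type lener interface {
	length(comp map[string]struct{}) int
}

// aRecord is a hypothetical record type with a single name.
type aRecord struct{ name string }

func (r *aRecord) length(comp map[string]struct{}) int {
	comp[r.name] = struct{}{}
	return len(r.name) + 1
}

// viaInterface calls length through the interface. The compiler cannot
// see which method body runs, so it has to assume comp may escape.
func viaInterface(rs []lener) int {
	comp := make(map[string]struct{})
	n := 0
	for _, r := range rs {
		n += r.length(comp)
	}
	return n
}

// viaTypeSwitch makes each call static by switching on the concrete type,
// which gives escape analysis a chance to prove that comp does not escape.
// Types not listed are skipped in this sketch; real code would need a case
// for every implementation.
func viaTypeSwitch(rs []lener) int {
	comp := make(map[string]struct{})
	n := 0
	for _, r := range rs {
		switch r := r.(type) {
		case *aRecord:
			n += r.length(comp)
		}
	}
	return n
}

func main() {
	rs := []lener{&aRecord{name: "a."}, &aRecord{name: "bc."}}
	fmt.Println(viaInterface(rs), viaTypeSwitch(rs))
}
```

Both functions return the same total; only how the calls are dispatched differs.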

The performance loss in MsgLengthNoCompression & MsgLengthOnlyQuestion is (I believe) due to the call to domainNameLen which can't be inlined currently. Though this is unfortunate, the overhead is necessary as it fixes a bug where Len could previously be off by one for uncompressed messages. (See the existing TestMsgLength2 that is now fixed). Inlining changes are coming to go1.12 so this can be revisited then and it may be possible to recover this performance loss.
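To see why the root name is an easy place for an off-by-one error, here is a hedged sketch of the uncompressed case. `uncompressedNameLen` is a hypothetical helper, not the library's domainNameLen, and it assumes a fully qualified name without escapes. For such names the wire length is the string length plus one, except for ".", which is a single zero octet.

```go
package main

import "fmt"

// uncompressedNameLen is a hypothetical helper returning the wire length
// of a fully qualified, unescaped domain name without compression.
func uncompressedNameLen(s string) int {
	if s == "" || s == "." {
		// The root is encoded as one zero octet; applying len(s)+1
		// to "." would give 2, which is one too many.
		return 1
	}
	// Each dot corresponds to one length octet on the wire, and the
	// terminating zero octet adds one more.
	return len(s) + 1
}

func main() {
	fmt.Println(uncompressedNameLen("."))            // 1
	fmt.Println(uncompressedNameLen("example.com.")) // 13
}
```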

Some of the benchmark deltas below (like LenLabels and MuxMatch) are just noise, although I don't quite understand how they survived benchstat and ten runs; they could come down to the compiler laying out code differently and to cache interactions. TL;DR: they're a distraction.

When taken along with #820, MsgLengthMassive is now twice as fast, with half the allocated bytes and 92% fewer allocations.

$ benchstat {old,new}.bench
name                            old time/op    new time/op    delta
MsgLength-12                       336ns ± 0%     545ns ± 6%   +62.24%  (p=0.001 n=6+8)
MsgLengthNoCompression-12         9.95ns ± 3%   26.64ns ± 2%  +167.79%  (p=0.000 n=10+10)
MsgLengthPack-12                  2.20µs ±15%    2.30µs ±23%      ~     (p=0.315 n=10+10)
MsgLengthMassive-12               54.0µs ± 5%    31.8µs ± 6%   -41.04%  (p=0.000 n=10+9)
MsgLengthOnlyQuestion-12          5.53ns ± 1%    9.19ns ± 1%   +66.15%  (p=0.000 n=10+10)
PackDomainName-12                  211ns ± 6%     204ns ± 7%      ~     (p=0.064 n=10+10)
UnpackDomainName-12                150ns ±14%     157ns ±14%      ~     (p=0.383 n=10+10)
UnpackDomainNameUnprintable-12     145ns ± 3%     150ns ± 4%      ~     (p=0.089 n=9+10)
Copy-12                            880ns ±20%     742ns ±55%      ~     (p=0.247 n=10+10)
PackA-12                          41.1ns ± 2%    40.9ns ± 1%    -0.56%  (p=0.044 n=10+9)
UnpackA-12                         222ns ±21%     208ns ±41%      ~     (p=0.726 n=10+10)
PackMX-12                         75.1ns ± 1%    73.4ns ± 1%    -2.24%  (p=0.000 n=10+10)
UnpackMX-12                        278ns ± 8%     278ns ±16%      ~     (p=0.913 n=8+10)
PackAAAAA-12                      40.6ns ± 2%    40.2ns ± 4%      ~     (p=0.055 n=10+10)
UnpackAAAA-12                      243ns ±12%     230ns ± 9%      ~     (p=0.129 n=9+7)
PackMsg-12                        1.57µs ± 8%    1.57µs ±14%      ~     (p=0.853 n=10+10)
PackMsgOnlyQuestion-12             234ns ± 1%     237ns ± 2%    +1.14%  (p=0.046 n=9+9)
UnpackMsg-12                      1.38µs ±29%    1.37µs ±19%      ~     (p=0.853 n=10+10)
IdGeneration-12                   15.7ns ± 2%    15.7ns ± 0%      ~     (p=0.666 n=10+7)
Generate-12                        153µs ± 2%     153µs ± 1%      ~     (p=0.739 n=10+10)
SplitLabels-12                    69.6ns ±12%    62.0ns ±29%      ~     (p=0.063 n=10+10)
LenLabels-12                      27.2ns ± 1%    20.1ns ± 1%   -25.96%  (p=0.000 n=10+8)
CompareDomainName-12               166ns ±17%     161ns ±11%      ~     (p=0.343 n=10+10)
IsSubDomain-12                     529ns ± 3%     493ns ±20%      ~     (p=0.327 n=8+10)
UnpackString-12                    137ns ± 6%     131ns ± 9%      ~     (p=0.089 n=9+10)
Dedup-12                          1.94µs ± 8%    1.86µs ±11%      ~     (p=0.128 n=10+10)
NewRR-12                          2.64µs ± 5%    2.62µs ± 5%      ~     (p=0.289 n=10+10)
ReadRR-12                         4.28µs ±20%    4.10µs ±19%      ~     (p=0.684 n=10+10)
ParseZone-12                       123µs ±14%     115µs ±48%      ~     (p=0.971 n=10+10)
ZoneParser-12                     10.4µs ± 1%    10.6µs ± 4%      ~     (p=0.243 n=9+10)
MuxMatch/lowercase-12             69.7ns ± 2%    67.9ns ± 2%    -2.65%  (p=0.000 n=10+10)
MuxMatch/uppercase-12              120ns ± 4%     125ns ± 3%    +4.26%  (p=0.000 n=10+10)
Serve-12                          56.5µs ± 8%    56.4µs ± 4%      ~     (p=0.853 n=10+10)
Serve6-12                         57.3µs ± 5%    57.7µs ± 5%      ~     (p=0.684 n=10+10)
ServeCompress-12                  57.8µs ± 2%    58.5µs ± 3%      ~     (p=0.800 n=2+3)
SprintName-12                      218ns ± 2%     206ns ± 2%    -5.42%  (p=0.000 n=10+10)
SprintTxtOctet-12                  234ns ± 5%     233ns ±12%      ~     (p=0.858 n=9+10)
SprintTxt-12                       267ns ± 8%     277ns ± 7%      ~     (p=0.147 n=10+10)
 
name                            old alloc/op   new alloc/op   delta
MsgLength-12                       32.0B ± 0%    192.0B ± 0%  +500.00%  (p=0.000 n=10+10)
MsgLengthNoCompression-12          0.00B          0.00B           ~     (all equal)
MsgLengthPack-12                    896B ± 0%      896B ± 0%      ~     (all equal)
MsgLengthMassive-12               14.8kB ± 0%    10.9kB ± 0%   -26.59%  (p=0.000 n=10+10)
MsgLengthOnlyQuestion-12           0.00B          0.00B           ~     (all equal)
PackDomainName-12                  64.0B ± 0%     64.0B ± 0%      ~     (all equal)
UnpackDomainName-12                64.0B ± 0%     64.0B ± 0%      ~     (all equal)
UnpackDomainNameUnprintable-12     48.0B ± 0%     48.0B ± 0%      ~     (all equal)
Copy-12                             432B ± 0%      432B ± 0%      ~     (all equal)
PackA-12                           0.00B          0.00B           ~     (all equal)
UnpackA-12                          100B ± 0%      100B ± 0%      ~     (all equal)
PackMX-12                          0.00B          0.00B           ~     (all equal)
UnpackMX-12                         116B ± 0%      116B ± 0%      ~     (all equal)
PackAAAAA-12                       0.00B          0.00B           ~     (all equal)
UnpackAAAA-12                       100B ± 0%      100B ± 0%      ~     (all equal)
PackMsg-12                          576B ± 0%      576B ± 0%      ~     (all equal)
PackMsgOnlyQuestion-12             64.0B ± 0%     64.0B ± 0%      ~     (all equal)
UnpackMsg-12                        592B ± 0%      592B ± 0%      ~     (all equal)
IdGeneration-12                    0.00B          0.00B           ~     (all equal)
Generate-12                       31.9kB ± 0%    31.9kB ± 0%      ~     (p=1.000 n=10+10)
SplitLabels-12                     32.0B ± 0%     32.0B ± 0%      ~     (all equal)
LenLabels-12                       0.00B          0.00B           ~     (all equal)
CompareDomainName-12               64.0B ± 0%     64.0B ± 0%      ~     (all equal)
IsSubDomain-12                      192B ± 0%      192B ± 0%      ~     (all equal)
UnpackString-12                    48.0B ± 0%     48.0B ± 0%      ~     (all equal)
Dedup-12                            624B ± 0%      624B ± 0%      ~     (all equal)
NewRR-12                            784B ± 0%      784B ± 0%      ~     (all equal)
ReadRR-12                         1.79kB ± 0%    1.79kB ± 0%      ~     (all equal)
ParseZone-12                      84.0kB ± 0%    84.0kB ± 0%      ~     (all equal)
ZoneParser-12                     1.57kB ± 0%    1.57kB ± 0%      ~     (all equal)
MuxMatch/lowercase-12              0.00B          0.00B           ~     (all equal)
MuxMatch/uppercase-12              32.0B ± 0%     32.0B ± 0%      ~     (all equal)
Serve-12                          3.36kB ± 0%    3.36kB ± 0%      ~     (all equal)
Serve6-12                         3.20kB ± 0%    3.20kB ± 0%      ~     (all equal)
ServeCompress-12                  3.62kB ± 0%    3.62kB ± 0%      ~     (all equal)
SprintName-12                      48.0B ± 0%     48.0B ± 0%      ~     (all equal)
SprintTxtOctet-12                  80.0B ± 0%     80.0B ± 0%      ~     (all equal)
SprintTxt-12                       80.0B ± 0%     80.0B ± 0%      ~     (all equal)
 
name                            old allocs/op  new allocs/op  delta
MsgLength-12                        1.00 ± 0%      2.00 ± 0%  +100.00%  (p=0.000 n=10+10)
MsgLengthNoCompression-12           0.00           0.00           ~     (all equal)
MsgLengthPack-12                    8.00 ± 0%      8.00 ± 0%      ~     (all equal)
MsgLengthMassive-12                  138 ± 0%        11 ± 0%   -92.03%  (p=0.000 n=10+10)
MsgLengthOnlyQuestion-12            0.00           0.00           ~     (all equal)
PackDomainName-12                   1.00 ± 0%      1.00 ± 0%      ~     (all equal)
UnpackDomainName-12                 1.00 ± 0%      1.00 ± 0%      ~     (all equal)
UnpackDomainNameUnprintable-12      1.00 ± 0%      1.00 ± 0%      ~     (all equal)
Copy-12                             8.00 ± 0%      8.00 ± 0%      ~     (all equal)
PackA-12                            0.00           0.00           ~     (all equal)
UnpackA-12                          3.00 ± 0%      3.00 ± 0%      ~     (all equal)
PackMX-12                           0.00           0.00           ~     (all equal)
UnpackMX-12                         4.00 ± 0%      4.00 ± 0%      ~     (all equal)
PackAAAAA-12                        0.00           0.00           ~     (all equal)
UnpackAAAA-12                       3.00 ± 0%      3.00 ± 0%      ~     (all equal)
PackMsg-12                          7.00 ± 0%      7.00 ± 0%      ~     (all equal)
PackMsgOnlyQuestion-12              1.00 ± 0%      1.00 ± 0%      ~     (all equal)
UnpackMsg-12                        12.0 ± 0%      12.0 ± 0%      ~     (all equal)
IdGeneration-12                     0.00           0.00           ~     (all equal)
Generate-12                        1.55k ± 0%     1.55k ± 0%      ~     (all equal)
SplitLabels-12                      1.00 ± 0%      1.00 ± 0%      ~     (all equal)
LenLabels-12                        0.00           0.00           ~     (all equal)
CompareDomainName-12                2.00 ± 0%      2.00 ± 0%      ~     (all equal)
IsSubDomain-12                      6.00 ± 0%      6.00 ± 0%      ~     (all equal)
UnpackString-12                     2.00 ± 0%      2.00 ± 0%      ~     (all equal)
Dedup-12                            31.0 ± 0%      31.0 ± 0%      ~     (all equal)
NewRR-12                            15.0 ± 0%      15.0 ± 0%      ~     (all equal)
ReadRR-12                           17.0 ± 0%      17.0 ± 0%      ~     (all equal)
ParseZone-12                        92.0 ± 0%      92.0 ± 0%      ~     (all equal)
ZoneParser-12                       81.0 ± 0%      81.0 ± 0%      ~     (all equal)
MuxMatch/lowercase-12               0.00           0.00           ~     (all equal)
MuxMatch/uppercase-12               1.00 ± 0%      1.00 ± 0%      ~     (all equal)
Serve-12                            54.0 ± 0%      54.0 ± 0%      ~     (all equal)
Serve6-12                           51.0 ± 0%      51.0 ± 0%      ~     (all equal)
ServeCompress-12                    56.0 ± 0%      56.0 ± 0%      ~     (all equal)
SprintName-12                       2.00 ± 0%      2.00 ± 0%      ~     (all equal)
SprintTxtOctet-12                   2.00 ± 0%      2.00 ± 0%      ~     (all equal)
SprintTxt-12                        2.00 ± 0%      2.00 ± 0%      ~     (all equal)

Updates #709
Fixes #821
Fixes #824
Closes #826
Fixes #829

/cc @pierresouchay (who changed much of this in #668).

This should allow us to avoid the call overhead of compressionLenMapInsert in certain limited cases and may result in a slight performance increase. compressionLenMapInsert still has a maxCompressionOffset check inside the for loop.


codecov-io · 2018-11-27T23:34:03Z

Codecov Report

Merging #833 into master will increase coverage by 0.33%.
The diff coverage is 46.82%.

@@            Coverage Diff             @@
##           master     #833      +/-   ##
==========================================
+ Coverage    57.6%   57.94%   +0.33%     
==========================================
  Files          43       42       -1     
  Lines       10839    10661     -178     
==========================================
- Hits         6244     6177      -67     
+ Misses       3505     3396     -109     
+ Partials     1090     1088       -2

Impacted Files  Coverage Δ
privaterr.go    67.56% <0%> (-2.86%)      ⬇️
dnssec.go       58.27% <100%> (ø)         ⬆️
msg.go          78.14% <100%> (-0.46%)    ⬇️
sig0.go         65.51% <100%> (ø)         ⬆️
dns.go          62.5% <100%> (ø)          ⬆️
tsig.go         41.5% <100%> (ø)          ⬆️
edns.go         25.09% <100%> (ø)         ⬆️
ztypes.go       45.48% <33.33%> (+0.79%)  ⬆️
types.go        73.63% <76.92%> (+0.07%)  ⬆️
server.go       65.96% <0%> (-0.24%)      ⬇️
... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2c03911...178611e.

pierresouchay · 2018-11-28T00:01:49Z

I had a quick look, it seems reasonable, and removing my crappy hacks is good news ;)
I'll try to bench the performance effects in Consul once merged

The last two commits worsened the performance of domainNameLen noticeably; this change restores its original performance.

name                      old time/op  new time/op  delta
MsgLength-12                550ns ±13%   510ns ±21%    ~     (p=0.050 n=10+10)
MsgLengthNoCompression-12  26.9ns ± 2%  27.0ns ± 1%    ~     (p=0.198 n=9+10)
MsgLengthPack-12           2.30µs ±12%  2.26µs ±16%    ~     (p=0.739 n=10+10)
MsgLengthMassive-12        32.9µs ± 7%  32.0µs ±10%    ~     (p=0.243 n=9+10)
MsgLengthOnlyQuestion-12   9.60ns ± 1%  9.20ns ± 1%  -4.16%  (p=0.000 n=9+9)

miekg · 2018-11-28T08:02:00Z

[ Quoting <notifications@github.com> in "Re: [miekg/dns] Properly calculate ..." ]

I had a quick look, it seems reasonable, and removing my crappy hacks is good news ;) I'll try to bench the performance effects in Consul once merged

Why not before? ;-) /Miek


-- Miek Gieben

miekg · 2018-11-28T10:56:58Z

this looks reasonable; I'll have to do a local checkout to take a closer look though

tmthrgd added 17 commits November 28, 2018 08:04

Remove fullSize return from compressionLenSearch

f5f0afb

This wasn't used anywhere but TestCompressionLenSearch, and was very wrong.

Add generated compressedLen functions and use them

3ebc782

This replaces the confusing and complicated compressionLenSlice function.

Use compressedLenWithCompressionMap even for uncompressed

f156b18

This leaves the len() functions unused and they'll soon be removed. This also fixes the off-by-one error of compressedLen when a (Q)NAME is ".".

Use Len helper instead of RR.len private method

6f4013f

Merge len and compressedLen functions

03c7fdd

Merge compressedLen helper into Msg.Len

f8ef961

Remove compress bool from compressedLenWithCompressionMap

f8315d9

Merge map insertion into compressionLenSearch

ccea1b1

This eliminates the need to loop over the domain name twice when we're compressing the name.

Use compressedNameLen for NSEC.NextDomain

aa37c1f

This was a mistake.

Remove compress from RR.len

9b0c2ab

Add test case for multiple questions length

538f8af

Add test case for MINFO and SOA compression

1d7b38a

These are the only RRs with multiple compressible names within the same RR, and they were previously broken.

Rename compressedNameLen to domainNameLen

73cdf22

It also handles the length of uncompressed domain names.

Use off directly instead of len(s[:off])

8290c1b

Rename compressedLenWithCompressionMap to msgLenWithCompressionMap

63ec51c

This better reflects that it also calculates the uncompressed length.

Merge TestMsgCompressMINFO with TestMsgCompressSOA

8303807

They're both testing the same thing.

tmthrgd requested a review from miekg November 27, 2018 23:28

tmthrgd mentioned this pull request Nov 27, 2018

Add CachedLen() #709

Closed

tmthrgd added 3 commits November 28, 2018 12:39

Remove compressionLenMapInsert

c7ce4cc

compressionLenSearch does everything compressionLenMapInsert did anyway.

Only call compressionLenSearch in one place in domainNameLen

c1eb829

tmthrgd mentioned this pull request Nov 28, 2018

Add NULL record #840

Merged

Remove stray newline from TestMsgCompressionMultipleQuestions

d0d69c5

This was referenced Nov 28, 2018

Escaped names compression length mismatch #841

Closed

Put escaped names into compression map #842

Merged

tmthrgd mentioned this pull request Nov 29, 2018

Question compression #821

Closed

miekg approved these changes Nov 29, 2018

tmthrgd added 2 commits November 30, 2018 09:51

Merge branch 'master' into new-comp-len

dee293f

Remove stray newline in length_test.go

178611e

This was introduced when resolving merge conflicts.

tmthrgd merged commit 778aa4f into miekg:master Nov 29, 2018

tmthrgd deleted the new-comp-len branch November 29, 2018 23:33
