Performance Improvement in tds and ucs2 parsing - V1-Candidate #14

PeteBassettBet365 · 2022-05-04T09:33:22Z

This pull request covers some changes to improve performance in the tdsBuffer struct and ucs22str function.

buf.go
The tdsBuffer uint16 function is used heavily throughout the tds parsing code. The original implementation created a byte array and a slice
into it to be filled by a call to ReadFull. This calls through io.ReadFull back into tdsBuffer io.Reader Read function. This code path incurs
significant overhead when the buffer already has at least 2 bytes available. The new implementation checks the read position before calling
ReadFull. If we have at least two bytes available we read directly, otherwise fall back to the slow path.
uint32 and uint64 have also been changed in the same way for 4 & 8 byte reads.

uint16 - a loop calling uint16 10000 times
BenchmarkReadUint16Implementations/reference-12 4010 304633 ns/op 20045 B/op 10001 allocs/op
BenchmarkReadUint16Implementations/new_version-12 23412 50691 ns/op 32 B/op 1 allocs/op

uint32 - a loop calling uint32 5000 times
BenchmarkReadUint32Implementations/reference-12 6684 154147 ns/op 20045 B/op 5001 allocs/op
BenchmarkReadUint32Implementations/new_version-12 48321 25327 ns/op 32 B/op 1 allocs/op

uint64 - a loop calling uint64 2500 times
BenchmarkReadUint64Implementations/reference-12 3795 87688 ns/op 20045 B/op 2501 allocs/op
BenchmarkReadUint64Implementations/new_version-12 85941 14582 ns/op 32 B/op 1 allocs/op

tds.go - performance improvement on ucs22str function. It is responsible for reading 16bit ucs2 characters and producing a string.
Most of the strings we deal with are ascii so the full unicode handling path can be bypassed when they are found. We use unsafe to read the incoming bytes
in 8 byte chunks, as uint64s, masking the value to find invalid ascii. If found we fall back to a faster version of the original
implementation to deal with the unicode. Otherwise we have 8 bytes and 4 can be safely added to the output buffer. If we reach the end of the input
successfully the buffer can be converted directly to a string. This reduces allocations on the ascii path and is a significant speed improvement, up to 16x in our tests.
The faster unicode path avoids copying the byte data to a newly allocated uint16 array. It again uses unsafe, to create a uint16 slice to the original byte data.
This is safe because it is passed to directly utf16.Decode, which itself creates a copy as a []rune. This saves us an allocation, copying and uint16 conversion.

Reference Implementation
BenchmarkUcs22strReferenceAscii-12 12274105 98.6 ns/op 32 B/op 3 allocs/op
BenchmarkUcs22strReferenceMediumAscii-12 6721837 178 ns/op 96 B/op 3 allocs/op
BenchmarkUcs22strReferenceLongAscii-12 1538629 778 ns/op 448 B/op 3 allocs/op
BenchmarkUcs22strReferenceLongerAscii-12 169466 7115 ns/op 4608 B/op 3 allocs/op
BenchmarkUcs22strReferenceTrailingUnicode-12 4423173 274 ns/op 160 B/op 3 allocs/op
BenchmarkUcs22strReferenceLongEmojis-12 3166297 380 ns/op 304 B/op 3 allocs/op

New Implementation
BenchmarkUcs22strAscii-12 54690384 21.9 ns/op 3 B/op 1 allocs/op
BenchmarkUcs22strMediumAscii-12 35389251 33.4 ns/op 16 B/op 1 allocs/op
BenchmarkUcs22strLongAscii-12 17188800 70.1 ns/op 64 B/op 1 allocs/op
BenchmarkUcs22strLongerAscii-12 2864568 427 ns/op 640 B/op 1 allocs/op
BenchmarkUcs22strTrailingUnicode-12 4221280 259 ns/op 144 B/op 3 allocs/op
BenchmarkUcs22strLongEmojis-12 3549286 341 ns/op 272 B/op 3 allocs/op

buf.go
A tdsBuffer is created for each sql connection and internally holds two []byte buffers, one for reading and one for writing. They are allocated directly with
make. Each time a connection is closed, they are garbage collected. We added a sync.Pool managed pool of 64k []byte slices. These are created as required and
placed back in the pool when the tdsBuffer is closed. Each pooled slice is used for the read and write buffers by reslicing half for each.
Connections are generally long lived so this is a minor improvement to garbage collection overhead at the expense of slightly increased memory usage. Happy to discuss this further if you deem it not required.

All benchmarks done on an 8core i7, 16gb

ghost · 2022-05-04T09:33:33Z

All CLA requirements met.

stuartpa · 2022-06-15T13:43:45Z

tds.go

+	// step through in 8 byte chunks.
+	for readIndex = 0; readIndex < nlen8; readIndex += 8 {
+
+		// defererence directly into the array as uint64s


defererence

nit: typo #Resolved

fixed that now

shueybubbles

Please address the lint issues. Looks good to me!
thx

PeteBassettBet365 · 2022-06-17T15:41:54Z

Hi, I will fix the linting issues on Monday.

codecov-commenter · 2022-06-17T15:53:50Z

Codecov Report

❗ No coverage uploaded for pull request base (main@2c7e9d5). Click here to learn what that means.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main      #14   +/-   ##
=======================================
  Coverage        ?   72.03%           
=======================================
  Files           ?       23           
  Lines           ?     5593           
  Branches        ?        0           
=======================================
  Hits            ?     4029           
  Misses          ?     1318           
  Partials        ?      246

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2c7e9d5...1ebb25d. Read the comment docs.

buf.go

shueybubbles

Looks good to me. When @stuartpa signs off I'll merge.

stuartpa

PeteBassettBet365 added 3 commits May 4, 2022 10:19

Perf improfement tds and decoding ucs2 strings

7ff0717

Reduce pooled buffer size

daa0733

fix unit test expected value message

a5b9dbc

PeteBassettBet365 changed the title ~~Performance Improvement in tds and ucs2 parsing~~ Performance Improvement in tds and ucs2 parsing - V1-Candidate May 4, 2022

shueybubbles requested a review from stuartpa May 11, 2022 17:57

shueybubbles added enhancement New feature or request v1-candidate Nominate PR or Issue for V1 project labels May 11, 2022

shueybubbles requested review from gambtho and imiller31 May 11, 2022 18:10

shueybubbles requested a review from kenvanhyning May 26, 2022 14:57

stuartpa reviewed Jun 15, 2022

View reviewed changes

shueybubbles requested changes Jun 15, 2022

View reviewed changes

fix typo in comment

1ebb25d

fix linting issues

957cced

shueybubbles reviewed Jun 20, 2022

View reviewed changes

buf.go Show resolved Hide resolved

shueybubbles approved these changes Jun 20, 2022

View reviewed changes

stuartpa approved these changes Jun 21, 2022

View reviewed changes

shueybubbles merged commit f386d5f into microsoft:main Jun 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance Improvement in tds and ucs2 parsing - V1-Candidate #14

Performance Improvement in tds and ucs2 parsing - V1-Candidate #14

PeteBassettBet365 commented May 4, 2022

ghost commented May 4, 2022 •

edited by ghost

Loading

stuartpa Jun 15, 2022 •

edited

Loading

PeteBassettBet365 Jun 17, 2022

shueybubbles left a comment

PeteBassettBet365 commented Jun 17, 2022 •

edited

Loading

codecov-commenter commented Jun 17, 2022 •

edited

Loading

shueybubbles left a comment

stuartpa left a comment

Performance Improvement in tds and ucs2 parsing - V1-Candidate #14

Performance Improvement in tds and ucs2 parsing - V1-Candidate #14

Conversation

PeteBassettBet365 commented May 4, 2022

ghost commented May 4, 2022 • edited by ghost Loading

stuartpa Jun 15, 2022 • edited Loading

Choose a reason for hiding this comment

PeteBassettBet365 Jun 17, 2022

Choose a reason for hiding this comment

shueybubbles left a comment

Choose a reason for hiding this comment

PeteBassettBet365 commented Jun 17, 2022 • edited Loading

codecov-commenter commented Jun 17, 2022 • edited Loading

Codecov Report

shueybubbles left a comment

Choose a reason for hiding this comment

stuartpa left a comment

Choose a reason for hiding this comment

ghost commented May 4, 2022 •

edited by ghost

Loading

stuartpa Jun 15, 2022 •

edited

Loading

PeteBassettBet365 commented Jun 17, 2022 •

edited

Loading

codecov-commenter commented Jun 17, 2022 •

edited

Loading