pgtype/hstore: Make text parsing about 6X faster
I am working on an application that uses hstore types, and we found that returning the values is slow, particularly when using the text protocol, such as when using database/sql. This change makes text parsing about 6X faster (it is currently faster than binary).

The changes are:

* referencing the original string instead of copying into new strings (very large win)
* using strings.IndexByte to scan double-quoted strings: it has architecture-specific assembly implementations, and most of the time is spent in key/value strings.
* estimating the number of key/value pairs to allocate the correct size of the slice and map up front. This reduces the number of allocations and bytes allocated by a factor of 2, and was a small CPU win.
* parsing directly into the Hstore, rather than copying into it.

This parser is stricter than the old one. It only accepts hstore strings serialized by Postgres. The old one was already stricter than Postgres's own parser, but it accepted any whitespace character after a comma. This one only accepts a space. Example of a string that is no longer accepted:

"k1"=>"v1",\t"k2"=>"v2"

Postgres only ever uses ", " as the separator. See hstore_out: https://github.com/postgres/postgres/blob/master/contrib/hstore/hstore_io.c

The result of using benchstat to compare the benchmark on my M1 Pro, collected with the following command line, is below. The new text parser is now faster than the binary parser. I will improve the binary parser in a separate change.
for i in $(seq 10); do go test ./pgtype -run=none -bench=BenchmarkHstoreScan -benchtime=1s >> new.txt; done

goos: darwin
goarch: arm64
pkg: github.com/jackc/pgx/v5/pgtype
                               │  orig.txt   │               new.txt               │
                               │   sec/op    │   sec/op     vs base                │
HstoreScan/databasesql.Scan-10   82.11µ ± 1%   10.51µ ± 0%  -87.20% (p=0.000 n=10)
HstoreScan/text-10               83.30µ ± 1%   11.49µ ± 1%  -86.20% (p=0.000 n=10)
HstoreScan/binary-10             15.99µ ± 2%   15.77µ ± 1%   -1.35% (p=0.007 n=10)
geomean                          47.82µ        12.40µ       -74.08%

                               │   orig.txt   │               new.txt                │
                               │     B/op     │     B/op      vs base                │
HstoreScan/databasesql.Scan-10   56.23Ki ± 0%   11.68Ki ± 0%  -79.23% (p=0.000 n=10)
HstoreScan/text-10               65.12Ki ± 0%   20.58Ki ± 0%  -68.40% (p=0.000 n=10)
HstoreScan/binary-10             21.09Ki ± 0%   21.09Ki ± 0%        ~ (p=0.378 n=10)
geomean                          42.58Ki        17.18Ki       -59.66%

                               │  orig.txt   │               new.txt                │
                               │  allocs/op  │  allocs/op   vs base                 │
HstoreScan/databasesql.Scan-10   744.00 ± 0%   44.00 ± 0%  -94.09% (p=0.000 n=10)
HstoreScan/text-10               743.00 ± 0%   44.00 ± 0%  -94.08% (p=0.000 n=10)
HstoreScan/binary-10              464.0 ± 0%   464.0 ± 0%        ~ (p=1.000 n=10) ¹
geomean                           635.4        96.49       -84.81%
¹ all samples are equal
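
For readers who want to see the techniques from the commit message in isolation, here is a minimal sketch. It is not the code from this commit: the names scanQuoted and parsePairs are hypothetical, NULL values are not handled, the result is a plain map[string]string rather than the Hstore type, and counting "=>" is only one assumed way to estimate the number of pairs (it can overestimate when "=>" appears inside a value, which only affects the capacity hint). It shows strings.IndexByte locating the closing quote, returning substrings of the input when no escapes are present, and accepting only ", " as the separator.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// scanQuoted reads one double-quoted string from the start of s and
// returns its contents plus the remaining input after the closing quote.
func scanQuoted(s string) (value, rest string, err error) {
	if len(s) == 0 || s[0] != '"' {
		return "", "", errors.New("expected opening double quote")
	}
	s = s[1:]
	end := strings.IndexByte(s, '"')
	if end < 0 {
		return "", "", errors.New("unterminated quoted string")
	}
	// Fast path: no backslash before the first quote, so that quote is
	// the terminator and the value is a substring of s (no copy).
	if strings.IndexByte(s[:end], '\\') < 0 {
		return s[:end], s[end+1:], nil
	}
	// Slow path: unescape byte by byte into a new string.
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case '\\':
			i++
			if i == len(s) {
				return "", "", errors.New("dangling backslash")
			}
			b.WriteByte(s[i])
		case '"':
			return b.String(), s[i+1:], nil
		default:
			b.WriteByte(s[i])
		}
	}
	return "", "", errors.New("unterminated quoted string")
}

// parsePairs parses `"k"=>"v", "k2"=>"v2"` into a map that is sized up
// front from an estimate of the number of pairs.
func parsePairs(s string) (map[string]string, error) {
	m := make(map[string]string, strings.Count(s, "=>"))
	if s == "" {
		return m, nil
	}
	for {
		key, rest, err := scanQuoted(s)
		if err != nil {
			return nil, err
		}
		if !strings.HasPrefix(rest, "=>") {
			return nil, errors.New(`expected "=>" after key`)
		}
		val, rest, err := scanQuoted(rest[2:])
		if err != nil {
			return nil, err
		}
		m[key] = val
		if rest == "" {
			return m, nil
		}
		// Only the separator Postgres emits is accepted.
		if !strings.HasPrefix(rest, ", ") {
			return nil, errors.New(`expected ", " between pairs`)
		}
		s = rest[2:]
	}
}

func main() {
	m, err := parsePairs(`"k1"=>"v1", "k2"=>"a\"b"`)
	fmt.Println(m, err)

	// A tab after the comma is rejected by this stricter sketch.
	_, err = parsePairs("\"k1\"=>\"v1\",\t\"k2\"=>\"v2\"")
	fmt.Println(err)
}
```

Because the fast path returns substrings, the parsed keys and values keep the original input string alive for as long as they are referenced; that is the trade-off for avoiding the per-string copies.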