hash: switch to simpler "fast algorithm" (#173)
Bob Jenkins's Lookup3 hash function[1], added in commit 9ff3373
("add odict"), might be faster than Bob Jenkins's One-At-A-Time hash
function[2] used previously, but its default implementation can trigger
crashes because it reads data past the end of the buffer being
processed. One has to define VALGRIND when building to prevent the
function from reading past a buffer whose size is not a multiple of
4 bytes.

Instead of fixing the lookup3 implementation, or simply defaulting to
the "VALGRIND" implementation, the 32-bit FNV1-a[3] hash was chosen to
replace the lookup3 hash, because
- it's simpler,
- it's smaller,
- it's fast enough for short keys,
- it's good enough collision-wise.

This conclusion is supported by running the Perl test suite built with
various hash algorithms: FNV1-a is faster than One-At-A-Time[4] when
used to implement a hash table, and it could be faster than other
algorithms that seem to be faster than lookup3[4][5].

Faster hash algorithms seem to rely on hardware acceleration (SIMD,
AES, etc.); unfortunately, that makes them more complex and, more
importantly, less portable. Exceptions exist, such as xxHash[6], but it
is a large amount of code, and the gain hardly justifies the cost of
importing it. SipHash[7] has gained many users over the years, but
would be slower than FNV1-a for small strings.

So FNV1-a should be enough for re library usage.

One-At-A-Time is kept as is because the public API relies on HTTP and
SIP header identifiers being equal to their hashed string value.

[1] https://www.burtleburtle.net/bob/hash/#lookup
    https://www.burtleburtle.net/bob/c/lookup3.c
[2] https://www.burtleburtle.net/bob/hash/doobs.html#one
[3] http://www.isthe.com/chongo/tech/comp/fnv/
[4] https://github.com/rurban/perl-hash-stats
[5] https://github.com/rurban/smhasher
[6] https://github.com/Cyan4973/xxHash#small-data
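For reference, a minimal sketch of 32-bit FNV1-a as it is commonly described. The function name `fnv1a_32` is hypothetical and this is not necessarily the code added by this commit. It processes one byte per iteration, so it never reads beyond `buf[len - 1]`, which is the property the default lookup3 implementation lacks.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative 32-bit FNV1-a (hypothetical name, not the re API).
 * The constants are the standard 32-bit FNV offset basis and prime.
 */
static uint32_t fnv1a_32(const uint8_t *buf, size_t len)
{
	uint32_t h = 2166136261u;       /* FNV offset basis (0x811c9dc5) */
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= buf[i];            /* FNV1-a: xor the byte first ... */
		h *= 16777619u;         /* ... then multiply by the prime */
	}

	return h;
}
```

A caller building a hash table would typically reduce the result to a bucket index, for example with `fnv1a_32(key, key_len) % bucket_count`, where `bucket_count` is again an assumed name.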
Showing 1 changed file with 18 additions and 206 deletions.