Speed up `DefaultHasher`, `SipHasher`, and `SipHasher13`. #69152
This should have negligible effect on rustc's own performance, because rustc uses @bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 66a83ae9f40856e3e4a21c3e3ea2605fca7dbd44 with merge 22f6b02ddd18fb74ee6551aea9a7dde4607f3cd8... |
Rust's Sip hashing is notorious for being slow, hence the existence of |
☀️ Try build successful - checks-azure |
Queued 22f6b02ddd18fb74ee6551aea9a7dde4607f3cd8 with parent 1010408, future comparison URL. |
We have four benchmarks with significant regressions.
They are all I did a diff with Cachegrind on |
I tried a bunch of things such as de-inlining, removing |
@nnethercote Thanks for the PR! I can't really r+ things in the standard library. Would you mind splitting out the |
From the libs perspective, as long as |
The official test vectors would be great. |
I have split out the We have a decision to make about what to do with the rest of the PR. I see three options.
With respect to the final point, I did a local run and got better results than the CI run:
The worst regressions are bigger than the best improvements, but there are quite a few more improvements than regressions. I looked at Another interesting case is I'm leaning towards the third option (full PR) but I'm happy to hear other opinions. cc @rust-lang/wg-compiler-performance |
@nnethercote As this is trading off worse compile-time performance for better run-time performance, it would be nice to have some benchmark of what the actual run-time performance improvement would look like. |
That will depend heavily on the code in question and how much it uses |
I did some perf comparisons with rustc.
rust1 is massively slower than rust0. On rust2 is significantly faster than rust1. On |
This commit changes `sip::Hasher` to use the faster `short_write` approach that was used for `SipHasher128` in rust-lang#68914. This has no effect because `sip::Hasher::short_write` is currently unused. See the next commit for more details, and a fix. (One difference with rust-lang#68914 is that this commit doesn't apply the `u8to64_le` change from that PR, because I found it is slower, because it introduces `memcpy` calls with non-statically-known lengths. Therefore, this commit also undoes the `u8to64_le` change in `SipHasher128` for this reason. This doesn't affect `SipHasher128` much because it doesn't use `u8to64_le` much, but I am making the change to keep the two implementations consistent.)
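The `u8to64_le` trade-off described in this commit message can be illustrated with a minimal sketch. The two function names below are hypothetical and this is not the libcore code: it uses safe slicing instead of unchecked loads, and it only contrasts the two loading styles. Whether a compiler actually emits a `memcpy` call for the second variant depends on the optimizer and the target.

```rust
use std::convert::TryInto;

/// Loads up to 7 bytes as a little-endian integer using loads of
/// statically known width (4, then 2, then 1 byte).
/// Hypothetical helper, written for illustration only.
pub fn load_tail_fixed(buf: &[u8]) -> u64 {
    assert!(buf.len() < 8);
    let len = buf.len();
    let mut i = 0;
    let mut out = 0u64;
    if i + 3 < len {
        // At least 4 bytes remain: one 32-bit load.
        let word = u32::from_le_bytes(buf[i..i + 4].try_into().unwrap());
        out = word as u64;
        i += 4;
    }
    if i + 1 < len {
        // At least 2 bytes remain: one 16-bit load.
        let half = u16::from_le_bytes(buf[i..i + 2].try_into().unwrap());
        out |= (half as u64) << (i * 8);
        i += 2;
    }
    if i < len {
        // One byte remains.
        out |= (buf[i] as u64) << (i * 8);
        i += 1;
    }
    debug_assert_eq!(i, len);
    out
}

/// Same result, but via a copy whose length is only known at run time.
/// Hypothetical helper, written for illustration only.
pub fn load_tail_copy(buf: &[u8]) -> u64 {
    assert!(buf.len() < 8);
    let mut tmp = [0u8; 8];
    tmp[..buf.len()].copy_from_slice(buf);
    u64::from_le_bytes(tmp)
}
```

Both functions return the same value for every input shorter than 8 bytes; they differ only in whether the copy length is known at compile time.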
…, `sip::Hasher`. `sip::Hasher::short_write` is currently unused. It is called by `sip::Hasher::write_{u8,usize}`, but those methods are also unused, because `DefaultHasher`, `SipHasher` and `SipHasher13` don't implement any of the `write_xyz` methods, so all their write operations end up calling `sip::Hasher::write`. I confirmed this by inserting a `panic!` in `sip::Hasher::short_write` and running the tests -- they all passed. This commit adds all the `write_xyz` methods to `DefaultHasher`, `SipHasher`, `SipHasher13` and `sip::Hasher`, which means that `short_write` will be used for all of them. (I haven't properly implemented 128-bit writes in `sip::Hasher` and `SipHasher128`. That could be a follow-up if someone thinks it is important.) Some microbenchmarks I wrote show that this commit speeds up default hashing of integers by about 2.5x, and can speed up hash tables that use `DefaultHasher` and have keys that contain integers by up to 30%, though the exact amount depends heavily on the details of the keys and how the table is used. This makes default `Hash{Map,Set}` much more likely to be competitive with `FxHash{Map,Set}`.
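The reason a hasher's fast path goes unused can be shown with a small sketch. It relies on the fact that the `std::hash::Hasher` trait provides default `write_*` methods that forward to `write`, so an integer-specific path is only reached if the type overrides them. The two hasher types and the helper functions below are hypothetical and exist only to count calls; they are not the standard library's Sip hashers.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical hasher that implements only `finish` and `write`,
/// so every `write_*` call uses the trait's default forwarding.
#[derive(Default)]
pub struct WriteOnly {
    pub generic_calls: usize,
}

impl Hasher for WriteOnly {
    fn finish(&self) -> u64 {
        0
    }
    fn write(&mut self, _bytes: &[u8]) {
        self.generic_calls += 1;
    }
}

/// Hypothetical hasher that also overrides `write_u64`.
#[derive(Default)]
pub struct WithFastPath {
    pub generic_calls: usize,
    pub fast_calls: usize,
}

impl Hasher for WithFastPath {
    fn finish(&self) -> u64 {
        0
    }
    fn write(&mut self, _bytes: &[u8]) {
        self.generic_calls += 1;
    }
    fn write_u64(&mut self, _value: u64) {
        self.fast_calls += 1;
    }
}

/// Hashes one `u64` and reports how often the generic `write` ran.
pub fn calls_without_override(value: u64) -> usize {
    let mut h = WriteOnly::default();
    value.hash(&mut h);
    h.generic_calls
}

/// Hashes one `u64` and reports (generic `write` calls, `write_u64` calls).
pub fn calls_with_override(value: u64) -> (usize, usize) {
    let mut h = WithFastPath::default();
    value.hash(&mut h);
    (h.generic_calls, h.fast_calls)
}
```

Hashing a `u64` with `WriteOnly` lands in the byte-slice `write`, while `WithFastPath` receives the integer directly. This is the same situation the commit message describes for `short_write`.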
66a83ae to 96c4e42
I rebased. Let's measure again, see if the perf results vary. @bors try @rust-timer queue |
⌛ Trying commit 96c4e42 with merge 3fd2ff0fe14703eb6d09f9596264ddebb2b4b86d... |
☀️ Try build successful - checks-azure |
Awaiting bors try build completion |
⌛ Trying commit 96c4e42 with merge d4aac45b9c1492286b3fd245a27eb35cbd66a89f... |
☀️ Try build successful - checks-azure |
Queued d4aac45b9c1492286b3fd245a27eb35cbd66a89f with parent 79cd224, future comparison URL. |
The latest CI perf run is slightly worse than the first one, with small regressions (~1%) on more of the benchmarks. |
`sip::Hasher::short_write` is currently unused. It is called by `sip::Hasher::write_{u8,usize}`, but those methods are also unused, because `DefaultHasher`, `SipHasher` and `SipHasher13` don't implement any of the `write_xyz` methods, so all their write operations end up calling `sip::Hasher::write`. (I confirmed this by inserting a `panic!` in `sip::Hasher::short_write` and running the tests -- they all passed.) The alternative would be to add all the missing `write_xyz` methods. This does give some significant speed-ups, but it hurts compile times a little in some cases. See rust-lang#69152 for details. This commit does the conservative thing and doesn't change existing behaviour.
I have created #69471 for this option. I have been having trouble deciding between option 1 and option 3, so I ended up settling on option 1 because it preserves existing behaviour. (I did write a comment explaining the runtime perf win that has been given up.) |
Remove `sip::Hasher::short_write`. `sip::Hasher::short_write` is currently unused. It is called by `sip::Hasher::write_{u8,usize}`, but those methods are also unused, because `DefaultHasher`, `SipHasher` and `SipHasher13` don't implement any of the `write_xyz` methods, so all their write operations end up calling `sip::Hasher::write`. (I confirmed this by inserting a `panic!` in `sip::Hasher::short_write` and running the tests -- they all passed.) The alternative would be to add all the missing `write_xyz` methods. This does give some significant speed-ups, but it hurts compile times a little in some cases. See #69152 for details. This commit does the conservative thing and doesn't change existing behaviour. r? @rust-lang/libs
Marking this as blocked on #69471 |
I think we can just close this in favour of #69471. |
…te, r=dtolnay Remove `sip::Hasher::short_write`. `sip::Hasher::short_write` is currently unused. It is called by `sip::Hasher::write_{u8,usize}`, but those methods are also unused, because `DefaultHasher`, `SipHasher` and `SipHasher13` don't implement any of the `write_xyz` methods, so all their write operations end up calling `sip::Hasher::write`. (I confirmed this by inserting a `panic!` in `sip::Hasher::short_write` and running the tests -- they all passed.) The alternative would be to add all the missing `write_xyz` methods. This does give some significant speed-ups, but it hurts compile times a little in some cases. See rust-lang#69152 for details. This commit does the conservative thing and doesn't change existing behaviour. r? @rust-lang/libs
…rr, r=<try> Optimize DefaultHasher siphash let's see how re-applying rust-lang#69152 again goes. imo this is a huge speedup that would be worth some compile time regressions, but i wanna see first. probably won't have the time and energy to argue for it though, if there are significant regressions ^^' cc `@nnethercote`
This PR applies the speedups to `SipHasher128` from #68914 to the Sip hashers in `libcore`, and also adds the missing `write_*` methods required so that they can benefit from the speedups. Default hashing of integers is now something like 2.5x faster, and default hash tables should be more competitive with hash tables from the `fxhash` crate.

r? @michaelwoerister
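A rough way to measure default integer hashing is sketched below. This is not the microbenchmark used for this PR, which is not shown in the thread. The function names are hypothetical, and a single `Instant` measurement like this is noisy, so any timing it reports is only a coarse indication.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant};

/// Hashes a single `u64` with a freshly created `DefaultHasher`.
pub fn hash_one_u64(value: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hashes the integers `0..n` and returns a checksum plus the elapsed time.
/// The checksum keeps the loop from being optimized away entirely.
pub fn time_integer_hashing(n: u64) -> (u64, Duration) {
    let start = Instant::now();
    let mut checksum = 0u64;
    for i in 0..n {
        checksum = checksum.wrapping_add(hash_one_u64(i));
    }
    (checksum, start.elapsed())
}
```

Running `time_integer_hashing` with a large `n` on toolchains built with and without the change would give a first-order comparison. A dedicated benchmark harness would give more trustworthy numbers.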