-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Smarter HashMap/HashSet pre-allocation for extend/from_iter #38017
Conversation
r? @aturon (rust_highfive has picked a reviewer for you, use r? to override) |
let iter = iter.into_iter(); | ||
let hint = iter.size_hint().0; | ||
self.reserve((hint / 2) + (hint & 1)); | ||
for (k, v) in iter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not compile :P
Might be simpler to just call down to self.map.extend(iter.into_iter().map(|k| (k, ()))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally butchered the set part... will amend in a bit.
e16ca98
to
a3e58ca
Compare
I wonder if it should try to take into consideration the upper bound. |
// will only resize twice in the worst case. | ||
let iter = iter.into_iter(); | ||
let hint = iter.size_hint().0; | ||
self.reserve((hint / 2) + (hint & 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually to round up you add (denom-1) before the division, as in (hint + 1) / 2
.
(Note that the current code is correct, it is just a little unusual and it would require nontrivial changes if the divisor was changed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes much more sense, will fix.
What I wanted to do here was to try this on our favourite hashmap workload (rustc) and see what the effect is. Haven't had time to do it yet. |
That's a great idea, thanks! |
It's noisy, results are a bit all over the place (So they are about the same).
For example, we can retry piston-image and syntex:
Why is syntex so quick, what happened with that? I also looked at memory usage in time-passes for piston-image and it is the same before and after. |
Thanks @bluss! If we want to be extra sure not to regress we can revert the from_iter part, or, reserve the entire hint on extend if the hashmap is empty, since it's essentially the same runtime cost (except memory of course). |
According to the result, something is regressing in the super small crates (hello world, issue-32062-equ, and syntex) and the rest are as good as unchanged. |
I changed it to reserve everything if the map is empty, so if the regression was due to slower from_iter it should now be fixed. |
@bors r+ Let's go with this; extend will reserve the size hint's lower bound by half. |
📌 Commit 2c5d240 has been approved by |
@bors r- Oops, I guess T-libs means that the libs team want to discuss it first? |
@bors: r=bluss oh I actually just meant that as categorization, this is fine to go through without libs discussion :) |
📌 Commit 2c5d240 has been approved by |
great, thanks for the info |
⌛ Testing commit 2c5d240 with merge 5f128ed... |
Smarter HashMap/HashSet pre-allocation for extend/from_iter HashMap/HashSet from_iter and extend are making totally different assumptions. A more balanced decision may allocate half the lower hint (rounding up). For "well defined" iterators this effectively limits the worst case to two resizes (the initial reserve + one resize). cc #36579 cc @bluss
HashMap/HashSet from_iter and extend are making totally different assumptions.
A more balanced decision may allocate half the lower hint (rounding up). For "well defined" iterators this effectively limits the worst case to two resizes (the initial reserve + one resize).
cc #36579
cc @bluss