Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doc change: Remove mention of fnv in HashMap #77996

Merged
merged 3 commits into from
Nov 13, 2020
Merged

Doc change: Remove mention of fnv in HashMap #77996

merged 3 commits into from
Nov 13, 2020

Conversation

tkaitchuck
Copy link
Contributor

@tkaitchuck tkaitchuck commented Oct 15, 2020

Disclaimer: I am the author of aHash.

This changes the Rustdoc in HashMap from mentioning the fnv crate to mentioning the aHash crate, as an alternative Hasher implementation.

Why

Fnv has poor hash quality, is slow for larger keys, and does not provide dos resistance, because it is unkeyed (this can also cause other problems).

Fnv has acceptable performance for integers and has very poor performance with keys >32 bytes. This is the reason it was removed from the standard library in #37229 .

Because regardless of which dimension you value, there are better alternatives, it does not make sense for anyone to consider using fnv.

The text mentioning fnv in the standard library continues to create confusion: rust-lang/hashbrown#153 rust-lang/hashbrown#9 . There are also a number of crates using it a great many of which are hashing strings (Which is when Fnv is the worst, possible, choice.)

I think aHash makes the most sense to mention as an alternative because it is the most credible option (in my obviously biased opinion). It offers good performance on numbers and strings, is of high quality, and provides dos resistance. It is popular (see stats) and is the default hasher for hashbrown and dashmap which are the most popular alternative hashmaps. Finally it does not have any of the gotcha cases that FxHash suffers from. (Which is the other popular hashing option when DOS attacks are not a concern)

Signed-off-by: Tom Kaitchuck tom.kaitchuck@emc.com

…ve hasher.

Signed-off-by: Tom Kaitchuck <tom.kaitchuck@emc.com>
@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @LukasKalbertodt (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 15, 2020
@jonas-schievink
Copy link
Contributor

I'd rather not use libstd to promote hash functions that haven't seen much review, especially since aHash makes claims about DOS-resistance.

@Mark-Simulacrum
Copy link
Member

cc @rust-lang/libs

Historically we've tried to not mention external crates in std docs; I think it may make sense to drop fnv here. One of the options might be to point at some summary post, though I worry that it will be an endless treadmill of updating it. I guess one option is something like https://en.wikipedia.org/wiki/List_of_hash_functions.

@sfackler
Copy link
Member

I would also prefer to link to either hash functions in general or a widely used and understood one.

@jyn514 jyn514 added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Oct 16, 2020
@jyn514
Copy link
Member

jyn514 commented Oct 16, 2020

@tkaitchuck FYI I tried changing rustc_data_structures to ahash just now and it gave

error[E0277]: the trait bound `K: Hash` is not satisfied
  --> compiler/rustc_data_structures/src/sso/map.rs:89:29
   |
89 |             SsoHashMap::Map(FxHashMap::with_capacity_and_hasher(cap, Default::default()))
   |                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `Hash` is not implemented for `K`
   |
   = note: required by `AHashMap::<K, V, S>::with_capacity_and_hasher`

Consider relaxing that bound, I think you shouldn't need it for constructing the hashmap, only for operations involving the elements (get/set/entry).

@jyn514
Copy link
Member

jyn514 commented Oct 16, 2020

Same for various other functions, new(), capacity(), len(), etc.

@tkaitchuck
Copy link
Contributor Author

tkaitchuck commented Oct 16, 2020

@jyn514

@tkaitchuck FYI I tried changing rustc_data_structures to ahash just now and it gave

The error message there indicates the code is using an FxHashmap which is by definition using FxHash. If you want to switch that code, you will need to update it. (If there are suggestions please open an issue on the repo rather than adding more here.)

@tkaitchuck
Copy link
Contributor Author

@Mark-Simulacrum

I guess one option is something like https://en.wikipedia.org/wiki/List_of_hash_functions.

Or perhaps an external list of Rust crates that implement the interface?

Copy link
Member

@LukasKalbertodt LukasKalbertodt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the previous comments: I'd rather not mention aHash in these docs. However, removing the mention of fnv makes sense to me. I'm fine with either of those:

  • Just removing the mention of fnv.
  • Linking Wikipedia.
  • Linking to the "hash" keyword on crates.io (this includes lots of hashing libraries that cannot be used with the std hashmap; so maybe that would confuse readers.)

@tkaitchuck tkaitchuck changed the title Doc change: Change mention of fnv in HashMap as an alternative hasher to aHash Doc change: Remove mention of fnv in HashMap Oct 28, 2020
Signed-off-by: Tom Kaitchuck <Tom.Kaitchuck@gmail.com>
@tkaitchuck
Copy link
Contributor Author

@LukasKalbertodt I have linked to the "hasher" keyword on crates.io. This avoids the problem of "hash" which contains lots of crates that don't implement the interface. Currently there are only 6 crates which have this keyword all but 1 provides an implementation of the interface. Of course many (most) of the popular alternatives do not have this keyword tagged to them, but that is easy to fix on the individual crates.

@LukasKalbertodt
Copy link
Member

The hasher keyword indeed seems to be closer to what we want. However, until the popular crates add this keyword too, this is also not optimal.

@jonas-schievink @Mark-Simulacrum @sfackler What do you think about linking to this keyword?

@Mark-Simulacrum
Copy link
Member

Mark-Simulacrum commented Oct 28, 2020

hasher keyword seems to contain hashers, but it's... very short (6 crates) and neither ahash, fnv, or any of the other "immediate" candidates I'd reach for are in there, despite being available in general. I think the hash keyword is closer to the right one right now, but maybe the Wikipedia link is preferable.

Edit: I also think that we can link to "hasher" but we should update crates first, and it creates a sort of "first mover" problem in some sense (e.g., if ahash ends up being first and only when this is merged). But maybe that's not too big a concern.

tkaitchuck added a commit to tkaitchuck/rustc-hash that referenced this pull request Nov 7, 2020
tkaitchuck added a commit to tkaitchuck/highway-rs that referenced this pull request Nov 7, 2020
tkaitchuck added a commit to tkaitchuck/wyhash-rs that referenced this pull request Nov 7, 2020
tkaitchuck added a commit to tkaitchuck/twox-hash that referenced this pull request Nov 7, 2020
@tkaitchuck
Copy link
Contributor Author

@Mark-Simulacrum I have opened several PRs.

eldruin pushed a commit to eldruin/wyhash-rs that referenced this pull request Nov 7, 2020
@m-ou-se m-ou-se assigned m-ou-se and unassigned LukasKalbertodt Nov 12, 2020
Co-authored-by: Mara Bos <m-ou.se@m-ou.se>
@m-ou-se
Copy link
Member

m-ou-se commented Nov 13, 2020

Thanks!

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Nov 13, 2020

📌 Commit 4e58483 has been approved by m-ou-se

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 13, 2020
bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 13, 2020
…laumeGomez

Rollup of 6 pull requests

Successful merges:

 - rust-lang#77151 (Add regression test for issue rust-lang#76042)
 - rust-lang#77996 (Doc change: Remove mention of `fnv` in HashMap)
 - rust-lang#78463 (Add type to `ConstKind::Placeholder`)
 - rust-lang#78984 (Rustdoc check option)
 - rust-lang#78985 (add dropck test for const params)
 - rust-lang#78996 (add explicit test for const param promotion)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit ef32ef7 into rust-lang:master Nov 13, 2020
@rustbot rustbot added this to the 1.50.0 milestone Nov 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants