Updating std::hash&lt;EntityId_t&gt; to get a better unordered containers distribution [6895] #868
Conversation
Wouldn't this still lead to an uneven distribution? The hash output is UINT32, but we are using only the 3 least significant octets, leaving a very small subset of values in use.
Some minor style comments only
In fact, we are using the 3 MOST significant octets, thus removing the fourth one, which is always the entity kind and will almost always have the same value. Right now, the only place this hash function is used is in the unordered_map structures that keep the lists of readers and writers of the same participant. Since the participant creates entities by incrementing a counter that is assigned to the first 3 octets of the entityId, this PR ensures a uniform distribution in those structures.
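The idea can be sketched as a `std::hash` specialization that builds the hash from the counter octets only and discards the kind octet. Note this is a minimal illustration with a hypothetical 4-octet `EntityId_t`, not the actual Fast RTPS type definition:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical minimal EntityId_t: the first 3 octets hold a counter,
// the 4th holds the entity kind (layout assumed for illustration).
struct EntityId_t
{
    uint8_t value[4];
};

namespace std {
template <>
struct hash<EntityId_t>
{
    std::size_t operator()(const EntityId_t& id) const
    {
        // Combine the 3 most significant octets (the counter) and drop
        // the kind octet, which is nearly constant across entities.
        return (static_cast<std::size_t>(id.value[0]) << 16) |
               (static_cast<std::size_t>(id.value[1]) << 8)  |
                static_cast<std::size_t>(id.value[2]);
    }
};
} // namespace std
```

With this sketch, two entity ids that differ only in the kind octet hash to the same value, while ids with different counters hash differently.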
It is our responsibility to select the most suitable number of buckets for the hash tables. The improvements to static allocations using foonathan-allocated unordered maps have not yet been introduced in 1.9.x, so it is not clear there, but when an expected number of endpoints is provided, the hash table constructors in use rehash the table to a proper number of buckets.
Internally, STL implementations of hash tables (unordered_map and unordered_set) use a modulus operation to choose which bucket each element is put into. Using a uniformly distributed hash guarantees that the buckets are evenly filled and access time is kept to a minimum.
Until now, we were using the whole EntityId_t, interpreted as a number, as the hash. That led to a non-uniform, unsuitable distribution, because each EntityId_t carries flags, not merely a counter. Now only the counter value is used, solving the speed issue.