Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize InternableString.GetHashCode #6816

Merged
merged 7 commits into from
Sep 14, 2021
Merged

Conversation

ladipro
Copy link
Member

@ladipro ladipro commented Sep 3, 2021

Context

The current straightforward implementation is slow, especially on 64-bit Framework CLR. It is causing a measurable regression when evaluating large projects.

Changes Made

Rewrote the routine using a similar approach as this BCL method. Calculating only 2 and not 4 characters at a time, though, to reduce the complexity since our implementation works on a list of spans and not just one string. The additional perf benefit of going 4 at a time would be relatively small.

Testing

  • A new unit test to verify correctness.
  • Micro-benchmark showing 1.6x boost on x86 and 2x boost on x64.
  • Evaluation perf traces showing significant reduction in the time spent in GetHashCode, saving 110 ms (~6%) per evaluation of the Unreal Engine C++ project on 64-bit.

@@ -304,28 +304,59 @@ public override string ToString()
/// <returns>A stable hashcode of the string represented by this instance.</returns>
public override unsafe int GetHashCode()
{
int hashCode = 5381;
uint hash = (5381 << 16) + 5381;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slightly off-topic: Wouldn't caching this string's hashcode result in a net improvement?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetHashCode runs only once for a given string unless the caller calls SpanBasedStringBuilder.ToString() multiple times on the same instance without mutating it between the calls. Same as calling StringBuilder.ToString() multiple times, it is technically possible but enough of an anti-pattern that the implementation does not cache the result.

More on running GetHashCode only once: When a string is added to the weak cache, its hash code is used as a key in a dictionary. So it's not calculated on each look-up because the look-up is done based on the hash code and not the string itself. When we're looking for a string in the weak cache, we calculate its hash code once per lookup and, as argued above, there should not be more than one look-up for the same string.

Copy link
Member

@rainersigwald rainersigwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me but since hashing is scary just wanted to check on the algo and constants and things.

I also note in the linked impl that it's "when collisions aren't a problem", which . . . seems right for us but have you thought about that in depth?

src/StringTools/InternableString.cs Show resolved Hide resolved
@ladipro
Copy link
Member Author

ladipro commented Sep 7, 2021

Looks reasonable to me but since hashing is scary just wanted to check on the algo and constants and things.

Indeed, I have a major bug there because << is a bit shift and I need bit rotation 🤦‍♂️

The plan:

  • Fix the code to use BitOperations.RotateLeft.
  • Re-run the benchmark.
  • Fix the unit tests which explicitly uses colliding strings.
  • Check for collisions on a representative set of strings as a sanity check.

I also note in the linked impl that it's "when collisions aren't a problem", which . . . seems right for us but have you thought about that in depth?

This was discussed when the current implementation was being developed and we convinced ourselves that since you can ask MSBuild to wipe the drive or do anything the current user can do, we don't really worry about DoS'ing it with crafted strings. In other words, strings processed by MSBuild are not considered user input that we should be protecting against because the whole build process has to be trusted by design.

@Forgind Forgind added the merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now. label Sep 13, 2021
@ladipro
Copy link
Member Author

ladipro commented Sep 13, 2021

The perf win on my VM is now lower - was 4.5x, now only 2x. I see the same thing without the new commits so it's likely because of the particular physical CPU where my machine is hosted is different. I have updated the description with the new numbers.

@AR-May AR-May merged commit 4ceb3f8 into dotnet:main Sep 14, 2021
/// <param name="hashedOddNumberOfCharacters">True if the incoming <paramref name="hash"/> was calculated from an odd number of characters.</param>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static unsafe void GetHashCodeHelper(char* charPtr, int length, ref uint hash, ref bool hashedOddNumberOfCharacters)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid using ref variables in tight loops. JIT cant optimize it into registry.
I recommend to change signature to private static unsafe uint GetHashCodeHelper(char* charPtr, int length, uint hash, ref bool hashedOddNumberOfCharacters) and call it hash = GetHashCodeHelper(charPtr, span.Length, hash, ref hashedOddNumberOfCharacters);

In my micro benchmark, this simple change makes it about 2x faster.

Copy link
Member Author

@ladipro ladipro Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this must be the reason why it got slower after the last update. Confirming your results, it really is more than 2x faster after eliminating the ref parameter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it also be faster using out parameters? Having an input and out parameter that happen to match. Or maybe returning a tuple?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Forgind Using non ref local variable as running hash in tight loop and than copy it to out parameter would most probably render about same benefit. However, returning integer value from procedure is something highly optimized by calling conventions. In particular it returns value in registry eax. This is significantly faster than exchanging return values by copying it into stack memory which both value tuple and out variables does. By significant I mean about 1 us slower for modern CPUs, so in practical world it rarely matters.

src/StringTools/InternableString.cs Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants