Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String Dictionary with StringComparer.Ordinal slower than default EqualityComparer #29714

Closed
fdbva opened this issue May 30, 2019 · 8 comments
Closed
Labels
area-System.Collections backlog-cleanup-candidate An inactive issue that has been marked for automated closure.
Milestone

Comments

@fdbva
Copy link

fdbva commented May 30, 2019

Hello.

As I understand it, a Dictionary with it's key as string and using StringComparer.Ordinal should be faster than with a default EqualityComparer.
I did some tests with BenchmarkDotNet and it seemed true for .NET Framework but not for .NET CORE.

Maybe I'm doing something wrong?
Bellow are the results, the code for the tests and the csproj file.

BenchmarkDotNet=v0.11.5, OS=Windows 10.0.17134.799 (1803/April2018Update/Redstone4)
Intel Core i7-8550U CPU 1.80GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
Frequency=1945316 Hz, Resolution=514.0553 ns, Timer=TSC
.NET Core SDK=3.0.100-preview5-011568
  [Host]     : .NET Core 2.2.2 (CoreCLR 4.6.27317.07, CoreFX 4.6.27318.02), 64bit RyuJIT
  Job-FLLDVS : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.8.3801.0
  Job-YILBVB : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.8.3801.0
  Job-PNTBHM : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.8.3801.0
  Job-CZKISL : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.8.3801.0
  Job-LUFVLI : .NET Core 2.2.2 (CoreCLR 4.6.27317.07, CoreFX 4.6.27318.02), 64bit RyuJIT
  Job-HEHRAX : .NET Core 3.0.0-preview5-27626-15 (CoreCLR 4.6.27622.75, CoreFX 4.700.19.22408), 64bit RyuJIT
  Job-GDGOTQ : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.8.3801.0
  Job-DDPHUA : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.8.3801.0

|                  Method |       Jit | Platform |     Toolchain |     Mean |     Error |    StdDev |
|------------------------ |---------- |--------- |-------------- |---------:|----------:|----------:|
| DefaultEqualityComparer | LegacyJit |      X64 |        net461 | 21.75 ns | 0.4868 ns | 0.5606 ns |
| OrdinalEqualityComparer | LegacyJit |      X64 |        net461 | 19.08 ns | 0.4659 ns | 0.4985 ns |
| DefaultEqualityComparer | LegacyJit |      X64 |        net472 | 21.50 ns | 0.4958 ns | 0.6089 ns |
| OrdinalEqualityComparer | LegacyJit |      X64 |        net472 | 19.34 ns | 0.4518 ns | 0.6031 ns |
| DefaultEqualityComparer | LegacyJit |      X86 |        net461 | 26.03 ns | 0.5878 ns | 0.8240 ns |
| OrdinalEqualityComparer | LegacyJit |      X86 |        net461 | 23.79 ns | 0.5274 ns | 0.5416 ns |
| DefaultEqualityComparer | LegacyJit |      X86 |        net472 | 25.89 ns | 0.3940 ns | 0.3492 ns |
| OrdinalEqualityComparer | LegacyJit |      X86 |        net472 | 23.22 ns | 0.5124 ns | 0.4278 ns |
| DefaultEqualityComparer |    RyuJit |      X64 | .NET Core 2.2 | 16.27 ns | 0.4007 ns | 0.3552 ns |
| OrdinalEqualityComparer |    RyuJit |      X64 | .NET Core 2.2 | 21.80 ns | 0.4991 ns | 0.5126 ns |
| DefaultEqualityComparer |    RyuJit |      X64 | .NET Core 3.0 | 13.67 ns | 0.3466 ns | 0.4256 ns |
| OrdinalEqualityComparer |    RyuJit |      X64 | .NET Core 3.0 | 18.12 ns | 0.4199 ns | 0.3928 ns |
| DefaultEqualityComparer |    RyuJit |      X64 |        net461 | 21.44 ns | 0.5019 ns | 0.5370 ns |
| OrdinalEqualityComparer |    RyuJit |      X64 |        net461 | 19.32 ns | 0.4559 ns | 0.7863 ns |
| DefaultEqualityComparer |    RyuJit |      X64 |        net472 | 21.61 ns | 0.4884 ns | 0.5226 ns |
| OrdinalEqualityComparer |    RyuJit |      X64 |        net472 | 19.11 ns | 0.4370 ns | 0.5834 ns |
| DefaultEqualityComparer |    RyuJit |      X86 | .NET Core 2.2 |       NA |        NA |        NA |
| OrdinalEqualityComparer |    RyuJit |      X86 | .NET Core 2.2 |       NA |        NA |        NA |
| DefaultEqualityComparer |    RyuJit |      X86 | .NET Core 3.0 |       NA |        NA |        NA |
| OrdinalEqualityComparer |    RyuJit |      X86 | .NET Core 3.0 |       NA |        NA |        NA |
using System;
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Environments;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
using BenchmarkDotNet.Toolchains.CsProj;

namespace DictionaryOrdinalVsDefaultComparer
{
    public class Program
    {
        [Config(typeof(MultipleRuntimes))]
        public class OrdinalVsDefaultComparer
        {
            private readonly Dictionary<string, string> _defaultDictionary =
                new Dictionary<string, string> { { "should be", "slower" } };
            private readonly Dictionary<string, string> _ordinalDictionary =
                new Dictionary<string, string>(StringComparer.Ordinal) { { "should be", "faster" } };

            [Benchmark]
            public string DefaultEqualityComparer()
            {
                _defaultDictionary.TryGetValue("should be", out var defaultComparerValue);
                return defaultComparerValue;
            }

            [Benchmark]
            public string OrdinalEqualityComparer()
            {
                _ordinalDictionary.TryGetValue("should be", out var ordinalComparerValue);
                return ordinalComparerValue;
            }
        }
        public static void Main(string[] args)
        {
            var summary = BenchmarkRunner.Run<OrdinalVsDefaultComparer>();
        }
        public class MultipleRuntimes : ManualConfig
        {
            public MultipleRuntimes()
            {
                Add(Job.Default
                    .With(CsProjCoreToolchain.NetCoreApp22)
                    .With(Jit.RyuJit)
                    .With(Platform.X64));
                Add(Job.Default
                    .With(CsProjCoreToolchain.NetCoreApp22)
                    .With(Jit.RyuJit)
                    .With(Platform.X86));
                Add(Job.Default
                    .With(CsProjCoreToolchain.NetCoreApp30)
                    .With(Jit.RyuJit)
                    .With(Platform.X64));
                Add(Job.Default
                    .With(CsProjCoreToolchain.NetCoreApp30)
                    .With(Jit.RyuJit)
                    .With(Platform.X86));

                Add(Job.Default
                    .With(CsProjCoreToolchain.NetCoreApp22)
                    .With(Jit.LegacyJit)
                    .With(Platform.X64));
                Add(Job.Default
                    .With(CsProjCoreToolchain.NetCoreApp22)
                    .With(Jit.LegacyJit)
                    .With(Platform.X86));
                Add(Job.Default
                    .With(CsProjCoreToolchain.NetCoreApp30)
                    .With(Jit.LegacyJit)
                    .With(Platform.X64));
                Add(Job.Default
                    .With(CsProjCoreToolchain.NetCoreApp30)
                    .With(Jit.LegacyJit)
                    .With(Platform.X86));

                Add(Job.Default
                    .With(CsProjClassicNetToolchain.Net461)
                    .With(Jit.LegacyJit)
                    .With(Platform.X64));
                Add(Job.Default
                    .With(CsProjClassicNetToolchain.Net461)
                    .With(Jit.RyuJit)
                    .With(Platform.X64));
                Add(Job.Default
                    .With(CsProjClassicNetToolchain.Net461)
                    .With(Jit.LegacyJit)
                    .With(Platform.X86));
                Add(Job.Default
                    .With(CsProjClassicNetToolchain.Net472)
                    .With(Jit.LegacyJit)
                    .With(Platform.X64));
                Add(Job.Default
                    .With(CsProjClassicNetToolchain.Net472)
                    .With(Jit.RyuJit)
                    .With(Platform.X64));
                Add(Job.Default
                    .With(CsProjClassicNetToolchain.Net472)
                    .With(Jit.LegacyJit)
                    .With(Platform.X86));
            }
        }
    }
}

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFrameworks>netcoreapp2.2;netcoreapp3.0;net461;net472;</TargetFrameworks>
    <PlatformTarget>AnyCPU</PlatformTarget>
    <RuntimeIdentifiers>win10-x64;win10-x86</RuntimeIdentifiers>
    <LangVersion>latest</LangVersion>
  </PropertyGroup>

  <PropertyGroup Condition="'$(Configuration)|$(TargetFramework)|$(Platform)'=='Release|netcoreapp2.2|AnyCPU'">
    <DefineConstants />
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="BenchmarkDotNet" Version="0.11.5" />
  </ItemGroup>

</Project>

@AndyAyersMS
Copy link
Member

There are a number of optimizations to the dictionary and the handling of the default comparers by the jit -- for example, dotnet/coreclr#14125, dotnet/coreclr#15419). So not too surprising that this is now the fastest method.

I have not looked at the ordinal comparer perf; perhaps it would be worth investigating.

@stephentoub
Copy link
Member

As I understand it, a Dictionary with it's key as string and using StringComparer.Ordinal should be faster than with a default EqualityComparer.

Why would you expect it to be faster?

@fdbva
Copy link
Author

fdbva commented May 30, 2019

As I understand it, a Dictionary with it's key as string and using StringComparer.Ordinal should be faster than with a default EqualityComparer.

Why would you expect it to be faster?

I think mostly because of Microsoft documentation about strings and their article about best practices docs best practices
Also, benchmark data enforces that, but I understand it may not be a valuable base for .NET Core.

But the Dictionary<string, object> lookup in the system here is heavily stressed, and most of its usages we pass to it's constructor the StringComparer.Ordinal.
I always use ordinal comparison when I can, considering performance. But this was my first time running BenchmarkDotNet, so I was a bit surprised. I could swear seeing better results after analysing a running application timings, but I might've done something wrong.

@stephentoub
Copy link
Member

EqualityComparer<string>.Default might have a different concrete type than StringComparer.Ordinal, but it has the same semantics in that the default equality comparer for strings is ordinal case-sensitive, just like StringComparer.Ordinal. In .NET Core, for safety reasons, both use randomized hashing, which helps avoid collision attacks. But when a Dictionary<> is constructed without a comparer or with EqualityComparer<string>.Default, it can play tricks to avoid that randomization until it detects a problem, it can take fast paths that enable better inlining and devirtualization, etc.

@svick
Copy link
Contributor

svick commented May 31, 2019

@stephentoub What if Dictionary<T> used the same tricks for StringComparer.Ordinal?

The Comparer property has to behave differently for the two comparers in question, but I think that could be done by turning NonRandomizedStringEqualityComparer from singleton to "doubleton".

This would add one reference equality check to the constructor and one field access to Comparer, but those costs sound acceptable to me. (And some other costs that should be negligible.)

Or would this have issues with inlining or devirtualization that I'm missing?

The code diff would roughly look like this:

@@ -84,6 +84,10 @@ public Dictionary(int capacity, IEqualityComparer<TKey>? comparer)
                 // To start, move off default comparer for string which is randomised
                 _comparer = (IEqualityComparer<TKey>)NonRandomizedStringEqualityComparer.Default;
             }
+            else if (typeof(TKey) == typeof(string) && _comparer == StringComparer.Ordinal)
+            {
+                _comparer = (IEqualityComparer<TKey>)NonRandomizedStringEqualityComparer.Ordinal;
+            }
         }

         public Dictionary(IDictionary<TKey, TValue> dictionary) : this(dictionary, null) { }
@@ -149,7 +153,11 @@ public IEqualityComparer<TKey> Comparer
         {
             get
             {
-                return (_comparer == null || _comparer is NonRandomizedStringEqualityComparer) ? EqualityComparer<TKey>.Default : _comparer;
+                return _comparer switch {
+                    null => EqualityComparer<TKey>.Default,
+                    NonRandomizedStringEqualityComparer nr => (IEqualityComparer<TKey>)nr.BackingComparer,
+                    _ => _comparer
+                };
             }
         }

@@ -662,11 +670,11 @@ private bool TryInsert(TKey key, TValue value, InsertionBehavior behavior)
             _version++;

             // Value types never rehash
-            if (default(TKey)! == null && collisionCount > HashHelpers.HashCollisionThreshold && comparer is NonRandomizedStringEqualityComparer) // TODO-NULLABLE: default(T) == null warning (https://github.com/dotnet/roslyn/issues/34757)
+            if (default(TKey)! == null && collisionCount > HashHelpers.HashCollisionThreshold && comparer is NonRandomizedStringEqualityComparer nr) // TODO-NULLABLE: default(T) == null warning (https://github.com/dotnet/roslyn/issues/34757)
             {
                 // If we hit the collision threshold we'll need to switch to the comparer which is using randomized string hashing
                 // i.e. EqualityComparer<string>.Default.
-                _comparer = null;
+                _comparer = (IEqualityComparer<TKey>)(nr.BackingComparer == EqualityComparer<string>.Default ? null : nr.BackingComparer);
                 Resize(entries.Length, true);
             }

@stephentoub
Copy link
Member

What if Dictionary used the same tricks for StringComparer.Ordinal?

If it can be done without breaking anything, without measurable overheads, and with measurable improvements, sure.

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the Future milestone Feb 1, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@layomia layomia removed the untriaged New issue has not been triaged by the area owner label Jun 24, 2020
@ghost
Copy link

ghost commented Oct 26, 2021

Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process.

This process is part of the experimental issue cleanup initiative we are currently trialing in a limited number of areas. Please share any feedback you might have in the linked issue.

@ghost
Copy link

ghost commented Nov 9, 2021

This issue will now be closed since it had been marked no recent activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the issue, but please note that the issue will be locked if it remains inactive for another 30 days.

@ghost ghost closed this as completed Nov 9, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2021
@eiriktsarpalis eiriktsarpalis added the backlog-cleanup-candidate An inactive issue that has been marked for automated closure. label Feb 18, 2022
@ghost ghost removed the no-recent-activity label Feb 18, 2022
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Collections backlog-cleanup-candidate An inactive issue that has been marked for automated closure.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants