Suboptimal code generation for == operator for nullable primitive types #41108

Open
zlatanov opened this issue Jan 21, 2020 · 1 comment

zlatanov commented Jan 21, 2020

Version Used: 16.4.2

If we write the following code:

static bool M(int? x, int? y) => x == y;

The compiler generates IL that introduces two locals.

.maxstack 3
.locals init (
	[0] valuetype [System.Private.CoreLib]System.Nullable`1<int32>,
	[1] valuetype [System.Private.CoreLib]System.Nullable`1<int32>
)

IL_0000: ldarg.0
IL_0001: stloc.0
IL_0002: ldarg.1
IL_0003: stloc.1
IL_0004: ldloca.s 0
IL_0006: call instance !0 valuetype [System.Private.CoreLib]System.Nullable`1<int32>::GetValueOrDefault()
IL_000b: ldloca.s 1
IL_000d: call instance !0 valuetype [System.Private.CoreLib]System.Nullable`1<int32>::GetValueOrDefault()
IL_0012: ceq
IL_0014: ldloca.s 0
IL_0016: call instance bool valuetype [System.Private.CoreLib]System.Nullable`1<int32>::get_HasValue()
IL_001b: ldloca.s 1
IL_001d: call instance bool valuetype [System.Private.CoreLib]System.Nullable`1<int32>::get_HasValue()
IL_0022: ceq
IL_0024: and
IL_0025: ret

If we write the code as:

static bool M(int? x, int? y) => x.HasValue == y.HasValue
    && x.GetValueOrDefault() == y.GetValueOrDefault();

.maxstack 8

IL_0000: ldarga.s x
IL_0002: call instance bool valuetype [System.Private.CoreLib]System.Nullable`1<int32>::get_HasValue()
IL_0007: ldarga.s y
IL_0009: call instance bool valuetype [System.Private.CoreLib]System.Nullable`1<int32>::get_HasValue()
IL_000e: bne.un.s IL_0021

IL_0010: ldarga.s x
IL_0012: call instance !0 valuetype [System.Private.CoreLib]System.Nullable`1<int32>::GetValueOrDefault()
IL_0017: ldarga.s y
IL_0019: call instance !0 valuetype [System.Private.CoreLib]System.Nullable`1<int32>::GetValueOrDefault()
IL_001e: ceq
IL_0020: ret

IL_0021: ldc.i4.0
IL_0022: ret

then the compiler generates code that has no locals and three fewer IL instructions (13 instead of 16). The difference in the JITed code is bigger: the first version is 18 instructions and the second is 13.
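As a sanity check that the two forms compared above really are interchangeable, here is a minimal sketch that runs both over a handful of values, including null. The names `NullableEqualityCheck`, `Default`, `Custom` and `AllAgree` are made up for this illustration (the method names only mirror the labels in the benchmark table), and the sample set is arbitrary.

```csharp
using System;

static class NullableEqualityCheck
{
    // Lifted == operator, as written in the first snippet of the issue.
    public static bool Default(int? x, int? y) => x == y;

    // Hand-written form from the second snippet of the issue.
    public static bool Custom(int? x, int? y) =>
        x.HasValue == y.HasValue
        && x.GetValueOrDefault() == y.GetValueOrDefault();

    // Compares both forms over every pair of sample values.
    public static bool AllAgree()
    {
        // null vs 0 is the interesting pair: GetValueOrDefault() returns 0
        // for both, so only the HasValue comparison tells them apart.
        int?[] samples = { null, 0, 1, 2, -1, int.MinValue, int.MaxValue };

        foreach (var x in samples)
            foreach (var y in samples)
                if (Default(x, y) != Custom(x, y))
                    return false;

        return true;
    }

    static void Main()
    {
        Console.WriteLine(AllAgree() ? "forms agree" : "forms differ");
    }
}
```

This only checks behaviour. It says nothing about the IL or the JITed code, which is what the issue is about.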

Here is a repro in sharplab.

Even microbenchmarking the code shows that the second version is better:

BenchmarkDotNet=v0.12.0, OS=Windows 10.0.18363
Intel Core i7-4790K CPU 4.00GHz (Haswell), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.100
  [Host]     : .NET Core 3.1.0 (CoreCLR 4.700.19.56402, CoreFX 4.700.19.56404), X64 RyuJIT
  DefaultJob : .NET Core 3.1.0 (CoreCLR 4.700.19.56402, CoreFX 4.700.19.56404), X64 RyuJIT
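| Method  | x | y | Mean     | Error     | StdDev    | Ratio |
|---------|---|---|----------|-----------|-----------|-------|
| Default | 1 | 2 | 2.758 ns | 0.0218 ns | 0.0193 ns | 1.00  |
| Custom  | 1 | 2 | 2.486 ns | 0.0281 ns | 0.0263 ns | 0.90  |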

EDIT:

This is what the code looks like when the values being compared are not arguments: here.

    public static bool M()
    {
        var x = Random();
        var y = Random();
        
        return x == y;
    }
    
    
    [MethodImpl( MethodImplOptions.NoInlining )]
    static int? Random() => 42;

.maxstack 3
.locals init (
	[0] valuetype [System.Private.CoreLib]System.Nullable`1<int32>,
	[1] valuetype [System.Private.CoreLib]System.Nullable`1<int32>,
	[2] valuetype [System.Private.CoreLib]System.Nullable`1<int32>
)

IL_0000: call valuetype [System.Private.CoreLib]System.Nullable`1<int32> C::Random()
IL_0005: stloc.0
IL_0006: call valuetype [System.Private.CoreLib]System.Nullable`1<int32> C::Random()
IL_000b: ldloc.0
IL_000c: stloc.1
IL_000d: stloc.2
IL_000e: ldloca.s 1
IL_0010: call instance !0 valuetype [System.Private.CoreLib]System.Nullable`1<int32>::GetValueOrDefault()
IL_0015: ldloca.s 2
IL_0017: call instance !0 valuetype [System.Private.CoreLib]System.Nullable`1<int32>::GetValueOrDefault()
IL_001c: ceq
IL_001e: ldloca.s 1
IL_0020: call instance bool valuetype [System.Private.CoreLib]System.Nullable`1<int32>::get_HasValue()
IL_0025: ldloca.s 2
IL_0027: call instance bool valuetype [System.Private.CoreLib]System.Nullable`1<int32>::get_HasValue()
IL_002c: ceq
IL_002e: and
IL_002f: ret

This time the generated code is almost the same as in the first version, and the performance is the same. One thing I don't understand is why the compiler generates three locals in this case. Both GetValueOrDefault and HasValue are readonly members and will not mutate the values, so why are the copies needed?

@tannergooding

GetValueOrDefault and HasValue were only recently made readonly (6 days ago, in dotnet/runtime#1727), so this might be a point-in-time issue. It would be worth checking against the latest dotnet/runtime nightly build.
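For background on why a readonly annotation can affect copying at all, here is a small sketch of the general C# rule: when a member that is not marked readonly is called on a struct stored in a readonly field, the compiler runs it on a hidden copy. The types `Counter` and `Holder` are hypothetical and exist only for this illustration. This shows the language behaviour in general and is not a claim about how Roslyn lowers the lifted == operator.

```csharp
using System;

struct Counter
{
    public int Value;

    // Not marked readonly: the compiler must assume it can mutate 'this'.
    public void Increment() => Value++;

    // Marked readonly (C# 8 or later): guaranteed not to mutate 'this'.
    public readonly int Peek() => Value;
}

class Holder
{
    public readonly Counter Fixed;   // readonly field, left at default(Counter)
    public Counter Mutable;          // ordinary field

    public void Touch()
    {
        Fixed.Increment();   // runs on a hidden copy, so Fixed is unchanged
        Mutable.Increment(); // runs in place
    }
}

static class DefensiveCopyDemo
{
    static void Main()
    {
        var h = new Holder();
        h.Touch();
        Console.WriteLine(h.Fixed.Peek());   // prints 0
        Console.WriteLine(h.Mutable.Peek()); // prints 1
    }
}
```

Whether the extra locals in the IL above come from this rule or from how the operands of the lifted operator are captured would need to be confirmed against the compiler itself.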
