JIT: Improve x86 unsigned to floating cast codegen #111595

saucecontrol · 2025-01-19T19:46:13Z

This mostly improves codegen for unsigned integer to floating-point casts, and also catches a few other redundant conversions.

Adds support for using AVX-512 vcvtusi2s[sd] for uint -> float/double (only ulong was handled previously) on both x64 and x86.

-       mov      eax, edx
        vxorps   xmm0, xmm0, xmm0
-       vcvtsi2sd xmm0, xmm0, rax
-       vcvtsd2ss xmm0, xmm0, xmm0
+       vcvtusi2ss xmm0, xmm0, edx

-       mov      eax, dword ptr [rbp-0x04]
-       mov      eax, eax
        vxorps   xmm0, xmm0, xmm0
-       vcvtsi2sd xmm0, xmm0, rax
+       vcvtusi2sd xmm0, xmm0, dword ptr [rbp-0x04]

-       push     0
-       push     eax
-       call     CORINFO_HELP_LNG2DBL
-       fstp     qword ptr [ebp-0x10]
-       vmovsd   xmm0, qword ptr [ebp-0x10]
-       vcvtsd2ss xmm0, xmm0, xmm0
+       vxorps   xmm0, xmm0, xmm0
+       vcvtusi2ss xmm0, xmm0, eax

Improves codegen for uint -> float conversions on x64 without AVX-512, removing the intermediate conversion to double.

        mov      eax, edi
        xorps    xmm0, xmm0
-       cvtsi2sd xmm0, rax
-       cvtsd2ss xmm0, xmm0
+       cvtsi2ss xmm0, rax
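
For reference, a minimal C sketch of why dropping the intermediate double is safe. The function names are hypothetical and this is not the JIT's code: it assumes the compiler lowers the int64 cast to a single signed conversion, as in the diff above. Every uint32 value fits in a non-negative int64, so one signed 64-bit conversion rounds exactly once, which is also what the old uint -> double -> float path did (the uint -> double step is always exact).

```c
#include <stdint.h>

/* Hypothetical helper: zero-extend to 64 bits, then do one signed
   64-bit conversion straight to float (mirrors mov eax, edi + cvtsi2ss). */
static float uint_to_float_direct(uint32_t u)
{
    int64_t widened = (int64_t)u;   /* zero-extension, always non-negative */
    return (float)widened;          /* single rounding step */
}

/* Hypothetical helper: the older shape, going through double first.
   uint32 -> double is exact, so only the double -> float step rounds. */
static float uint_to_float_via_double(uint32_t u)
{
    double d = (double)(int64_t)u;  /* exact: double has 53 bits of precision */
    return (float)d;                /* the only rounding step */
}
```

Both helpers should agree for every input; the direct form just needs one fewer conversion instruction.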

Adds support for direct ulong -> float cast to the x64 SSE2 fallback, resolving a difference in behavior between hardware with AVX-512 vs without, and saving an extra cvtsd2ss instruction.

        xorps    xmm0, xmm0
        mov      rax, rdi
        shr      rax, 1
        mov      rsi, rdi
        and      rsi, 1
        or       rsi, rax
        test     rdi, rdi
        cmovns   rsi, rdi
-       cvtsi2sd xmm0, rsi
+       cvtsi2ss xmm0, rsi
        jns      SHORT G_M37561_IG56
-       addsd    xmm0, xmm0
+       addss    xmm0, xmm0
 G_M37561_IG56:
-       cvtsd2ss xmm0, xmm0
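
The sequence above can be sketched in C roughly as follows. This is an illustrative sketch with a hypothetical function name, not the JIT's implementation. It assumes the usual technique for values with the top bit set: halve the input while OR-ing the dropped low bit back in as a sticky bit, convert as signed, then double the result. The sticky bit keeps the single rounding step correct when the discarded bit would have broken a tie.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the SSE2 fallback: ulong -> float using
   only a signed 64-bit conversion. */
static float ulong_to_float_fallback(uint64_t u)
{
    if ((int64_t)u >= 0) {
        /* Top bit clear: the value fits in a signed 64-bit integer,
           so it can be converted directly (the cmovns path). */
        return (float)(int64_t)u;
    }

    /* Top bit set: shift right by one, keeping the lost low bit as a
       sticky bit so rounding still sees that the value was inexact. */
    uint64_t halved = (u >> 1) | (u & 1);
    float f = (float)(int64_t)halved;

    /* Doubling is exact in binary floating point (the addss step). */
    return f + f;
}
```

Converting directly to float here, rather than to double and then narrowing, avoids a second rounding step, which is the behavioral difference from AVX-512 hardware mentioned above.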

Removes some redundant float -> double -> float casts.

        vmulss   xmm1, xmm1, dword ptr [@RWD00]
-       vcvtss2sd xmm1, xmm1, xmm1
-       vcvtsd2ss xmm1, xmm1, xmm1
        vbroadcastss xmm1, xmm1
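
As a quick illustration of why that cast pair is redundant, here is a small C sketch (helper names are hypothetical). Every finite float is exactly representable as a double, so widening and narrowing again returns the same bits. NaN inputs are left out of the sketch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: view the raw bits of a float. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: float -> double -> float, as in the removed
   vcvtss2sd / vcvtsd2ss pair. volatile keeps the compiler from folding
   the round trip away. */
static float widen_then_narrow(float f)
{
    volatile double d = (double)f;  /* exact: double is a superset of float */
    return (float)d;                /* exact: the value is already a float */
}
```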

SPMI Diffs

The only code size regressions come from inserting xorps to clear the upper elements of the target reg for the AVX-512 unsigned conversion instructions. These were previously omitted but should have been there, since the unsigned conversions behave the same as the signed ones (i.e. they preserve/copy the upper elements) and are subject to the same false dependency penalties.

GCC emits the xorps for all conversions; Clang skips it for all conversions in simple examples but may emit it in more complex scenarios.
https://godbolt.org/z/6aY7fdE3d

saucecontrol · 2025-01-19T20:06:57Z

@MihuBot

saucecontrol · 2025-01-19T22:37:45Z

cc @dotnet/jit-contrib this is ready for review

src/coreclr/jit/codegenxarch.cpp

tannergooding

CC. @dotnet/jit-contrib for secondary review

* main:
  System.Net.Http.WinHttpHandler.StartRequestAsync assertion failed (dotnet#109799)
  Keep test PDB in helix payload for native AOT (dotnet#111949)
  Build the RID-specific System.IO.Ports packages in the VMR (dotnet#112054)
  Always inline number conversions (dotnet#112061)
  Use Contains{Any} in Regex source generator (dotnet#112065)
  Update dependencies from https://github.com/dotnet/arcade build 20250130.5 (dotnet#112013)
  JIT: Transform single-reg args to FIELD_LIST in physical promotion (dotnet#111590)
  Ensure that math calls into the CRT are tracked as needing vzeroupper (dotnet#112011)
  Use double.ConvertToIntegerNative where safe to do in System.Random (dotnet#112046)
  JIT: Compute `fgCalledCount` after synthesis (dotnet#112041)
  Simplify boolean logic in `TimeZoneInfo` (dotnet#112062)
  JIT: Update type when return temp is freshly created (dotnet#111948)
  Remove unused build controls and simplify DotNetBuild.props (dotnet#111986)
  Fix case-insensitive JSON deserialization of enum member names (dotnet#112028)
  WasmAppBuilder: Remove double computation of a value (dotnet#112047)
  Disable LTCG for brotli and zlibng. (dotnet#111805)
  JIT: Improve x86 unsigned to floating cast codegen (dotnet#111595)
  simplify x86 special intrinsic imports (dotnet#111836)
  JIT: Try to retain entry weight during profile synthesis (dotnet#111971)
  Fix explicit offset of ByRefLike fields. (dotnet#111584)

amanasifkhalid · 2025-02-10T22:34:43Z

@saucecontrol we're seeing some test failures (#112324, #112325, #112329) in our stress pipelines on x64 related to floating-point casts. Do those look related to this PR?

saucecontrol · 2025-02-10T22:55:02Z

Yeah, if it's only showing up under JitStressRegs, it's likely bad codegen from an existing bug this PR exposed by changing float->double->float casts to float->float. In which case, #112217 should resolve it.

saucecontrol · 2025-02-10T22:58:14Z

Actually, not so sure about that last one. This PR changed integral->floating casts, but those failures are on floating->integral, which I don't think has changed recently

amanasifkhalid · 2025-02-10T23:01:50Z

> Actually, not so sure about that last one. This PR changed integral->floating casts, but those failures are on floating->integral, which I don't think has changed recently

Thanks for confirming; I'll wait to see if #112217 resolves them...

improve x86 integral to floating cast codegen

496e50f

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 19, 2025

dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jan 19, 2025

MihuBot mentioned this pull request Jan 19, 2025

[JitDiff X64] [saucecontrol] JIT: Improve x86 integral to floating cast codegen MihuBot/runtime-utils#910

Open

saucecontrol marked this pull request as ready for review January 19, 2025 22:17

saucecontrol changed the title ~~JIT: Improve x86 integral to floating cast codegen~~ JIT: Improve x86 unsigned to floating cast codegen Jan 19, 2025

This was referenced Jan 20, 2025

slow macOS - "##[error]The job running on agent Azure Pipelines 9 ran longer than the maximum time of 60 minutes." dotnet/dnceng#1883

Open

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

saucecontrol added 2 commits January 22, 2025 16:38

Merge remote-tracking branch 'upstream/main' into unsigned-float-cast

85b37fa

more cleanup

77b15f0

build-analysis bot mentioned this pull request Jan 23, 2025

Intermittent build failure in AfterSourceBuild: "Could not write state file" #76488

Open

Merge branch 'main' into unsigned-float-cast

8ec4d46

tannergooding reviewed Jan 24, 2025

tannergooding approved these changes Jan 24, 2025

more cleanup

4724cf6

build-analysis bot mentioned this pull request Jan 28, 2025

"We stopped hearing from agent Azure Pipelines 32. Verify the agent machine is running and has a healthy network connection" dotnet/dnceng#1886

Open

BruceForstall approved these changes Jan 31, 2025

BruceForstall merged commit fa0f65c into dotnet:main Jan 31, 2025
118 checks passed

saucecontrol deleted the unsigned-float-cast branch January 31, 2025 19:15

saucecontrol mentioned this pull request Feb 6, 2025

JIT: Illegal instruction at JitTest_chain_boxunbox_il.Test.Main() under DOTNET_JitStressRegs=0x2000 #112163

Closed
