Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Delegate GDV check is not being hoisted outside the loop #109379

Closed
hez2010 opened this issue Oct 30, 2024 · 4 comments · Fixed by #109407
Closed

JIT: Delegate GDV check is not being hoisted outside the loop #109379

hez2010 opened this issue Oct 30, 2024 · 4 comments · Fixed by #109407
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI in-pr There is an active PR which will close this issue when it is merged tenet-performance Performance related issue
Milestone

Comments

@hez2010
Copy link
Contributor

hez2010 commented Oct 30, 2024

Description

Repro:

AggregateDelegate(Enumerable.Range(0, 100_000).ToArray(), (acc, v) => acc + v, 0);

[MethodImpl(MethodImplOptions.NoInlining)]
public static T AggregateDelegate<T>(T[] ar, Func<T, T, T> func, T seed)
{
    for (int i = 0; i < ar.Length; i++)
    {
        seed = func(seed, ar[i]);
    }
    return seed;
}

Codegen on .NET 9:

; Assembly listing for method MyLinq:AggregateDelegate[int](int[],System.Func`3[int,int,int],int):int (Tier1)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; Tier1 code
; optimized code
; optimized using Dynamic PGO
; rsp based frame
; fully interruptible
; with Dynamic PGO: fgCalledCount is 100
; 0 inlinees with PGO data; 1 single block inlinees; 0 inlinees without PGO data
G_M000_IG01:                ;; offset=0x0000
       push     rdi
       push     rsi
       push     rbx
       sub      rsp, 32
       mov      rbx, rdx

G_M000_IG02:                ;; offset=0x000A
       mov      esi, dword ptr [rcx+0x08]
       test     esi, esi
       jle      SHORT G_M000_IG06

G_M000_IG03:                ;; offset=0x0011
       lea      rdi, bword ptr [rcx+0x10]

G_M000_IG04:                ;; offset=0x0015
       mov      ecx, dword ptr [rdi]
       mov      rax, 0x7FFB9A950918
       cmp      qword ptr [rbx+0x18], rax ; <-- delegate GDV check, should be hoisted outside the loop
       jne      SHORT G_M000_IG08
       add      r8d, ecx

G_M000_IG05:                ;; offset=0x002A
       add      rdi, 4
       dec      esi
       jne      SHORT G_M000_IG04

G_M000_IG06:                ;; offset=0x0032
       mov      eax, r8d

G_M000_IG07:                ;; offset=0x0035
       add      rsp, 32
       pop      rbx
       pop      rsi
       pop      rdi
       ret

G_M000_IG08:                ;; offset=0x003D
       mov      edx, r8d
       mov      r8d, ecx
       mov      rcx, gword ptr [rbx+0x08]
       call     [rbx+0x18]System.Func`3[int,int,int]:Invoke(int,int):int:this
       mov      r8d, eax
       jmp      SHORT G_M000_IG05

Configuration

.NET 9 RC2

Regression?

No

@hez2010 hez2010 added the tenet-performance Performance related issue label Oct 30, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 30, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Oct 30, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@AndyAyersMS
Copy link
Member

@jakobbotsch FYI

@jakobbotsch jakobbotsch self-assigned this Oct 31, 2024
@jakobbotsch jakobbotsch added this to the 10.0.0 milestone Oct 31, 2024
@jakobbotsch jakobbotsch removed the untriaged New issue has not been triaged by the area owner label Oct 31, 2024
@jakobbotsch
Copy link
Member

Seems like we get pretty far..

*************** In optCloneLoops()
Considering loop L00 to clone for optimizations.
Analyzing iteration for L00 with header BB03
  Preheader = BB02
  Checking exiting block BB06
  Init = [000001], test = [000006], incr = [000019]
  Init block BB01 enters the loop when condition [000062] evaluates to false
    op2 is the iteration variable
  Condition is established before entry at [000062]
  IterVar = V03
  Const init with value 0 (at [000001])
  Test is [000005] (array length limit )
Checking loop L00 for optimization candidates (array bounds) (GDV tests)
Found ArrIndex at BB03 STMT00006 tree [000059] which is equivalent to: V00[V03], bounds check nodes: [000059]
Loop L00 can be cloned for ArrIndex V00[V03] on dim 0
...GDV considering [000031]
... right form for method address test with local V01
Loop L00 has invariant method address test [000031] on V01
Checking whether cloning is profitable ...
  Yes
...GDV considering [000006]
------------------------------------------------------------

------------------------------------------------------------
Deriving cloning conditions for L00
Unknown opt
> Conditions could not be obtained
Cancelling loop cloning for loop L00

Might just be an actual logic bug.

@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Oct 31, 2024
@jakobbotsch
Copy link
Member

Indeed, the fix is really simple. Thanks a lot for reporting this issue @hez2010!

@github-actions github-actions bot locked and limited conversation to collaborators Dec 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI in-pr There is an active PR which will close this issue when it is merged tenet-performance Performance related issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants