Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loops aren't cloned for Spans #82946

Open
Tracked by #109677
EgorBo opened this issue Mar 3, 2023 · 7 comments
Open
Tracked by #109677

Loops aren't cloned for Spans #82946

EgorBo opened this issue Mar 3, 2023 · 7 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@EgorBo
Copy link
Member

EgorBo commented Mar 3, 2023

void Test(Span<byte> source)
{
    for (int i = 0; i < 100; i++)
        source[i] = 0;
}

Codegen:

; Method Progr:Test(System.Span`1[ubyte]):this
G_M11420_IG01:              
       4883EC28             sub      rsp, 40
G_M11420_IG02:              
       33C0                 xor      eax, eax
                            align    [0 bytes for IG03]
G_M11420_IG03:              
       3B4208               cmp      eax, dword ptr [rdx+08H]
       7317                 jae      SHORT G_M11420_IG05
       488B0A               mov      rcx, bword ptr [rdx]
       448BC0               mov      r8d, eax
       42C6040100           mov      byte  ptr [rcx+r8], 0
       FFC0                 inc      eax
       83F864               cmp      eax, 100
       7CE9                 jl       SHORT G_M11420_IG03
G_M11420_IG04:              
       4883C428             add      rsp, 40
       C3                   ret      
G_M11420_IG05:              
       E829C64C5F           call     CORINFO_HELP_RNGCHKFAIL
       CC                   int3     
; Total bytes of code: 40
*************** Starting PHASE Clone loops

*************** In optCloneLoops()
Considering loop L00 to clone for optimizations.
Not checking loop L00 -- no array bounds or type tests in this method
------------------------------------------------------------

  No clonable loops

*************** Finishing PHASE Clone loops

Here it is expected to have two loops and one of them without bound checks.

cc @BruceForstall @dotnet/jit-contrib

category:cq
theme:loop-opt
skill-level:expert
cost:medium
impact:large

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 3, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Mar 3, 2023
@ghost
Copy link

ghost commented Mar 3, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details
void Test(Span<byte> source)
{
    for (int i = 0; i < 100; i++)
        source[i] = 0;
}

Codegen:

; Method Progr:Test(System.Span`1[ubyte]):this
G_M11420_IG01:              
       4883EC28             sub      rsp, 40
G_M11420_IG02:              
       33C0                 xor      eax, eax
                            align    [0 bytes for IG03]
G_M11420_IG03:              
       3B4208               cmp      eax, dword ptr [rdx+08H]
       7317                 jae      SHORT G_M11420_IG05
       488B0A               mov      rcx, bword ptr [rdx]
       448BC0               mov      r8d, eax
       42C6040100           mov      byte  ptr [rcx+r8], 0
       FFC0                 inc      eax
       83F864               cmp      eax, 100
       7CE9                 jl       SHORT G_M11420_IG03
G_M11420_IG04:              
       4883C428             add      rsp, 40
       C3                   ret      
G_M11420_IG05:              
       E829C64C5F           call     CORINFO_HELP_RNGCHKFAIL
       CC                   int3     
; Total bytes of code: 40

Here it is expected to have two loops and one of them without bound checks.

cc @BruceForstall @dotnet/jit-contrib

Author: EgorBo
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member

Is this a regression? Back when we first implemented span we deliberately patterned the IR after array so that we'd get all the same optimizations.

@SingleAccretion
Copy link
Contributor

No, it has always been like that.

@BruceForstall BruceForstall changed the title Loops aren't clonned for Spans Loops aren't cloned for Spans Mar 5, 2023
@BruceForstall BruceForstall self-assigned this Mar 5, 2023
@BruceForstall BruceForstall added this to the 8.0.0 milestone Mar 5, 2023
@BruceForstall BruceForstall removed the untriaged New issue has not been triaged by the area owner label Mar 5, 2023
@BruceForstall BruceForstall modified the milestones: 8.0.0, 9.0.0 Jul 31, 2023
@JulieLeeMSFT JulieLeeMSFT modified the milestones: 9.0.0, Future May 7, 2024
@JulieLeeMSFT JulieLeeMSFT added the Priority:2 Work that is important, but not critical for the release label May 7, 2024
@JulieLeeMSFT JulieLeeMSFT modified the milestones: Future, 9.0.0 May 7, 2024
@JulieLeeMSFT JulieLeeMSFT removed the Priority:2 Work that is important, but not critical for the release label Jun 13, 2024
@JulieLeeMSFT JulieLeeMSFT modified the milestones: 9.0.0, 10.0.0 Jul 25, 2024
@AndyAyersMS
Copy link
Member

No, it has always been like that.

And, guess it was me who made it so: dotnet/coreclr#19493

@PavelCibulka
Copy link

What about having the JIT recognize Math.Min and eliminate the bounds check. This way, a programmer can guarantee bounds by using Math.Min.

void Test(Span<byte> source)
{
    int length = Math.Min(100, source.Length);
    for (int i = 0; i < length; i++)
        source[i] = 0;
}

void Test2(Span<byte> target ,Span<byte> source)
{
    int length = Math.Min(target.Length, source.Length);
    for (int i = 0; i < length; i++)
        target[i] = source[i];
}

Also it would be nice if JIT also recognize Downwards loop variants.

@EgorBo
Copy link
Member Author

EgorBo commented Dec 22, 2024

What about having the JIT recognize Math.Min and eliminate the bounds check. This way, a programmer can guarantee bounds by using Math.Min.

It is expected to be handled by Loop Clonning as is today, just not for spans (yet). E.g. for Arrays:

void Test(byte[] source)
{
    int length = Math.Min(100, source.Length);
    for (int i = 0; i < length; i++)
        source[i] = 0;
}

Emits:

; Method Program:Test(ubyte[]):this (FullOpts)
G_M50342_IG01:  ;; offset=0x0000
       sub      rsp, 40

G_M50342_IG02:  ;; offset=0x0004
       mov      eax, dword ptr [rdx+0x08]
       mov      ecx, 100
       cmp      eax, 100
       cmovl    ecx, eax
       xor      eax, eax
       test     ecx, ecx
       jle      SHORT G_M50342_IG05

G_M50342_IG03:  ;; offset=0x0018
       cmp      dword ptr [rdx+0x08], ecx
       jl       SHORT G_M50342_IG06
       align    [3 bytes for IG04]

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
G_M50342_IG04:  ;; offset=0x0020
       mov      r8d, eax
       mov      byte  ptr [rdx+r8+0x10], 0
       inc      eax
       cmp      eax, ecx
       jl       SHORT G_M50342_IG04
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

G_M50342_IG05:  ;; offset=0x002F
       add      rsp, 40
       ret      

G_M50342_IG06:  ;; offset=0x0034
       cmp      eax, dword ptr [rdx+0x08]
       jae      SHORT G_M50342_IG07
       mov      r8d, eax
       mov      byte  ptr [rdx+r8+0x10], 0
       inc      eax
       cmp      eax, ecx
       jl       SHORT G_M50342_IG06
       jmp      SHORT G_M50342_IG05

G_M50342_IG07:  ;; offset=0x004A
       call     CORINFO_HELP_RNGCHKFAIL
       int3     
; Total bytes of code: 80

It is still emitting the slow loop here which is redundant, but it shouldn't affect performance

@EgorBo
Copy link
Member Author

EgorBo commented Dec 22, 2024

cc @AndyAyersMS it looks like we can actually fold the "clonned" condition here via assertions (over PHI node).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

No branches or pull requests

6 participants