Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARM64 altjit uses wrong return type for a wrapped Vector128. #38980

Open
sandreenko opened this issue Jul 9, 2020 · 7 comments
Open

ARM64 altjit uses wrong return type for a wrapped Vector128. #38980

sandreenko opened this issue Jul 9, 2020 · 7 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have
Milestone

Comments

@sandreenko
Copy link
Contributor

sandreenko commented Jul 9, 2020

For a test:

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static Vector128Wrapped TestReturnVector128WrappedPromoted()
    {
        var a = new Vector128Wrapped
        {
            f = Vector128<short>.Zero
        };
        return a;
    }

arm64 native will use:

N001 (  3,  2) [000015] ------------        t15 =    LCL_FLD   simd16 V00 loc0         [+0]
                                                  *    simd16 V00.f (offs=0x00) -> V03 tmp1         
                                                  /--*  t15    simd16 
N002 (  4,  3) [000016] ------------              *  RETURN    simd16 $1c1

and generate

IN0003: 000018                    ldr     q16, [fp,#16]	// [V00 loc0]
IN0004: 00001C                    mov     v0.16b, v16.16b

when altjit will use:

N001 (  3,  2) [000015] -c-----N----        t15 =    LCL_VAR   struct<Vector128Wrapped, 16>(P) V00 loc0         
                                                  *    simd16 V00.f (offs=0x00) -> V03 tmp1          $1c0
                                                  /--*  t15    struct 
N002 (  4,  3) [000016] ------------              *  RETURN    struct $1c1

and generate:

IN0003: 000018                    ldr     x0, [fp,#16]	// [V00 loc0]
IN0004: 00001C                    ldr     x1, [fp,#24]	// [V00 loc0+0x08]

That was found in #37745.

The issue is probably in this VM code:

#ifdef TARGET_ARM64
MethodTable* pMT;
#if defined(FEATURE_HFA)
pMT = pByValueClassCache[i];
#else
pMT = pFD->LookupApproxFieldTypeHandle().AsMethodTable();
#endif
int thisElemSize = pMT->GetVectorSize();
if (thisElemSize != 0)
{
fieldHFAType = (thisElemSize == 8) ? CORINFO_HFA_ELEM_VECTOR64 : CORINFO_HFA_ELEM_VECTOR128;
}
else
#endif // TARGET_ARM64

that we run on native arm64, but have disabled for altjit.

The issue prevents us from an optimization that would fail with altjit with the current implementation.

category:correctness
theme:altjit
skill-level:intermediate
cost:medium

@sandreenko sandreenko added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 9, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Jul 9, 2020
@sandreenko
Copy link
Contributor Author

@CarolEidt , @BruceForstall could you please take a look? what do you think is the best solution here?
I could think of a few options:

  • allow the optimization only when altjit is not used (would be hard to keep altjit in sync after);
  • support RETURN struct as x0,x1 (simd16 in a register) specifically for altjit in codegen (again, that code path won't get tested in ci);
  • somehow teach VM to give us a correct answer (not sure if possible);
  • don't use VM and write our own function that will respect altjit flag?

cc @dotnet/jit-contrib

@kunalspathak
Copy link
Member

This is an interesting finding where altjit generated different code than that on target machine. I can imagine that it is difficult to keep altjit in sync, I wonder how many such cases might be there in JIT code base where code generated differs. I think it is of lesser importance because altjit is just used as convivence tool for JIT developers (at least that's what my understanding of altjit is), so it is ok to find such cases on need basis.

@BruceForstall
Copy link
Member

I think a simple solution in this case is to change CheckForHFA to use #ifdef TARGET_64BIT instead of #ifdef TARGET_ARM64. It looks like this is already done for MethodTable::GetVectorSize. This would work because we run the arm64 altjit against the x64 VM and the arm32 altjit against the x86 VM.

@sandreenko
Copy link
Contributor Author

I think a simple solution in this case is to change CheckForHFA to use #ifdef TARGET_64BIT instead of #ifdef TARGET_ARM64. It looks like this is already done for MethodTable::GetVectorSize. This would work because we run the arm64 altjit against the x64 VM and the arm32 altjit against the x86 VM.

Won't it produce incorrect result on x64 windows?

@BruceForstall
Copy link
Member

Those functions should never be used on x86/x64 except by the altjits. It's not under #ifdef FEATURE_HFA solely to allow the altjit to call getHFAType across the JIT-EE interface.

@sandreenko sandreenko removed the untriaged New issue has not been triaged by the area owner label Jul 13, 2020
@sandreenko sandreenko added this to the 6.0.0 milestone Jul 13, 2020
@sandreenko
Copy link
Contributor Author

I have tried Bruce's suggestion, but it was not enough, arm64altjit is returning false from IsIntrinsicType for Vector128Wrapped, when arm64 returns true. I was trying to debug it further but was blocked by a strange error that I am investigating now.
In any case, I think we don't have enough time in 5.0 for the optimization that was blocked by this issue, so the issue could wait as well (or be fixed as a bug in a later previes, without the optimization).

@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@BruceForstall BruceForstall removed the JitUntriaged CLR JIT issues needing additional triage label Nov 7, 2020
@JulieLeeMSFT JulieLeeMSFT added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Mar 23, 2021
@sandreenko sandreenko added Priority:3 Work that is nice to have and removed needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Apr 1, 2021
@sandreenko sandreenko modified the milestones: 6.0.0, Future Apr 1, 2021
@sandreenko
Copy link
Contributor Author

It is altjit only, pushing to future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have
Projects
None yet
Development

No branches or pull requests

5 participants