Allow SIMD-returning calls as arguments #74184
As of this change we handle all relevant ABI scenarios.

1) Windows x64:
   - SIMD8: returned and passed as "TYP_LONG", fine.
   - SIMD12 / SIMD16 / SIMD32: returned and passed via a return buffer, fine.
2) Unix x64:
   - SIMD8: returned and passed in one FP register, fine.
   - SIMD12 / SIMD16, Vector4: returned and passed in two FP registers, fine.
   - SIMD16, Vector128 / SIMD32: returned and passed via a return buffer, fine.
3) x86:
   - SIMD8: can be returned via two registers or a return buffer (and is always passed on stack), both are fine.
   - SIMD12 / SIMD16 / SIMD32: returned via a return buffer, passed on stack, fine.
4) ARM64:
   - SIMD8, Vector2: returned in two FP registers (and passed as such or "TYP_LONG" under Windows varargs), fine.
   - SIMD8, Vector64: returned in one FP register, can be passed as such or as "TYP_LONG" under Windows varargs. The latter case is now handled correctly in "Lowering::LowerArg".
   - SIMD12: returned in three FP registers, passed as such or in two integer registers under Windows varargs, fine.
   - SIMD16, Vector4: returned in four FP registers, passed as such, or in two integer registers under Windows varargs, fine.
   - SIMD16, Vector128: returned in one FP register, passed as such, or in two integer registers under Windows varargs, fine (morph will decompose the varargs case into a `FIELD_LIST` via a temp).

Fixes #74126.

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
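For readers who want the return-side half of the list above in one place, here is a minimal C++14 sketch that encodes it as a lookup. Every name in it (`Target`, `SimdKind`, `ReturnKind`, `returnKindFor`) is hypothetical and does not exist in the JIT; it only mirrors the scenarios as stated in the description and ignores the passing side, varargs, and any calling convention not listed there.

```cpp
#include <cassert>

// Hypothetical model of the description's list; not the JIT's data structures.
enum class Target { WindowsX64, UnixX64, X86, Arm64 };

enum class SimdKind { Simd8Vector2, Simd8Vector64, Simd12, Simd16Vector4, Simd16Vector128, Simd32 };

enum class ReturnKind
{
    AsLong,                // returned as "TYP_LONG"
    ReturnBuffer,          // returned via a return buffer
    OneFpReg,
    TwoFpRegs,
    ThreeFpRegs,
    FourFpRegs,
    TwoRegsOrReturnBuffer, // x86 SIMD8: either is possible
    NotListed              // combination not mentioned in the description
};

constexpr bool isSimd8(SimdKind kind)
{
    return kind == SimdKind::Simd8Vector2 || kind == SimdKind::Simd8Vector64;
}

constexpr ReturnKind returnKindFor(Target target, SimdKind kind)
{
    switch (target)
    {
        case Target::WindowsX64:
            return isSimd8(kind) ? ReturnKind::AsLong : ReturnKind::ReturnBuffer;

        case Target::UnixX64:
            if (isSimd8(kind))
                return ReturnKind::OneFpReg;
            if (kind == SimdKind::Simd12 || kind == SimdKind::Simd16Vector4)
                return ReturnKind::TwoFpRegs;
            return ReturnKind::ReturnBuffer; // SIMD16 Vector128 and SIMD32

        case Target::X86:
            return isSimd8(kind) ? ReturnKind::TwoRegsOrReturnBuffer : ReturnKind::ReturnBuffer;

        case Target::Arm64:
            switch (kind)
            {
                case SimdKind::Simd8Vector2:    return ReturnKind::TwoFpRegs;
                case SimdKind::Simd8Vector64:   return ReturnKind::OneFpReg;
                case SimdKind::Simd12:          return ReturnKind::ThreeFpRegs;
                case SimdKind::Simd16Vector4:   return ReturnKind::FourFpRegs;
                case SimdKind::Simd16Vector128: return ReturnKind::OneFpReg;
                case SimdKind::Simd32:          return ReturnKind::NotListed;
            }
            break;
    }
    return ReturnKind::NotListed;
}

// Compile-time spot checks against the list in the description.
static_assert(returnKindFor(Target::Arm64, SimdKind::Simd12) == ReturnKind::ThreeFpRegs, "ARM64 SIMD12");
static_assert(returnKindFor(Target::UnixX64, SimdKind::Simd16Vector4) == ReturnKind::TwoFpRegs, "Unix x64 Vector4");
```

The distinction between `Vector2` and `Vector64` (and between `Vector4` and `Vector128`) only changes the answer on ARM64 in this model, which matches where the description calls out the one case that needed a fix.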
Force-pushed from 124cb3e to f91c6ab.
SPMI failure is pre-existing (empty OSX ARM64 contexts?).
I am assuming we'll want to backport this to 7.0.
@BruceForstall, please triage a milestone.
cc @tannergooding @dotnet/jit-contrib
Just noting that SIMD16/SIMD32 being returned via an output buffer is an "it depends" scenario. While
Just a couple of questions.
@@ -1419,7 +1419,7 @@ GenTree* Lowering::LowerFloatArg(GenTree** pArg, CallArg* callArg)
                     break;
                 }
                 GenTree* node = use.GetNode();
-                if (varTypeIsFloating(node))
+                if (varTypeUsesFloatReg(node))
Should this be varTypeUsesFloatArgReg()? Effectively there's no difference, except for LoongArch64.
Unrelated, but the code below seems odd:
    if (node->TypeGet() == TYP_DOUBLE)
    {
        currRegNumber = REG_NEXT(REG_NEXT(currRegNumber));
        regIndex += 2;
    }
    else
    {
        currRegNumber = REG_NEXT(currRegNumber);
        regIndex += 1;
    }
I would expect the TYP_DOUBLE == 2 registers code to only apply to arm32, but it's not ifdef'ed that way.
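To make the expectation concrete: on arm32 a double-precision register overlays two consecutive single-precision registers, which is presumably why the snippet steps twice for `TYP_DOUBLE`. The following C++14 sketch models only that stepping. `FieldType`, `slotsFor`, and `slotsConsumed` are hypothetical names, and the sketch deliberately does not model register alignment or any other target.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the field types seen by the loop above.
enum class FieldType { Float, Double };

// Single-precision register slots a field occupies under the arm32-style
// assumption that one double overlays two float registers.
constexpr int slotsFor(FieldType type)
{
    return type == FieldType::Double ? 2 : 1;
}

// Mirrors the regIndex bookkeeping: advance by 2 for a double, by 1 otherwise.
template <std::size_t N>
constexpr int slotsConsumed(const FieldType (&fields)[N])
{
    int regIndex = 0;
    for (std::size_t i = 0; i < N; i++)
    {
        regIndex += slotsFor(fields[i]);
    }
    return regIndex;
}

constexpr FieldType kExample[] = {FieldType::Float, FieldType::Double, FieldType::Float};
static_assert(slotsConsumed(kExample) == 4, "1 + 2 + 1 slots");
```

On a target where each floating-point value takes exactly one register, `slotsFor` would return 1 unconditionally, which is the behavior the comment expects outside arm32.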
> I would expect the TYP_DOUBLE == 2 registers code to only apply to arm32, but it's not ifdef'ed that way.

Indeed, I thought the same but decided against #ifdefing it in this change to keep the scope down.
It so happens that we will never have DOUBLE here on ARM64 (which is the case of interest) because morph will construct the FIELD_LIST with LONGs. Looking at what happens under Linux ARM soft FP, I see the same, so it seems likely that this code is actually dead.

> Should this be varTypeUsesFloatArgReg()?

I'd think it's better with varTypeUsesFloatReg. varTypeUsesFloatArgReg has the meaning of "can this type be used as an argument from an FP register file", while here we're asking the question of "does this node define an FP register".
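The practical effect of swapping the predicate in the two hunks can be sketched as follows. This is a simplified model, not the JIT's definitions: `VarType`, `isFloating`, `isSimd`, and `usesFloatReg` are hypothetical stand-ins, and the sketch assumes that SIMD values live in the FP register file, as the change implies.

```cpp
#include <cassert>

// Hypothetical, reduced set of types; the JIT's own type list is much larger.
enum class VarType { Int, Long, Float, Double, Simd8, Simd12, Simd16, Simd32 };

// Stand-in for the old check: true only for scalar floating-point types.
constexpr bool isFloating(VarType t)
{
    return t == VarType::Float || t == VarType::Double;
}

constexpr bool isSimd(VarType t)
{
    return t == VarType::Simd8 || t == VarType::Simd12 || t == VarType::Simd16 || t == VarType::Simd32;
}

// Stand-in for the new check: "does a node of this type define an FP register?"
// Assumption: SIMD types are held in FP registers.
constexpr bool usesFloatReg(VarType t)
{
    return isFloating(t) || isSimd(t);
}

// A SIMD-typed node fails the old check but passes the new one.
static_assert(!isFloating(VarType::Simd8), "old predicate skips SIMD nodes");
static_assert(usesFloatReg(VarType::Simd8), "new predicate covers SIMD nodes");
static_assert(usesFloatReg(VarType::Double) && !usesFloatReg(VarType::Long), "scalars unchanged");
```

Under this model the two predicates agree on every scalar type, so the replacement only changes behavior for SIMD-typed nodes.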
@@ -1441,7 +1441,7 @@ GenTree* Lowering::LowerFloatArg(GenTree** pArg, CallArg* callArg)
         // List fields were replaced in place.
         return arg;
     }
-    else if (varTypeIsFloating(arg))
+    else if (varTypeUsesFloatReg(arg))
Same here
    case GT_CALL:
        // Argument lowering will deal with register file mismatches if needed.
        assert(varTypeIsSIMD(origType));
        break;
@BruceForstall it's this change that fixes the original failure.
/backport to release/7.0
Started backporting to release/7.0: https://github.com/dotnet/runtime/actions/runs/2922063287
@SingleAccretion Feel free to amend #74520 with additional justification for porting the fix back to 7.0.