-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Conversation
22909c0
to
ac77102
Compare
ac77102
to
0203bc4
Compare
@dotnet/jit-contrib @tannergooding @sdmaclea @TamarChristinaArm @jkotas @jkoritzinsky |
src/jit/compiler.h
Outdated
bool _isHfaArg : 1; // True when the argument is an HFA type. | ||
bool _isDoubleHfa : 1; // True when the argument is an HFA, with an element type of DOUBLE. | ||
bool _isHfaArg : 1; // True when the argument is an HFA type. | ||
unsigned char _hfaLogSize : 2; // The base 2 log of the HFA element size in 4-byte units (e.g. 0 for TYP_FLOAT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there other terminology that could be used here? "size" most of the type refers to byte size. In call morphing code "size" also refers to the number of slots. And now here it refers to the number of 4-byte units.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about _hfaLogWidth
? Or _hfaLogRelSize
for relative size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, width is also used in some cases to denote byte size (some block ops code does that). But at least it's not used within LclVarDsc
(which has size and exactSize) so it's a better alternative.
Though I'm not sure why this is needed to begin with. Is it just to "compress" the information down to 2 bits in the bitfield?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's just to keep the information small as opposed to making it a var_types
field, or creating a new enum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, then leave it as it is. I have a feeling that this HFA info will end up in the ClassLayout
thing I have in #21705 sooner or later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In looking at my response, I'm thinking a new enum would be better - it could be called something like HfaElemKind
and have values HFA_ELEM_FLOAT
, HFA_ELEM_DOUBLE
and HFA_ELEM_VECTOR128
. And then we could rely on the C++ compiler to generate good code rather than trying to pre-optimize by using shifts. (I try to make it a principle to rely on the code generator and not pre-optimize, but I don't always manage to do so!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, an enum should be fine as long as it has underlying type unsigned char
to match the other bitfield members. Otherwise AFAIR VC++ does not pack data member of different types into the same byte of the bitfield.
Ah great, I'm out of office until Monday but will take a look first thing Monday morning! |
7597c5b
to
f95753a
Compare
82c3f66
to
f1b41da
Compare
@dotnet-bot test Windows_NT arm Cross Checked Innerloop Build and Test |
6aaedf9
to
e31c470
Compare
I believe I've incorporated all the feedback, and this is now green. I'm going to do some outerloop testing, but would appreciate any additional feedback on the changes. |
/azp run coreclr-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
Are there any tests which are testing this is working correctly? Last I knew the tests were missing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM and looks to match the spec.
I think it is a bit odd that we aren't fully classifying everything independently as HVAs vs HFAs (it looked like we were sharing the logic a lot, where possible, since they are very similar), but I don't think it is something we need to block this PR on. Ultimately, I think it would be good to have all the logic more clearly defined/classified as it might make the code easier to follow and make it easier to bring new platforms online...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments/questions
src/vm/callingconvention.h
Outdated
@@ -50,8 +50,7 @@ struct ArgLocDesc | |||
#endif // UNIX_AMD64_ABI | |||
|
|||
#if defined(_TARGET_ARM64_) | |||
bool m_isSinglePrecision; // For determining if HFA is single or double | |||
// precision | |||
int m_hfaFieldSize; // Size of HFA field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be useful if the comment stated "... in bytes" (assuming that's what it is)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this is never negative, right, so should this be declared unsigned
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes sense.
src/vm/callingconvention.h
Outdated
case ELEMENT_TYPE_R4: hfaFieldSize = 4; break; | ||
case ELEMENT_TYPE_R8: hfaFieldSize = 8; break; | ||
#ifdef _TARGET_ARM64_ | ||
case ELEMENT_TYPE_VALUETYPE: hfaFieldSize = 16; break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems odd that ELEMENT_TYPE_VALUETYPE
implies hfaFieldSize = 16
. Don't you need to somehow look up the value type size? I guess you're just overloading this existing type for this particular purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code (figuring out hfaFieldSize) is also duplicated 3 times; can it be extracted to a func?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you're just overloading this existing type for this particular purpose?
Yes, it's a bit of a hack, but avoids having to create a new VM type.
@@ -580,7 +587,13 @@ class EEClassLayoutInfo | |||
void SetNativeHFAType(CorElementType hfaType) | |||
{ | |||
LIMITED_METHOD_CONTRACT; | |||
m_bFlags |= (hfaType == ELEMENT_TYPE_R4) ? e_R4_HFA : e_R8_HFA; | |||
switch (hfaType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this function ever called more than once?
Seems you could add assert(m_bFlags & e_HFATypeFlags == 0)
if not.
or, change the assigns to something like m_bFlags = (m_bFlags & ~e_HFATypeFlags) | e_..._HFA
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it should never be called more than once. I'll add the assert.
HFA_ELEM_NONE, | ||
HFA_ELEM_FLOAT, | ||
HFA_ELEM_DOUBLE, | ||
HFA_ELEM_SIMD16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is HFA_ELEM_SIMD16
only arm64? Should it be #ifdef'ed
, or is that too much? Maybe asserts someplace that other platforms don't see it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It didn't seem worth ifdef'ing this out
// | ||
// Return Value: | ||
// On Arm64 - Returns 1-4 indicating the number of register slots used by the HFA | ||
// On Arm32 - Returns the total number of single FP register slots used by the HFA, max is 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This asymmetry is kind of annoying...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree, but it's pre-existing and I'm pretty sure that changing it would be non-trivial.
slots = lvExactSize >> 3; | ||
break; | ||
case HFA_ELEM_SIMD16: | ||
slots = lvExactSize >> 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should you add asserts that lvExactSize is a multiple of the appropriate size, e.g., assert(lvExactSize % 16 == 0)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense.
Extend HFA support to support vectors as well as floating point types. This requires that the JIT recognize vector types even during crossgen, so that the ABI is supported consistently. Fix #16022
23b7d7f
to
dd186f2
Compare
The "Test pri0 Linux arm64 checked" leg failed with a timeout in /JIT/Methodical/tailcall/Desktop/_il_relthread-race/_il_relthread-race.sh |
test Windows_NT arm64 Cross Checked Innerloop Build and Test |
/azp run coreclr-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
test Windows_NT arm64 Cross Checked Innerloop Build and Test |
@BruceForstall @dotnet/jit-contrib - I've addressed all of the PR feedback, and the only remaining outerloop failures are the recently added test: I have the changes from #23881, but this test is a pri1 test and doesn't seem to have been run on that PR (or since). The failure is:
Can I get another review? |
@dotnet/jit-contrib ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks Good
Thanks @briansull ! |
@CarolEidt I am seeing 8 R2R Linux arm64 and Linux musl_arm64 tests failing during crossgen run on this commit validation https://dev.azure.com/dnceng/public/_build/results?buildId=157586&view=ms.vss-test-web.build-test-results-tab |
@echesakovMSFT - I'll take a look; are these not run in outerloop testing? |
@CarolEidt We don't run R2R tests in coreclr-outerloop, only during commit validations and official builds. @RussKeldorph We probably should add coreclr-outerloop-r2r and coreclr-outerloop-r2r-jistress/gcstress build definitions for folks who want to trigger those and also have them running on schedule. |
@echesakovMSFT - can you remind me how to invoke a single test with R2R? Is it just the crossgen command with the appropriate "/Platform_Assemblies_Paths" option? Or is there a better way? |
@CarolEidt To crossgen the test, set |
* Support for Arm64 Vector ABI Extend HFA support to support vectors as well as floating point types. This requires that the JIT recognize vector types even during crossgen, so that the ABI is supported consistently. Also, fix and re-enable the disabled Arm64 Simd tests. Fix dotnet/coreclr#16022 Commit migrated from dotnet/coreclr@5be6b66
Extend HFA support to support vectors as well as floating point types.
This requires that the JIT recognize vector types even during crossgen,
so that the ABI is supported consistently.
Fix #16022