-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Block the hoisting of TYP_STRUCT rvalues in loop hoisting #23739
Conversation
src/jit/optcse.cpp
Outdated
structPassingKind howToPassStruct; | ||
var_types structBaseType = getArgTypeForStruct(structHnd, &howToPassStruct, false, size); | ||
|
||
// Only allow structs whose storage can be held in a register to be CSE candidates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably that's just because it's not very useful to CSE something that doesn't fit in a register. Or is there a correctness issue as well?
I suppose it would make sense to CSE a struct indir if the new temporary variable can then be promoted. But that's probably quite complicated to get right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, There currently is a correctness issue as well when we perform loop hoisting we place a l-value which is a load (a GT_IND of a TYP_STRUCT ) into the loop pre-header.
This indirection doesn't get handled properly (or removed) and we end up asserting about a missing register during codegen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see bool Compiler::optHoistLoopExprsForTree()
where we currently require:
// Tree must be a suitable CSE candidate for us to be able to hoist it.
treeIsHoistable &= optIsCSEcandidate(tree);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, of course, loop hoisting will hoist the LHS of the struct assignment but that does not have a struct handle so there's no way for it to actually work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not clear to me if the above is a legality check followed by a profitability check, or if both are legality checks. Can you clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@briansull - in our discussions of this, it seemed like the fundamental issues are:
- The struct handling in the JIT relies on the fact that a struct r-value will always be used in a context where its size and/or handle are carried by an l-value (e.g. a block node on the lhs of an assign, or a block store).
- An unused hoisted rvalue doesn't get removed. This leaves the problematic handle-less size-less struct indirection.
Before taking this fix, I'd like us to understand what diffs this creates, and file an issue to address the above problems.
As we improve our struct codegen, I'd expect us to see more cases where hoisting struct values would be beneficial, as we can't (and probably never will) promote all structs that might be created redundantly (within a loop or otherwise).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An unused hoisted rvalue doesn't get removed. This leaves the problematic handle-less size-less struct indirection.
Not only that it doesn't get removed but it simply cannot be used at all. Because loop hoisting is expecting CSE to assign the hoisted tree to a variable but CSE can't do that because there's struct type information available in the r-value.
To make this work with the current loop hoisiting implementation we'd need to stop discarding struct information from r-values (converting OBJ
to TYP_STRUCT
IND
). That's doable and IMO desirable, discarding struct information is just risky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I have seen many examples where manual struct CSEs are profitable -- for example the before version of this did not cache Start
and End
as local structs. IL semantics for structs tends to lead to lots of struct copies and often many of those copies are identical.
coreclr/src/System.Private.CoreLib/shared/System/Range.cs
Lines 105 to 122 in 0d581c7
[MethodImpl(MethodImplOptions.AggressiveInlining)] | |
public OffsetAndLength GetOffsetAndLength(int length) | |
{ | |
int start; | |
Index startIndex = Start; | |
if (startIndex.IsFromEnd) | |
start = length - startIndex.Value; | |
else | |
start = startIndex.Value; | |
int end; | |
Index endIndex = End; | |
if (endIndex.IsFromEnd) | |
end = length - endIndex.Value; | |
else | |
end = endIndex.Value; | |
if ((uint)end > (uint)length || (uint)start > (uint)end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, should perhaps this fix be restricted to loop hoisting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have coded up an alternative fix that blocks the hoisting of TYP_STRUCT r-values.
src/jit/optcse.cpp
Outdated
structPassingKind howToPassStruct; | ||
var_types structBaseType = getArgTypeForStruct(structHnd, &howToPassStruct, false, size); | ||
|
||
// Only allow structs whose storage can be held in a register to be CSE candidates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is "register" referring just to integer registers and to the lower element of vector registers (for float/double)?
I imagine we may want to support CSE (and other optimizations) for full vector registers eventually...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The call to getArgTypeForStruct
should identify the register sized SIMD struct types as
(howToPassStruct == SPK_PrimitiveType)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it doesn't appear that we support passing such vector types in registers according to getArgTypeForStruct
and getPrimitiveTypeForStruct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe @CarolEidt is working on this for the Vector ABI support.
cc @dotnet/jit-contrib. @briansull is it possible to create a CoreClr repro based on desktop failures for this issue? |
I will investigate creating a CoreCLR repro test case |
Here's a repro in case you don't have one already: struct Str
{
float a, b, c, d, e, f;
}
class Cls
{
public Str s;
}
[MethodImpl(MethodImplOptions.NoInlining)]
static void Sink<T>(ref T t)
{
}
static void Main() => new Program().Test(new Cls());
[MethodImpl(MethodImplOptions.NoInlining)]
void Test(Cls c)
{
Str l1 = default;
Str l2 = default;
Str l3 = default;
for (int i = 0; i < 10; i++)
{
l1 = c.s;
l2 = c.s;
l3 = c.s;
}
Sink(ref l1);
Sink(ref l2);
Sink(ref l3);
} asserts:
|
Thank you @mikedn |
Test case added |
PTAL @dotnet/jit-contrib. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this cause any diffs?
src/jit/optcse.cpp
Outdated
structPassingKind howToPassStruct; | ||
var_types structBaseType = getArgTypeForStruct(structHnd, &howToPassStruct, false, size); | ||
|
||
// Only allow structs whose storage can be held in a register to be CSE candidates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not clear to me if the above is a legality check followed by a profitability check, or if both are legality checks. Can you clarify?
@dotnet-bot Test Windows_NT arm Cross Checked Innerloop Build and Test |
On the desktop there are no reported Asm Diffs in the frameworks. |
I have coded up an alternative fix that blocks the hoisting of TYP_STRUCT r-values. Working on diffs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - with one comment suggestion
src/jit/optimizer.cpp
Outdated
{ | ||
GenTreeCall* call = tree->AsCall(); | ||
if (call->gtCallType != CT_HELPER) | ||
// We cannot hoist an r-value of TYP_STRUCT, as it is illegal to do so |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would reword this comment as "We cannot hoist an r-value of TYP_STRUCT, as they currently do not carry full descriptors of the struct type" or something to that effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Added test case GitHub_23739.cs
Rebased and squashed commits No Asm diffs |
Block the hoisting of TYP_STRUCT rvalues in loop hoisting Commit migrated from dotnet/coreclr@cd41863
PR #22255 introduced test regressions:
Assertion failed 'tree->gtHasReg()' in 'Program:Test(ref):this' (IL size 80)