-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Annotate handle indirections with GTF_IND_INVARIANT and add validation to fgDebugCheckFlags #42021
Conversation
5df3a8c
to
61dcb3c
Compare
|
@dotnet/jit-contrib PTAL |
Nice improvement. Non-crossgen diffs would be somewhat more relevant for perf... can you share those? Does the allocator somehow know that these values don't need to be spilled? |
Fixes #41626? |
These indirection are now correctly marked as invariant, so we know that their values can't be changed and we give the same value number and we create CSE's for the multiple reads. (when profitable) I haven't examined the register allocator code, but it doesn't currently access the GTF_IND_INVARIANT flag. |
@sandreenko Yes This fixes #41626 and adds additional checking to insure that we always set these flags correctly going forwards. |
Sample Regression: |
@AndyAyersMS |
Related issue: |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks good but I have some questions I'd like you to address.
src/coreclr/src/jit/gentree.h
Outdated
#define GTF_ICON_PSTR_HDL 0x70000000 // GT_CNS_INT -- constant is a ptr to a string handle | ||
#define GTF_ICON_PTR_HDL 0x80000000 // GT_CNS_INT -- constant is a ldptr handle | ||
#define GTF_ICON_CONST_PTR 0x70000000 // GT_CNS_INT -- constant is a pointer to immutable data, (e.g. IAT_PPVALUE) | ||
#define GTF_ICON_PTR_GLOBAL 0x80000000 // GT_CNS_INT -- constant is a pointer to mutable data from the VM state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: always put the _PTR
suffix last?
#define GTF_ICON_PTR_GLOBAL 0x80000000 // GT_CNS_INT -- constant is a pointer to mutable data from the VM state | |
#define GTF_ICON_GLOBAL_PTR 0x80000000 // GT_CNS_INT -- constant is a pointer to mutable data from the VM state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I will make those changes
src/coreclr/src/jit/importer.cpp
Outdated
@@ -7353,7 +7355,8 @@ GenTree* Compiler::impImportStaticFieldAccess(CORINFO_RESOLVED_TOKEN* pResolvedT | |||
|
|||
var_types handleTyp = (var_types)(varTypeIsGC(lclTyp) ? TYP_BYREF : TYP_I_IMPL); | |||
op1 = gtNewOperNode(GT_IND, handleTyp, op1); | |||
op1->gtFlags |= GTF_IND_INVARIANT | GTF_IND_NONFAULTING; | |||
op1->gtFlags |= GTF_IND_NONFAULTING; | |||
noway_assert(!"Unreachable on CoreCLR"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is unreachable why not just remove it all together? We have already broken the ability to port the jit back to desktop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this code path is only good for the desktop.
Instead it seems that the CoreCLR VM doesn't use this code path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add an assert and remove the unreachable block,
I found another section of code that asserted for the unreachable case.
src/coreclr/src/jit/morph.cpp
Outdated
op1->gtFlags |= GTF_IND_INVARIANT; | ||
op1->gtFlags |= GTF_IND_NONFAULTING; | ||
|
||
assert(!"Unreached GTF_IND_INVARIANT"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto for this, if it's now unreachable, why not remove it all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add an assert and remove the unreachable block,
@@ -4488,10 +4495,20 @@ GenTree* Lowering::LowerNonvirtPinvokeCall(GenTreeCall* call) | |||
break; | |||
|
|||
case IAT_PPVALUE: | |||
// ToDo: Expanding an IAT_PPVALUE here, loses the opportunity | |||
// to Hoist/CSE the first indirection as it is an invariant load |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you going to open an issue for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would speed up PInkvoke calls that occur inside of a loop in a crossgen-ed assembly.
I will investigate how often this condition actually occurs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following two methods in System.Private.Corelib have PInvoke calls inside of a loop:
Top method regressions (bytes):
12 ( 3.74% of base) : System.Private.CoreLib.dasm - System.GC:GetTotalMemory(bool):long
4 ( 0.35% of base) : System.Private.CoreLib.dasm - System.RuntimeType:GetInterfaceMap(System.Type):System.Reflection.InterfaceMapping:this
{ | ||
// Note that eGetCPString isn't currently implemented on Linux/ARM | ||
// Note that eeGetCPString isn't currently implemented on Linux/ARM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know why it's not implemented? Would it be simple to fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how useful this EE method actually is
In my usage we rarely end up printing the actual string.
Currently it can't ever be printed at crossgen time, only at JIT time can we print the string.
Prior to my change we would try to print every string as the empty string
We get a null terminated string, and then print it as as ""
… are embedding a pointer to mutable data from the VM state
…stant is a pointer to immutable data, (e.g. IAT_PPVALUE)
…her than class, method or field.
… offset In gtNewStringLiteralNode - Added assert for non-CoreRT case of a IAT_VALUE of a TYP_REF
… exits as this can currently significantly limit our hoisting optimization.
…ctCall and comment in Lowering::LowerNonvirtPinvokeCall where we currently allow this late expansion.
…f GTF_IND_INVARIANT Disable the GTF_IND_INVARIANT for the cases that currentlt do not set it correctly.
…ugCheckFlags (disabled) Minor changes with GTF_IND_NONFAULTING Added assert(!"Check this") Add check for GTF_ICON_PINVKI_HDL // Eventually remove, as this is immutable
Added assert(!"Unreached GTF_IND_INVARIANT") in fgMorphField Fixed IAT_PPVALUe and modified IAT_PVALUE cases in fgMorphLeaf for GT_FTN_ADDR
This was previously #ifdef out ran jit-format
…LUE case Removed #if 0 code
…value Remove the unreachable block
…rectly, set GTF_GLOB_REF for all class static indirections Check that all class static indirections have GTF_GLOB_REF set Changed two asserts to be noway_asserts Changed the indirection that points to the array initialization blob to be a GTF_ICON_CONST_PTR instead of a GTF_ICON_STATIC_HDL. Changed the indirection that fetches the field offset value for ReadyToRun to be a GTF_ICON_CONST_PTR instead of a GTF_ICON_FIELD_HDL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice change, LGTM.
Agree with Andy's review, I do not like assert(!"error message")
blocks because it is not clear when we will be able to delete them in the future if we do not create an issue or TODO for them. So they are likely to be there forever once introduced.
//------------------------------------------------------------------------------ | ||
// fgDebugCheckDispFlags : Display the flags that we are checking | ||
// | ||
// Arguments: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the header was not finished.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update this
// Expanding an IAT_PPVALUE here, will lose the opportunity | ||
// to Hoist/CSE the first indirection as it is an invariant load | ||
// | ||
assert(!"IAT_PPVALUE case in LowerDirectCall"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: How do we usually check such JIT expectations about VM return values? The current VM implementation (CEEInfo::getFunctionEntryPoint
, ZapInfo::getFunctionEntryPoint
) does not use IAT_PPVALUE
, would it make sense to assert that pResult->accessType != IAT_PPVALUE
in the VM code before we exit these functions and change this block to a unreached()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe that there is a hard and fast contract in this area.
The JIT is free to implement all possible variants and the VM is free to only utilize one or two and never use the third choice. When I noticed that waiting all the way until lower to materialize these indirections would not be optimal for codegen, I added the assert to find out if it was being used. It turns out in one case it was used and in another case it wasn't.
I don't think that this deserves an assert in the VM for this behavior. I like to reserve asserts for things that are actually incorrect in some semantic way. In any case if someone changes the VM and now asks for a double indirection we will get an assert in the JIT which is fine, because that is the component that has an issue with the double indirection case.
424a428
to
65bcaea
Compare
Rename of GTF_ICON_GLOBAL_PTR) for emitarm64.ccp
fieldMayOverlap ? FieldSeqStore::NotAField() : GetFieldSeqStore()->CreateSingleton(symHnd); | ||
tree->AsClsVar()->gtFieldSeq = fieldSeq; | ||
} | ||
// We should always be able to access this static field address directly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also delete ppIndirection
argument of getFieldAddress
method on JIT/EE interface if you are removing the JIT's ability to handle it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that we want to permanently want to remove this as a possibility from the JIt to EE interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that we want to permanently want to remove this as a possibility from the JIt to EE interface.
I agree with that, so why delete the code to handle it?
I am not convinced that this path is actually fully unreachable today. I won't be surprised if there is a way to hit this path in some cases under crossgen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two JIT team members that asked for this code to be removed instead of having an assert that says it is unreached.
This change allows us to CSE many additional indirections involving class, method and other handles.
It is a performance improvement