This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
ARM64: Enable End-To-End ReadyToRun (R2R) Crossgen #5020
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@dotnet-bot test Windows_NT arm64 Checked |
@dotnet/arm64-contrib @dotnet/jit-contrib PTAL. |
changes in zapimport look good to me |
Looks Good |
@@ -2870,6 +2871,15 @@ struct GenTreeCall final : public GenTree | |||
bool IsSameThis() { return (gtCallMoreFlags & GTF_CALL_M_NONVIRT_SAME_THIS) != 0; } | |||
bool IsDelegateInvoke(){ return (gtCallMoreFlags & GTF_CALL_M_DELEGATE_INV) != 0; } | |||
bool IsVirtualStubRelativeIndir() { return (gtCallMoreFlags & GTF_CALL_M_VIRTSTUB_REL_INDIRECT) != 0; } | |||
#ifdef FEATURE_READYTORUN_COMPILER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe that we want to have #ifdef's around this code.
38df707
to
c0ef1fe
Compare
Fixes https://github.com/dotnet/coreclr/issues/4649 The immediate issues was NYI on genEmitHelperCalls. The initial implementation for the missing part was enough to just crossgen System.dll. But running tests revealed various issues in crossgened binaries (R2R). Most common user/helper calls in R2R are represented as indirect calls similar to interface call using virtual stub dispatch cell -- thunk/helper needs a indirect cell address to update the final target address on the data location. `IsDelayLoadHelper` and `IsLazyHelper` belong to this case. Instead of passing such parameter, x64/x86 uses an encoding trick -- it assumes the call is dispatched like `call [addr]`. So from the return address, runtime could extract indirect cell address. Unfortunately this is not an option for arm64 (actually arm as well but I haven't fixed it in this change) where indirect call on memory is not encodable. So, I made the following changes: 1. For the call requiring that needs to pass indirect cell address, I tagged the call tree via `setR2RRelativeIndir`. Tried to be comprehensive, but I may miss something. Currently, it includes a regular call and various helpers for (virtual) load function pointer/static data access, etc. Hopely we change JIT/EE interface somehow that gives us such explicit information. 2. Use the X11 to record indirect cell address for such call tree in lower similar to VSD. 3. Fixed encodings `ZapIndirectHelperThunk`. In particular the immediate value/offset for `ldr` should be scaled down 4 times since HW will scale it 4 times. 4. Implement `genEmitHelperCalls` for indirect case. This is not the case requiring indirect cell address. This is the case we inlined the indirect helper thunk for the speed. I'm seeing the case for size opt helper call, we invoke a direct call to such thunk which actually uses x12 to dispatch the final target. Likewise, I used x12 for this expansion which seems a trash register that is not overlapped with arugments with jit helpers like writer barriers. With this change, I've tested various cases/scenraios locally. Also I've verified all tests are passed against mscorlib.ni.dll and System.ni.dll.
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
ARM64: Enable End-To-End ReadyToRun (R2R) Crossgen Commit migrated from dotnet/coreclr@6f69c6d
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes https://github.com/dotnet/coreclr/issues/4649
The immediate issues was NYI on genEmitHelperCalls. The initial
implementation for the missing part was enough to just crossgen System.dll.
But running tests revealed various issues in crossgened binaries (R2R).
Most common user/helper calls in R2R are represented as indirect calls
similar to interface call using virtual stub dispatch cell --
thunk/helper needs a indirect cell address to update the final target
address on the data location.
IsDelayLoadHelper
andIsLazyHelper
belong tothis case.
Instead of passing such parameter, x64/x86 uses an encoding trick -- it
assumes the call is dispatched like
call [addr]
. So from the returnaddress, runtime could extract indirect cell address. Unfortunately this is not
an option for arm64 (actually arm as well but I haven't fixed it in this
change) where indirect call on memory is not encodable.
So, I made the following changes:
tagged the call tree via
setR2RRelativeIndir
. Tried to be comprehensive,but I may miss something. Currently, it includes a regular call and
various helpers for (virtual) load function pointer/static data access, etc.
Hopely we change JIT/EE interface somehow that gives us such explicit information.
lower similar to VSD.
ZapIndirectHelperThunk
. In particular the immediatevalue/offset for
ldr
should be scaled down 4 times since HW will scaleit 4 times.
genEmitHelperCalls
for indirect case. This is not the case requiring indirectcell address. This is the case we inlined the indirect helper thunk for
the speed. I'm seeing the case for size opt helper call, we invoke a
direct call to such thunk which actually uses x12 to dispatch the final
target. Likewise, I used x12 for this expansion which seems a trash register that is not
overlapped with arugments with jit helpers like writer barriers.
With this change, I've tested various cases/scenraios locally.
Also I've verified all tests are passed against mscorlib.ni.dll and System.ni.dll.