NativeAOT fast fails process from hw exception after winrt loaded by external code #359

Closed
emily33901 opened this issue Nov 16, 2020 · 4 comments
Labels
area-NativeAOT-coreclr .NET runtime optimized for ahead of time compilation

Comments

@emily33901

Hi,

I managed to re-encounter the issue that I mentioned in #182, where WinRT is loaded by libcef, which loads twinapi, which loads combase. I'm not entirely sure of the reason for this. The thread in question is at points hijacked by some NativeAOT code, but at the point where the crash occurs there is no NativeAOT code in the callstack.

Exception thrown at 0x00007FFD24A73E49 (KernelBase.dll) in <process>.exe: WinRT originate error - 0x80070005 : 'Access is denied.'.
Unhandled exception at 0x00007FFD24B5B65C (KernelBase.dll) in <process>.exe: 0x00001001.

This HW exception is then picked up by RhpThrowHwEx, which goes on to fast fail the process.

The top of the callstack is as follows:

The locals of RhThrowHwEx are as follows:

I suspect that it's because it's attempting to load WinRT when that's not allowed by NativeAOT (once again, that's not done by me but by libcef, which is used by the process my DLL resides in).

errco.de suggests that 0x00001001 is a stack overflow, but I can't believe that is the case since there are only 50 or so functions on the stack...
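For reference, the two codes in the debugger output above can be decoded by hand. The sketch below splits 0x80070005 into its HRESULT fields and compares 0x00001001 against two stack-overflow constants from the Windows SDK headers. The helper name `decode_hresult` is made up for illustration, and the comparison is only one possible reason a lookup site might associate this value with a stack overflow, not a confirmed diagnosis.

```python
# Constants as defined in the Windows SDK headers (winerror.h / ntstatus.h).
FACILITY_WIN32 = 7
ERROR_ACCESS_DENIED = 5              # Win32 error code
ERROR_STACK_OVERFLOW = 1001          # Win32 error code: decimal 1001, i.e. 0x3E9
STATUS_STACK_OVERFLOW = 0xC00000FD   # NTSTATUS raised on an actual stack overflow


def decode_hresult(hr):
    """Split a 32-bit HRESULT into (failed, facility, code). Hypothetical helper."""
    failed = bool((hr >> 31) & 1)    # severity bit
    facility = (hr >> 16) & 0x7FF    # 11-bit facility field
    code = hr & 0xFFFF               # low 16 bits
    return failed, facility, code


# The "WinRT originate error" value: a failure HRESULT wrapping a Win32 error.
print(decode_hresult(0x80070005))

# The unhandled exception code, compared against both stack-overflow constants.
observed = 0x00001001
print(observed, observed == ERROR_STACK_OVERFLOW, observed == STATUS_STACK_OVERFLOW)
```

The first value decodes to a failure in FACILITY_WIN32 with code 5 (access denied), which matches the message text. The second value is 4097 in decimal, so it equals neither the Win32 error 1001 nor the NTSTATUS that Windows raises for a real stack overflow.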

Thanks,
Em

@jkotas
Member

jkotas commented Nov 16, 2020

Here are a few things to try:

  • You should be able to get symbols for the Windows .dlls. In Visual Studio, go to Debug/Options/Debugging/Symbols and check "Microsoft Symbol Servers". Right-click on the lines in the stack trace to download the symbols if they do not download automatically. Having a stack trace with symbols will give us a better idea of what's going on.
  • Could you please go to the Disassembly window and look at what the assembly code looks like in the place that got us to RhpThrowHwEx (it is 7ffd24a73e49 in the screenshot above)?
  • The only way to get to RhpThrowHwEx is via RhpVectoredExceptionHandler. It looks like we somehow found a code manager for code that does not belong to the runtime (FindCodeManagerByAddress returned a valid code manager). Could you please check the ranges inside g_pTheRuntimeInstance->m_CodeManagerList to see whether they include the IP address where the fault occurred (it is 7ffd24a73e49)?

One possible explanation of the crash is an unmanaged heap corruption, caused by unrelated code, that is corrupting g_pTheRuntimeInstance->m_CodeManagerList. You can try running your application with pageheap to see whether it finds any problems. https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/gflags-and-pageheap

@emily33901
Author

Hi,

Thanks for the advice. Here is the stack trace with the symbols I was able to get:

From this it's certainly obvious that WinRT is raising an error that is then being caught by NativeAOT. To that extent, I can say that NativeAOT has certainly had some interaction with the thread in general (running code on it and similar).

The assembly that leads to RhpThrowHwEx is from RaiseException in KernelBase.dll

Which, as I said above, would explain why it might lead to RhpThrowHwEx.

I can't get at g_pTheRuntimeInstance with my current setup, so I'll try to do some work to get it included in my debug information.

I will retry at a later point with gflags, because that is certainly something I need to check anyway. Do you know whether the application verifier flags will work fine in this case?

Thanks

@jkotas
Member

jkotas commented Nov 16, 2020

> From this it's certainly obvious that WinRT is raising an error that is then being caught by NativeAOT

Yes, NativeAOT subscribes to all exceptions. It should filter out the ones that do not belong to it, but that is not happening in this case.

I suspect that the problem will be that the ranges inside g_pTheRuntimeInstance->m_CodeManagerList are wrong or corrupted for some reason.
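The filtering described above can be pictured as an address-range lookup. The following is a minimal sketch of that idea, not the runtime's actual code: `CodeRange`, `find_code_manager`, and `should_handle` are hypothetical names, and the module addresses are made up. Only the faulting IP is taken from this thread. It shows how a single corrupted size field would be enough to make a fault inside KernelBase.dll look like it belongs to the runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeRange:
    """Address range [start, start + size) owned by one code manager (hypothetical type)."""
    start: int
    size: int

    def contains(self, ip):
        return self.start <= ip < self.start + self.size


def find_code_manager(ranges, ip):
    """Return the first range that contains ip, or None if the address is foreign."""
    for code_range in ranges:
        if code_range.contains(ip):
            return code_range
    return None


def should_handle(ranges, ip):
    # A fault is treated as the runtime's own only when its IP falls inside a
    # registered range; otherwise it should be left for the next handler.
    return find_code_manager(ranges, ip) is not None


fault_ip = 0x00007FFD24A73E49  # faulting address reported in this issue

# Made-up example: one 2 MB module registered with a sane range.
sane = [CodeRange(start=0x00007FF600001000, size=0x200000)]

# Made-up example: the same entry after its size field has been overwritten.
corrupted = [CodeRange(start=0x00007FF600001000, size=0x7FFFFFFFFFFF)]

print(should_handle(sane, fault_ip))
print(should_handle(corrupted, fault_ip))
```

With the sane list the fault is ignored. With the corrupted entry the same address is claimed, which is why inspecting the actual ranges in m_CodeManagerList is a useful next step.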

> I can't get at g_pTheRuntimeInstance with my current setup

You may need to reference it as yourdllname!g_pTheRuntimeInstance.

> Do you know whether the application verifier flags will work fine in this case?

Yes, you should be able to turn on heap verification using Application Verifier. The gflags.exe tool that comes with WinDbg can be used to turn on pageheap as well.

@jkotas jkotas added the area-NativeAOT-coreclr .NET runtime optimized for ahead of time compilation label Nov 18, 2020
@jkotas
Member

jkotas commented Dec 14, 2020

Closing inactive issues.
