Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix exception handling in the prestub worker - take 2 #112666

Merged
merged 5 commits into from
Feb 20, 2025

Conversation

janvorli
Copy link
Member

There is a bug in the new exception handling when ThePreStub is called
from CallDescrWorkerInternal and the exception is propagated through
that. One of such cases is when a managed class is being initialized
during JITting and the constructor is in an assembly that's not found.
The bug results in skipping all the native frames upto a managed frame
that called that native chain that lead to the exception. In the
specific case I've mentioned, a lock in the native code is left in
locked state. That later leads to a hang. This was case was observed
with Roslyn invoking an analyzer where one of the dependencies was
missing.

The fix is to ensure that when ThePreStub is called by
CallDescrWorkerInternal, the exception is not caught in the
PreStubWorker. It is left flowing into the native calling code instead.

The same treatment is applied to ExternalMethodFixupWorker and
VSD_ResolveWorker too.

On Windows, we also need to prevent the ProcessCLRException invocation
to call into the managed exception handling code.

There is a bug in the new exception handling when ThePreStub is called
from CallDescrWorkerInternal and the exception is propagated through
that. One of such cases is when a managed class is being initialized
during JITting and the constructor is in an assembly that's not found.
The bug results in skipping all the native frames upto a managed frame
that called that native chain that lead to the exception. In the
specific case I've mentioned, a lock in the native code is left in
locked state. That later leads to a hang. This was case was observed
with Roslyn invoking an analyzer where one of the dependencies was
missing.

The fix is to ensure that when ThePreStub is called by
CallDescrWorkerInternal, the exception is not caught in the
PreStubWorker. It is left flowing into the native calling code instead.

The same treatment is applied to ExternalMethodFixupWorker and
VSD_ResolveWorker too.

On Windows, we also need to prevent the ProcessCLRException invocation
to call into the managed exception handling code.
I was hitting intermittent crashes in the unhandled exception test
with GC stress C enabled due to this when GetSSP tried to access the
SSP in the extended part of the context.
@janvorli janvorli requested a review from jkotas February 18, 2025 22:09
@janvorli janvorli self-assigned this Feb 18, 2025
@Copilot Copilot bot review requested due to automatic review settings February 18, 2025 22:09
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 2 out of 12 changed files in this pull request and generated 1 comment.

Files not reviewed (10)
  • src/coreclr/vm/excep.cpp: Language not supported
  • src/coreclr/vm/exceptionhandling.cpp: Language not supported
  • src/coreclr/vm/exceptmacros.h: Language not supported
  • src/coreclr/vm/prestub.cpp: Language not supported
  • src/coreclr/vm/virtualcallstub.cpp: Language not supported
  • src/tests/Regressions/coreclr/GitHub_76531/dependencytodelete.csproj: Language not supported
  • src/tests/Regressions/coreclr/GitHub_76531/tailcallinvoker.il: Language not supported
  • src/tests/Regressions/coreclr/GitHub_76531/tailcallinvoker.ilproj: Language not supported
  • src/tests/Regressions/coreclr/GitHub_76531/test76531.csproj: Language not supported
  • src/tests/issues.targets: Language not supported
Comments suppressed due to low confidence (1)

src/tests/Regressions/coreclr/GitHub_76531/test76531.cs:25

  • [nitpick] The error message 'My exception' is unclear. Consider providing a more descriptive message.
=> throw new Exception("My exception");

{
if ((pExceptionRecord->ExceptionFlags & EXCEPTION_UNWINDING))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas this is the only change w.r.t. the previous PR that was reverted recently. This popping was causing the GC to get lost.

@jkotas
Copy link
Member

jkotas commented Feb 19, 2025

The test failure looks related:

Unhandled exception. System.OverflowException: Arithmetic operation resulted in an overflow.
   at Microsoft.Win32.OAVariantLib.OAFailed(Int32 hr)
   at Microsoft.Win32.OAVariantLib.ChangeType(Object source, Type targetClass, Int16 options, CultureInfo culture)
   at System.OleAutBinder.ChangeType(Object value, Type type, CultureInfo cultureInfo)

* The previous set of changes removed popping of ExInfos too. That's not
  correct though. But it should be done in a different place than the
  ProcessCLRExceptionNew.
* There was a problem (even before the fix) that an exception caught in
  DispatchInfo::InvokeMember was reported (via console log and to the
  debugger) as unhandled.

This change also adds a new flavor of the unhandled exception test that
throws an unhandled exception on a secondary thread to exercise the
related code path.
@janvorli
Copy link
Member Author

@jkotas I have fixed the test failure. There were actually two issues. One was that the exception was reported as unhandled even though it was actually then handled in native runtime code in the DispatchInfo::InvokeMember. The other was that the ExInfo popping was still needed and that lead to the test hang (in an interesting way - the next ExInfo that was pushed by the next portion of the test had accidentally the same address, so the presence of previous stale ExInfo link at the same address created an infinite loop in the m_pPrevNestedInfo chain.

@janvorli janvorli merged commit 2cb402c into dotnet:main Feb 20, 2025
97 of 100 checks passed
@janvorli janvorli deleted the fix-exception-from-class-init-hang-3 branch February 20, 2025 23:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants