Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CodeGen] Avoid generating trap instructions after exception restore intrinsics #109560

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

duk-37
Copy link
Contributor

@duk-37 duk-37 commented Sep 22, 2024

Right now, PrologEpilogInsertion uses MBB.isReturnBlock() to determine whether or not to restore CSRs. This normally works with EH_RETURN since the latter is marked as being a return instruction; however, the situation gets more complicated when trap-unreachable is involved since it's no longer the last instruction in the block. As such, with trap-unreachable enabled, PEI misses our block and does not emit a restore when necessary. This also causes a MachineVerifier failure as traps are not considered terminators.

As there are potentially other, similar issues in other passes here, we fix this by marking llvm.eh.return as noreturn and then adding an additional check in SelectionDAG and GlobalISel before emitting trap instructions when visiting unreachable.

Right now, PrologEpilogInsertion uses `MBB.isReturnBlock()` to determine whether or not to restore CSRs. This normally works with `EH_RETURN` since the latter is marked as being a return instruction; however, the situation gets more complicated when `trap-unreachable` is involved since it's no longer the last instruction in the block. As such, with `trap-unreachable` enabled, PEI misses our block and does not emit a restore when necessary. This also causes a MachineVerifier failure as traps are not considered terminators.

As there are potentially other, similar issues in other passes here, we fix this by marking `llvm.eh.return` as `noreturn` and then adding an additional check in SelectionDAG and GlobalISel before emitting trap instructions when visiting `unreachable`.
@llvmbot
Copy link
Member

llvmbot commented Sep 22, 2024

@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-x86

Author: duk (duk-37)

Changes

Right now, PrologEpilogInsertion uses MBB.isReturnBlock() to determine whether or not to restore CSRs. This normally works with EH_RETURN since the latter is marked as being a return instruction; however, the situation gets more complicated when trap-unreachable is involved since it's no longer the last instruction in the block. As such, with trap-unreachable enabled, PEI misses our block and does not emit a restore when necessary. This also causes a MachineVerifier failure as traps are not considered terminators.

As there are potentially other, similar issues in other passes here, we fix this by marking llvm.eh.return as noreturn and then adding an additional check in SelectionDAG and GlobalISel before emitting trap instructions when visiting unreachable.


Full diff: https://github.com/llvm/llvm-project/pull/109560.diff

5 Files Affected:

  • (modified) llvm/include/llvm/IR/Intrinsics.td (+2-2)
  • (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+8)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+8)
  • (added) llvm/test/CodeGen/X86/eh-trap-unreachable-x86.ll (+15)
  • (added) llvm/test/CodeGen/X86/eh-trap-unreachable-x86_64.ll (+15)
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 0a74a217a5f010..5e4ee4276aa97f 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -1405,8 +1405,8 @@ let IntrProperties = [IntrNoMem, IntrSpeculatable, IntrWillReturn] in {
 // given function it is 'const' and may be CSE'd etc.
 def int_eh_typeid_for : Intrinsic<[llvm_i32_ty], [llvm_anyptr_ty], [IntrNoMem]>;
 
-def int_eh_return_i32 : Intrinsic<[], [llvm_i32_ty, llvm_ptr_ty]>;
-def int_eh_return_i64 : Intrinsic<[], [llvm_i64_ty, llvm_ptr_ty]>;
+def int_eh_return_i32 : Intrinsic<[], [llvm_i32_ty, llvm_ptr_ty], [IntrNoReturn]>;
+def int_eh_return_i64 : Intrinsic<[], [llvm_i64_ty, llvm_ptr_ty], [IntrNoReturn]>;
 
 // eh.exceptionpointer returns the pointer to the exception caught by
 // the given `catchpad`.
diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index 8e860a1f740295..7cf6e4d55c24a9 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -3125,6 +3125,14 @@ bool IRTranslator::translateUnreachable(const User &U, MachineIRBuilder &MIRBuil
     // Do not emit an additional trap instruction.
     if (Call->isNonContinuableTrap())
       return true;
+    // Do not emit trap instructions after EH_RETURN intrinsics.
+    // This can cause problems later down the line when other machine passes
+    // attempt to use the last instruction in a BB to determine terminator behavior.
+    if (const auto *II = dyn_cast<IntrinsicInst>(Call)) {
+      const auto IID = II->getIntrinsicID();
+      if (IID == Intrinsic::eh_return_i32 || IID == Intrinsic::eh_return_i64)
+        return true;
+    }
   }
 
   MIRBuilder.buildTrap();
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 25213f587116d5..8ca1aa0278f127 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -3553,6 +3553,14 @@ void SelectionDAGBuilder::visitUnreachable(const UnreachableInst &I) {
     // Do not emit an additional trap instruction.
     if (Call->isNonContinuableTrap())
       return;
+    // Do not emit trap instructions after EH_RETURN intrinsics.
+    // This can cause problems later down the line when other machine passes
+    // attempt to use the last instruction in a BB to determine terminator behavior.
+    if (const auto *II = dyn_cast<IntrinsicInst>(Call)) {
+      const auto IID = II->getIntrinsicID();
+      if (IID == Intrinsic::eh_return_i32 || IID == Intrinsic::eh_return_i64)
+        return;
+    }
   }
 
   DAG.setRoot(DAG.getNode(ISD::TRAP, getCurSDLoc(), MVT::Other, DAG.getRoot()));
diff --git a/llvm/test/CodeGen/X86/eh-trap-unreachable-x86.ll b/llvm/test/CodeGen/X86/eh-trap-unreachable-x86.ll
new file mode 100644
index 00000000000000..1101a89f68b0b4
--- /dev/null
+++ b/llvm/test/CodeGen/X86/eh-trap-unreachable-x86.ll
@@ -0,0 +1,15 @@
+; RUN: llc < %s -mtriple=i386-unknown-unknown -trap-unreachable -no-trap-after-noreturn=false -verify-machineinstrs | FileCheck %s
+
+;; Ensure we restore caller-saved registers before EH_RETURN, even with trap-unreachable enabled.
+;; For now, we deliberately avoid generating traps altogether.
+
+; CHECK-LABEL: test32
+; CHECK: pushl
+; CHECK: popl
+; CHECK: eh_return
+; CHECK-NOT: ud2
+define void @test32(i32 %offset, ptr %handler) {
+  call void @llvm.eh.unwind.init()
+  call void @llvm.eh.return.i32(i32 %offset, ptr %handler)
+  unreachable
+}
diff --git a/llvm/test/CodeGen/X86/eh-trap-unreachable-x86_64.ll b/llvm/test/CodeGen/X86/eh-trap-unreachable-x86_64.ll
new file mode 100644
index 00000000000000..d95bd610b5622f
--- /dev/null
+++ b/llvm/test/CodeGen/X86/eh-trap-unreachable-x86_64.ll
@@ -0,0 +1,15 @@
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -trap-unreachable -no-trap-after-noreturn=false -verify-machineinstrs | FileCheck %s
+
+;; Ensure we restore caller-saved registers before EH_RETURN, even with trap-unreachable enabled.
+;; For now, we deliberately avoid generating traps altogether.
+
+; CHECK-LABEL: test64
+; CHECK: pushq
+; CHECK: popq
+; CHECK: eh_return
+; CHECK-NOT: ud2
+define void @test64(i64 %offset, ptr %handler) {
+  call void @llvm.eh.unwind.init()
+  call void @llvm.eh.return.i64(i64 %offset, ptr %handler)
+  unreachable
+}

Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 2c770675ce36402b51a320ae26f369690c138dc1 3daa4022a395ba939b26fc6f239b89a878456e27 --extensions cpp -- llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index 7cf6e4d55c..02f4d52738 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -3127,7 +3127,8 @@ bool IRTranslator::translateUnreachable(const User &U, MachineIRBuilder &MIRBuil
       return true;
     // Do not emit trap instructions after EH_RETURN intrinsics.
     // This can cause problems later down the line when other machine passes
-    // attempt to use the last instruction in a BB to determine terminator behavior.
+    // attempt to use the last instruction in a BB to determine terminator
+    // behavior.
     if (const auto *II = dyn_cast<IntrinsicInst>(Call)) {
       const auto IID = II->getIntrinsicID();
       if (IID == Intrinsic::eh_return_i32 || IID == Intrinsic::eh_return_i64)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 8ca1aa0278..ebae2517a5 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -3555,7 +3555,8 @@ void SelectionDAGBuilder::visitUnreachable(const UnreachableInst &I) {
       return;
     // Do not emit trap instructions after EH_RETURN intrinsics.
     // This can cause problems later down the line when other machine passes
-    // attempt to use the last instruction in a BB to determine terminator behavior.
+    // attempt to use the last instruction in a BB to determine terminator
+    // behavior.
     if (const auto *II = dyn_cast<IntrinsicInst>(Call)) {
       const auto IID = II->getIntrinsicID();
       if (IID == Intrinsic::eh_return_i32 || IID == Intrinsic::eh_return_i64)

@duk-37 duk-37 requested review from huntergr-arm and kovdan01 and removed request for huntergr-arm September 22, 2024 04:50
Comment on lines +1408 to +1409
def int_eh_return_i32 : Intrinsic<[], [llvm_i32_ty, llvm_ptr_ty], [IntrNoReturn]>;
def int_eh_return_i64 : Intrinsic<[], [llvm_i64_ty, llvm_ptr_ty], [IntrNoReturn]>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you still need to special case this in the code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused as to what you mean. The Clang intrinsic, __builtin_eh_return is marked as NoReturn, everything essentially treats it as noreturn due to unreachable being emitted immediately after them, and the verifier would complain if you emitted code after them (which is exactly what this PR is trying to fix).

Marking these as noreturn accurately represents their behavior and makes it so that the patch can be a bit smaller since I don't need to reorganize logic in the instruction selectors.

Copy link
Contributor

@arsenm arsenm Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IR intrinsic noreturn attribute is separate from the machine verifier issue you are seeing. You can do this as a pre-commit (which is independently observable in IR optimizations).

You're also not making use of this attribute below?

Comment on lines +3131 to +3135
if (const auto *II = dyn_cast<IntrinsicInst>(Call)) {
const auto IID = II->getIntrinsicID();
if (IID == Intrinsic::eh_return_i32 || IID == Intrinsic::eh_return_i64)
return true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These intrinsics are missing from the langref and I don't know what's expected of them. Should these really be terminators in the first place? Can the insert point just be adjusted?

use the last instruction in a BB to determine terminator behavior.

This comment doesn't exactly make sense

Copy link
Contributor Author

@duk-37 duk-37 Sep 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these really be terminators in the first place? Can the insert point just be adjusted?

Yes and no, respectively. They're old exception-handling intrinsics that lower down to ISD::EH_RETURN through the SelectionDAG, which is then lowered to target-specific EH_RETURN opcodes across different architectures. These opcodes are all marked as "return instructions" in various .td files.

The problem becomes that these are not actually return instructions, they're intrinsics. When they're lowered, we visit an unreachable instruction after the EH_RETURN opcode is emitted through the intrinsic, meaning that if trap-unreachable is enabled we emit a TRAP instruction after a terminator. You can see this behavior here: https://godbolt.org/z/5br41sWqf

We therefore also get blocks where MachineBasicBlock::isReturnBlock() returns false when it would have returned true before, as the function checks the last instruction in a block to figure out whether it's a "return block" or not. This is used, for example, in PrologEpilogInsertion to determine where to restore registers.

I'm not quite sure if this is a miscompile, but returning false in isReturnBlock is definitely not what we want to happen here. Also, the restore behavior is explicitly checked in tests, so I'm not going to bank on that. My goal here is essentially to just preserve the current behavior (and not crash in the machine verifier) when trap-unreachable is set to true.

Looking at this again, I'm not sure these intrinsics are even supported in GlobalISel on any architecture, but it would be good to keep this consistent between instruction selection backends.

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be another case where we should move intrinsics that are really terminators into callbr or be first class terminators

// Do not emit trap instructions after EH_RETURN intrinsics.
// This can cause problems later down the line when other machine passes
// attempt to use the last instruction in a BB to determine terminator behavior.
if (const auto *II = dyn_cast<IntrinsicInst>(Call)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be handled in isNonContinuableTrap?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants