Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] [CodeGen] Enable TrapUnreachable by default for all targets. #109732

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

duk-37
Copy link
Contributor

@duk-37 duk-37 commented Sep 23, 2024

Currently, several existing platforms (ARM, AArch64, WASM, potentially PPC) enable TrapUnreachable by default in various scenarios. Attempt to enable it by default for most targets.

At the moment, it is still trivial to accidentally generate either missed branches or completely empty functions through various means. This is not just an issue for the end users, but for compiler developers too, as it can make debugging some issues very painful. Existing testing by Rust developers several years ago and a forum user has shown that the binary size impact is negligible with NoTrapAfterNoreturn kept at false -- however, it is important to note that some significant (~0.2%) code size changes in Clang. As such, we compromise and default it to true. With a current build of LLVM, this patch causes roughly a 0.025% decrease in the size of the text section of Clang. My current working theory is that the explicit TRAP lets some machine passes optimize some useless MachineInstructions away. This is supported by one of the tests (X86/pr32484.ll) being partially optimized away if TrapUnreachable is set, along with block count reductions in some other tests.

It is important to note that this does not prevent optimizations around unreachable -- it simply turns unreachable instructions that exist when instruction selection is invoked into trap representations rather than skipping code generation completely. If these unreachable instructions exist, it is either after a noreturn call (assuming NoTrapAfterNoreturn has been set), in which case it's in a cold path to begin with, or it's in a buggy block of code.

I'm not quite sure what to do about architectures that don't have hardware traps and don't like external calls to abort, namely eBPF and AMDGPU. In addition, Hexagon seems to do some trickery that causes EH_RETURN to miscompile when TrapUnreachable is set. For now, we cowardly disable TrapUnreachable by default on these targets. Fixing these targets feels like something for a different patch, this one is big enough as-is.

This patch:

  • Changes the default values of TrapUnreachable and NoTrapAfterNoreturn to true

  • Ensures that some existing code sets NoTrapAfterNoreturn to preserve current behavior

  • Does some minor code cleanup to remove non-load-bearing target-specific behavior now subsumed by the defaults, and

  • Updates various tests as necessary and adds some new ones. Importantly, we disable TrapUnreachable in certain tests where generation of an empty function is necessary to test certain behavior.

Marking this as a draft for now as some tests still fail, I'm not entirely confident in all the test changes, and I'd appreciate some feedback about whether this change would be welcome in general. I don't think it's big enough to require an RFC but there are definitely some challenges around it. This PR also depends on #109560, #109730, and #109744 (or some equivalents) landing.

Closes #32380
Closes #48943
Closes #60596
Closes #60622
Closes #67732
Closes #75797
Closes #87836

Copy link

github-actions bot commented Sep 23, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff eca5949031c31fe5ff5ad7a5df07bbce13379108 5ce7a94ede6c271c586a2510f17fe7705d5ab42c --extensions cpp,h -- llvm/include/llvm/Target/TargetOptions.h llvm/lib/CodeGen/LLVMTargetMachine.cpp llvm/lib/Target/AArch64/AArch64TargetMachine.cpp llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp llvm/lib/Target/ARM/ARMTargetMachine.cpp llvm/lib/Target/BPF/BPFTargetMachine.cpp llvm/lib/Target/Hexagon/HexagonTargetMachine.cpp llvm/lib/Target/X86/X86TargetMachine.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/CodeGen/LLVMTargetMachine.cpp b/llvm/lib/CodeGen/LLVMTargetMachine.cpp
index 8fcd5594f9..744d0b36e9 100644
--- a/llvm/lib/CodeGen/LLVMTargetMachine.cpp
+++ b/llvm/lib/CodeGen/LLVMTargetMachine.cpp
@@ -33,9 +33,9 @@
 #include "llvm/Target/TargetOptions.h"
 using namespace llvm;
 
-cl::opt<bool> EnableTrapUnreachable(
-    "trap-unreachable", cl::Hidden,
-    cl::desc("Enable generating trap for unreachable"));
+cl::opt<bool>
+    EnableTrapUnreachable("trap-unreachable", cl::Hidden,
+                          cl::desc("Enable generating trap for unreachable"));
 
 cl::opt<bool> EnableNoTrapAfterNoreturn(
     "no-trap-after-noreturn", cl::Hidden,

@arsenm
Copy link
Contributor

arsenm commented Sep 24, 2024

I'm not quite sure what to do about architectures that don't have hardware traps and don't like external calls to abort, namely eBPF and AMDGPU

AMDGPU does support traps, it just doesn't involve a call to abort. llvm.trap works fine

@arsenm
Copy link
Contributor

arsenm commented Sep 24, 2024

I don't think it's big enough to require an RFC but there are definitely some challenges around it.

I think it is

@duk-37 duk-37 force-pushed the trap-unreachable-for-all branch from be690f7 to 5ce7a94 Compare September 24, 2024 11:36
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be more reasonable to change the default at -O0, and leave it as-is with optimizations enabled. If it's worth tracking down the marginal improvements you noticed, we could add an unreachable pseudo MachineInstr

Comment on lines 616 to +620
TLOF(createTLOF(getTargetTriple())) {
// FIXME: There are some scenarios where targets may not have hardware traps,
// and external calls to `abort` also fail. For now, do a blanket-disable.
if (!EnableTrapUnreachable.getNumOccurrences())
this->Options.TrapUnreachable = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Targets should not have to flip bits in this setting like this

Copy link
Contributor Author

@duk-37 duk-37 Sep 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already happening all over the place though. See PS4/PS5 in X86, WASM, AArch64, etc. That being said, it definitely isn't great.

@duk-37
Copy link
Contributor Author

duk-37 commented Sep 24, 2024

I'm not quite sure what to do about architectures that don't have hardware traps and don't like external calls to abort, namely eBPF and AMDGPU

AMDGPU does support traps, it just doesn't involve a call to abort. llvm.trap works fine

It worked in most cases, but I was running into one test-case where a call to abort was generated and ISel was panicking. I'll look a bit more into it.

@duk-37
Copy link
Contributor Author

duk-37 commented Sep 25, 2024

I'm not quite sure what to do about architectures that don't have hardware traps and don't like external calls to abort, namely eBPF and AMDGPU

AMDGPU does support traps, it just doesn't involve a call to abort. llvm.trap works fine

It worked in most cases, but I was running into one test-case where a call to abort was generated and ISel was panicking. I'll look a bit more into it.

Update: This seems to be specific to the R600 backend and only one test fails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment