Add unreachable propagation mir optimization pass #66329
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @cramertj (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
r? @oli-obk |
Nice first contribution! 🎉
    for block in unreachable_blocks {
        body.basic_blocks_mut()[block]
            .terminator.as_mut().unwrap().kind = TerminatorKind::Unreachable;
You can use BasicBlockData::terminator_mut() here to avoid the .as_mut().unwrap()
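To make the suggestion concrete, here is a minimal sketch using toy stand-ins for the MIR types. The struct layout, the `Return` variant, and the `mark_unreachable` helper are assumptions for illustration, not rustc's actual definitions; only the idea of an accessor that owns the unwrap comes from the review comment.

```rust
// Toy stand-ins for rustc's MIR types; only the shape matters here.
#[derive(Debug, PartialEq)]
enum TerminatorKind {
    Return,
    Unreachable,
}

struct Terminator {
    kind: TerminatorKind,
}

struct BasicBlockData {
    // `None` only while the block is still under construction (assumption).
    terminator: Option<Terminator>,
}

impl BasicBlockData {
    // One place owns the unwrap, so call sites stay short.
    fn terminator_mut(&mut self) -> &mut Terminator {
        self.terminator.as_mut().expect("invalid terminator state")
    }
}

// Hypothetical helper: rewrites every listed block to end in `Unreachable`.
fn mark_unreachable(blocks: &mut [BasicBlockData], unreachable_blocks: &[usize]) {
    for &block in unreachable_blocks {
        blocks[block].terminator_mut().kind = TerminatorKind::Unreachable;
    }
}
```

With the accessor in place, the loop body is a single field assignment and the panic message lives in one spot.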
    } else {
        let mut bb_successors = terminator.successors().peekable();

        if bb_successors.peek().is_some() {
Actually, why this condition? Definitionally, a terminator which cannot end up anywhere means that it is unreachable.
Do you mean the else branch and that it is redundant as we have the successors check?
Alas, all() returns true for an empty iterator, so we added a check that it wasn't empty first. Is there a cuter way to say all_but_not_empty()?
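For readers unfamiliar with the behaviour being discussed: `Iterator::all` is vacuously true on an empty iterator. A minimal sketch of the guard follows; `all_but_not_empty` is the hypothetical name floated in the comment above, not a standard library method.

```rust
// True only when there is at least one item and every item satisfies `pred`.
fn all_but_not_empty<I, F>(iter: I, pred: F) -> bool
where
    I: IntoIterator,
    F: FnMut(I::Item) -> bool,
{
    let mut iter = iter.into_iter().peekable();
    // `all` alone would return true for an empty iterator (vacuous truth),
    // so peek first to make sure there is something to check.
    iter.peek().is_some() && iter.all(pred)
}
```

This is the same `peekable()` plus `peek().is_some()` pattern as in the diff, just packaged as a function.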
I mean that the unwritten else-branch here could do the same as the if-branch, and so the if bb_successors.peek().is_some() is unnecessary.
@gilescope But why do we want to differentiate between the empty and non-empty case? By definition, a branch is only reachable if it has any reachable successors.
Never mind; this is wrong since this would transform e.g. Abort to Unreachable (there is a successor of sorts in this case, it's just not encoded in MIR...). Can you add a comment explaining why the check exists?
I don't believe this optimization is sound as currently implemented because calls can diverge while having only unreachable successors. As an example, consider this program:

    fn loop_forever() { loop {} }
    pub enum Empty {}
    pub unsafe fn foo(x: bool, bomb: *const Empty) {
        if x {
            loop_forever()
        }
        match *bomb {}
    }

Leaving aside landing pads that may exist at the point in the pipeline where the pass currently runs (more about that later), the MIR CFG of
As I understand it, the pass will mark all three basic blocks as unreachable (RPO is bb2, bb1, bb0 and when visited in that order bb1 and bb0 have only unreachable successors). But when
Now, I see that the pass has been placed before
But in any case, it's very fragile to rely on pass ordering to such an extent, especially if it's not documented anywhere. In addition, there may be statements that can diverge (inline asm is currently an example, though unstable) and similar issues apply there.
So I think this pass needs some analysis to tell whether a basic block is known to reach its successors for sure, and it should be written to be conservatively correct, i.e., err on the side of falsely concluding that a basic block may diverge before reaching its successor. This may be a candidate for a generally useful MIR analysis, but it can also start as a one-off thing local to this pass.
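The concern can be sketched on a toy CFG. Everything below is an illustration, not the pass from this PR: the `Block` struct, the `may_diverge` flag, and the three-block shape of `foo` (bb0 branches on `x`, bb1 calls `loop_forever`, bb2 is unreachable) are assumptions reconstructed from the description above.

```rust
use std::collections::HashSet;

// Toy CFG block: not rustc's MIR, just enough to show the propagation rule.
struct Block {
    successors: Vec<usize>,
    // Already ends in an `Unreachable` terminator.
    is_unreachable: bool,
    // Holds a call or statement that might never return (hypothetical flag).
    may_diverge: bool,
}

// Returns the set of blocks that would be rewritten to `Unreachable`.
// With `conservative` set, a block that may diverge before reaching its
// successors is never marked, whatever its successors look like.
fn propagate(blocks: &[Block], conservative: bool) -> HashSet<usize> {
    let mut unreachable: HashSet<usize> = blocks
        .iter()
        .enumerate()
        .filter(|(_, b)| b.is_unreachable)
        .map(|(i, _)| i)
        .collect();
    // Iterate to a fixed point so the visiting order does not matter.
    let mut changed = true;
    while changed {
        changed = false;
        for (i, b) in blocks.iter().enumerate() {
            // Blocks without successors (e.g. `Abort`) are left alone.
            if unreachable.contains(&i) || b.successors.is_empty() {
                continue;
            }
            if conservative && b.may_diverge {
                continue;
            }
            if b.successors.iter().all(|s| unreachable.contains(s)) {
                unreachable.insert(i);
                changed = true;
            }
        }
    }
    unreachable
}

// Assumed CFG of `foo` from the example program.
fn foo_cfg() -> Vec<Block> {
    vec![
        // bb0: branch on `x` to bb1 or bb2
        Block { successors: vec![1, 2], is_unreachable: false, may_diverge: false },
        // bb1: call `loop_forever()`, then continue to bb2
        Block { successors: vec![2], is_unreachable: false, may_diverge: true },
        // bb2: `match *bomb {}` on an empty enum
        Block { successors: vec![], is_unreachable: true, may_diverge: false },
    ]
}
```

On this CFG, `propagate(&foo_cfg(), false)` marks all three blocks, which would turn `foo(true, ..)` from an infinite loop into undefined behaviour. With `conservative` set to `true`, only bb2 stays marked, because bb1 may never reach its successor and bb0 can still reach bb1.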
We should definitely add a mir-opt test ensuring this isn't mis-compiled.
Ugh; that seems ungreat. From what I can tell, inline asm is the only statement that can affect control flow -- there doesn't seem to be anything else. (Why is It seems to me that we would be on the safe side by limiting this optimization to |
Note that inline asm can't affect control flow of the surrounding program in the typical sense: if any MIR instruction (statement or terminator) is executed after an inline asm, it is the one that comes next in the same basic block. Inline asm can only diverge by "never finishing" (e.g. by looping internally or terminating the program), not e.g. by returning, unwinding, or jumping to a different MIR instruction. It also isn't a natural stopping point for basic blocks the way e.g.
Inline asm isn't special in this respect, either. There are various operations that might "never finish" but could plausibly be statements (e.g. calls that can't unwind, a variant of
The only way to justify making these kinds of statements into terminators is if we decide that MIR basic blocks should not only have the classical properties of basic blocks, but also guarantee that the terminator is executed (absent external termination of the program, of course) whenever the basic block is entered. This seems like a strange property: it's annoying to maintain (you have to convert things which could otherwise be statements into terminators and eat the cost of the extra basic blocks) and it's only of limited use to a few analyses which can also be written very reasonably in other ways.
Also at RustFest @oli-obk mentored me through a related mir optimisation pass which replaced The only reason I can potentially think of for keeping the passes separate is if we always want to replace the The WIP branch for that pass can be found here. To integrate it into this PR one could do something like the following instead of the existing
The removal of the codegen for the unreachable intrinsic (in |
We could be conservative if we found a block containing inline asm. But diverging statements sounds suspiciously like trying to solve the halting problem... Can we enumerate the problematic statement types, or would that be too risky?
It's not hard at all to make a useful classification of statements and terminators into "can possibly diverge" vs "definitely can't diverge" (in contrast, all the classical results of computability theory concern decision problems like "definitely diverges" vs "definitely halts"). |
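A minimal sketch of such a classification, under loud assumptions: the `Instr` variants below are a simplified, hypothetical subset and not rustc's real statement and terminator kinds, and which kinds count as harmless is illustrative only. The point is the shape of the analysis: it answers "can possibly diverge" versus "definitely can't diverge", and anything uncertain goes in the first bucket.

```rust
// Hypothetical, simplified instruction kinds; rustc's real MIR has more.
enum Instr {
    Assign,      // assignment of a pure rvalue (assumed unable to diverge)
    StorageLive, // bookkeeping only
    InlineAsm,   // may loop or terminate the program internally
    Call,        // the callee may never return
    Goto,
    SwitchInt,
}

#[derive(Debug, PartialEq)]
enum Divergence {
    Never,
    Possible,
}

fn classify(instr: &Instr) -> Divergence {
    match instr {
        Instr::Assign | Instr::StorageLive | Instr::Goto | Instr::SwitchInt => {
            Divergence::Never
        }
        // Conservative bucket: not proven harmless, so assume it may diverge.
        Instr::InlineAsm | Instr::Call => Divergence::Possible,
    }
}

// A block is only known to reach its successors if nothing in it may diverge.
fn block_may_diverge(instrs: &[Instr]) -> bool {
    instrs.iter().any(|i| classify(i) == Divergence::Possible)
}
```

Because the `match` is exhaustive with no wildcard arm, adding a new instruction kind forces a decision about which bucket it belongs in, which keeps the analysis from silently becoming unsound.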
Great - I think one of the other optimisations @oli-obk was discussing was classifying statements that were definitely pure r-values so that we could then remove those statements (I assume following an unreachable?). Anything that might diverge would not be a pure r-value? |
Ping from triage |
Force-pushed from 766ca3d to 0300429.
@JohnCSimon working on it. Last build was started by accident. |
ping from triage @ktrianta any updates on this? no hurry though since it's the holiday season. |
@ktrianta let me know if you want a hand. Wasn't quite sure how one helps in a PR - does one fork your fork and propose a PR (that you could choose to accept into your PR)? |
Yes, that is the way to help with a PR. Just make sure that you create a new branch from the branch of this PR and propose to merge your branch into the branch of this PR, and not master. |
Force-pushed from 0300429 to 3d583cf.
💔 Test failed - checks-azure |
looks like spurious network errors and crates.io 500/503 errors @bors retry |
⌛ Testing commit 72710d6 with merge 3fe5b3728515aa7232c1198503606092977dad60... |
💔 Test failed - checks-azure |
@bors retry |
☀️ Test successful - checks-azure |
@oli-obk suggested we create a MIR pass that optimizes away basic blocks that lead only to basic blocks with terminator kind unreachable. This is a first take on this, which we started with @gilescope at RustFest Impl Days.
The test currently fails when the compiled program runs (undefined behaviour). Is there a way to avoid running the compiled program?