Prevent unwinding past FFI boundaries #46833
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nikomatsakis (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information.
☔ The latest upstream changes (presumably #45525) made this pull request unmergeable. Please resolve the merge conflicts.
src/librustc_mir/transform/inline.rs
Outdated
@@ -806,6 +806,7 @@ impl<'a, 'tcx> MutVisitor<'tcx> for Integrator<'a, 'tcx> {
*kind = TerminatorKind::Goto { target: tgt }
}
}
TerminatorKind::Abort => { unimplemented!("Not sure what to do here?!"); }
This should be a no-op - there are no targets to update.
aka the same as TerminatorKind::Unreachable
Thanks! Will fix for next version.
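To illustrate why the reviewers call this arm a no-op, here is a minimal sketch using a toy terminator enum. The `Terminator` type and `remap_targets` function are hypothetical stand-ins, not rustc's real `TerminatorKind` or `Integrator`. The point is only that a terminator with no successor blocks has nothing to rewrite when block indices are shifted.

```rust
/// Toy model of MIR terminators (hypothetical; not rustc's real definitions).
#[derive(Debug, PartialEq)]
enum Terminator {
    Goto { target: usize },
    Abort,
    Unreachable,
}

/// Shift block indices by `offset`, roughly as an inliner would when it
/// splices the callee's blocks into the caller.
fn remap_targets(term: &mut Terminator, offset: usize) {
    match term {
        Terminator::Goto { target } => *target += offset,
        // Neither variant names a successor block, so there is nothing to update.
        Terminator::Abort | Terminator::Unreachable => {}
    }
}

fn main() {
    let mut goto = Terminator::Goto { target: 2 };
    remap_targets(&mut goto, 10);
    println!("{:?}", goto);

    let mut abort = Terminator::Abort;
    remap_targets(&mut abort, 10);
    println!("{:?}", abort);

    let mut unreachable = Terminator::Unreachable;
    remap_targets(&mut unreachable, 10);
    println!("{:?}", unreachable);
}
```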
src/librustc_mir/build/mod.rs
Outdated
// FIXME: Figure out why we can't use something like this instead:
// tcx.is_foreign_item(tcx.hir.local_def_id(fn_id));
// tcx.has_attr(tcx.hir.local_def_id(fn_id), "unwind");
What happens if you use `has_attr`? Why doesn't it work?
AFAICT, `tcx.has_attr(tcx.hir.local_def_id(fn_id), "unwind")` returns false also for "__rust_start_panic" and "panicking::rust_begin_panic".
I don't know why, all these different contexts and ids are a bit bewildering to me. My guess is that I'm trying with the wrong ID or something, but then I don't know what ID would be the right one.
Scratch this comment, I must have done something wrong. It does seem to work in the new - just pushed - version.
src/librustc_mir/build/mod.rs
Outdated
// Therefore generate an extra "Abort" landing pad.
// FIXME: Figure out why we can't use something like this instead:
// tcx.is_foreign_item(tcx.hir.local_def_id(fn_id));
`is_foreign_item` checks for foreign items, aka
    extern "C" {
        fn foreign_item(); // we don't generate MIR for this
    }
rather than extern fns, aka
    extern "C" fn extern_fn() {
        // we *do* generate MIR for this
    }
src/librustc_mir/build/scope.rs
Outdated
pub fn schedule_abort(&mut self) -> BasicBlock {
self.scopes[0].needs_cleanup = true;
let abortblk = self.cfg.start_new_cleanup_block();
self.cfg.terminate(abortblk, self.scopes[0].source_info(self.fn_span), TerminatorKind::Abort);
Line longer than 100 chars
Thanks, will fix for next version.
src/librustc_mir/build/mod.rs
Outdated
// tcx.is_foreign_item(tcx.hir.local_def_id(fn_id));
// tcx.has_attr(tcx.hir.local_def_id(fn_id), "unwind");
let is_foreign = match abi {
Actually, I think that an `abi` check is exactly the right thing. The key point is that the "C" ABI (and other non-Rust ABIs) don't have a defined way to propagate Rust panics, right?
I'm not so sure.
"__rust_start_panic" and "panicking::rust_begin_panic" are of C ABI and are able to panic, so it can't be that undefined...
Rather, the real danger is when we tell LLVM that a function is "nounwind" and then we end up panicking within - or through - it. That's the undefined behavior this patch is trying to resolve.
So, then I was trying to figure out when we actually mark a function as "nounwind", and it seems now I did not look closely enough. The algorithm seems to be:
1. ABI check - so you're right, it should be an ABI check.
2. Set as unwinding if there is an unwind attribute
3. Set as unwinding if it isn't a foreign item
So maybe that's what I'm supposed to mimic, or possibly try to refactor somehow if we need it in two places?
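One possible reading of the three checks listed above, written as a small standalone function. This is a sketch only: `Abi`, `FnInfo`, and `may_unwind` are hypothetical names made up for illustration, and the real logic in `librustc_trans` may differ in its details.

```rust
/// Simplified stand-in for the compiler's ABI type (hypothetical).
#[derive(Clone, Copy, PartialEq)]
enum Abi {
    Rust,
    C,
}

/// Hypothetical summary of the facts the three checks look at.
struct FnInfo {
    abi: Abi,
    has_unwind_attr: bool,
    is_foreign_item: bool,
}

/// Returns true if the function would NOT be marked `nounwind`.
fn may_unwind(f: &FnInfo) -> bool {
    // 1. ABI check: Rust-ABI functions can always unwind.
    if f.abi == Abi::Rust {
        return true;
    }
    // 2. An explicit unwind attribute opts the function back in.
    if f.has_unwind_attr {
        return true;
    }
    // 3. Anything that is not a foreign item is treated as unwinding.
    !f.is_foreign_item
}

fn main() {
    // `extern "C" { fn foo(); }`: a foreign item, marked nounwind.
    let foreign = FnInfo { abi: Abi::C, has_unwind_attr: false, is_foreign_item: true };
    // `extern "C" fn bar() {}`: defined in Rust, still treated as unwinding.
    let border = FnInfo { abi: Abi::C, has_unwind_attr: false, is_foreign_item: false };
    println!("foreign item may unwind: {}", may_unwind(&foreign));
    println!("extern fn may unwind: {}", may_unwind(&border));
}
```

Under this reading, an `extern "C" fn` defined in Rust falls through to the third check and is still treated as unwinding.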
It seems like it would be ideal to have the criteria extracted into a helper function, yes.
I'm not sure extracting the criteria directly would help:
- Unwinding across a lang boundary is "instant" LLVM UB as we emit the `nounwind` attribute. If we want to continue doing that, we can't also make it trap.
- Therefore, we want to catch unwinding before we reach the lang boundary. That means stopping unwinding from proceeding from Rust to C, because we don't control the C-to-Rust lang boundary.
- This means that we need to prevent unwinding on non-foreign items, which means we need to ignore the code for (3) from the previous list.
Disabling the check that allows non-foreign C ABI Rust functions to unwind would allow us to make these functions abort on unwind.
I must be confused about something. @arielb1 what do you mean by this:
> Unwinding across a lang boundary is "instant" LLVM UB as we emit the nounwind attribute. If we want to continue doing that, we can't also make it trap.

Do you mean that the call is tagged with nounwind, or the function? I was assuming the latter.
It doesn't matter whether the call or the function is tagged as nounwind. In both cases, unwinding is UB LLVM-side and therefore can't be turned into an abort.
I think we are talking past each other to a certain extent. Let me first define a set of functions I will call "border functions" -- i.e., functions implemented in Rust but which are invokable from C and hence have C ABI. For these functions, it is considered UB if they unwind (and hence these border functions may also be marked as "no-unwind"). In that case, when we generate the fn body, we can trap/abort if an unwind does occur. (This is, I believe, the same thing C++ does in such cases, though I may be mistaken.) This costs us nothing to the same extent that unwinding is "zero cost".
I guess you are saying that we should ignore the `#[unwind]` attribute for the purpose of this trap, and generate it anyway? This is (I guess) because C code may still call such a function? That sort of makes sense, though it does raise the question of the purpose of the `#[unwind]` attribute.
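For readers who want to see the "border function" idea in user code, here is a hand-written approximation of what the PR makes the compiler do automatically. It is only a sketch: `stop_unwind` and `checked_div` are hypothetical names, and it uses `std::panic::catch_unwind` plus `std::process::abort` rather than the landing pad and `llvm.trap` discussed in this thread.

```rust
use std::panic::{self, UnwindSafe};

/// Runs `body`; if it panics, calls `on_unwind` instead of letting the
/// panic propagate to the caller. (Hypothetical helper for illustration.)
fn stop_unwind<T>(body: impl FnOnce() -> T + UnwindSafe, on_unwind: impl FnOnce() -> T) -> T {
    match panic::catch_unwind(body) {
        Ok(value) => value,
        Err(_payload) => on_unwind(),
    }
}

/// A "border function": implemented in Rust, callable through the C ABI.
/// Integer division by zero panics in Rust, so without a guard the panic
/// would try to unwind into the C caller.
pub extern "C" fn checked_div(a: i32, b: i32) -> i32 {
    // Abort the process rather than unwind across the FFI boundary.
    stop_unwind(move || a / b, || std::process::abort())
}

fn main() {
    println!("{}", checked_div(10, 2));
}
```

Passing the fallback as a closure keeps the guard testable: a caller can substitute a sentinel value for the abort. This relies on the default `panic = "unwind"` strategy, since `catch_unwind` cannot catch anything under `panic = "abort"`.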
Sure enough. So for that we'll have to remove the "Set as unwinding if it isn't a foreign item" check.
OK, so @arielb1 and I were chatting on gitter, and we came to roughly this conclusion:
Note though that this is a change in behavior -- albeit only quasi-defined behavior -- and it feels like it ought to go through the RFC process. Still, it'd be good to have a working implementation so that we can do a crater run and assess possible impact. So it may not be that there is a "common helper" to extract; that's not entirely clear to me.
The Abort TerminatorKind will cause an llvm.trap function call to be emitted. Signed-off-by: David Henningsson <diwic@ubuntu.com>
Generate Abort instead of Resume terminators on nounwind ABIs. rust-lang#18510 Signed-off-by: David Henningsson <diwic@ubuntu.com>
Ok, so I think we're mostly on the same page w.r.t. what needs to be done. I rebased it on top of master and skipped the "common helper" part.
Hmm, so I was thinking "what could this possibly break" and came up with this contrived example:
But even in this case; looking at the LLVM IR, we mark this function as EDIT: So what I wanted to say - is this ever a change in behavior where the previous behavior was not UB?
@kennytm Is there a way I can remove the "waiting on author" tag, now that I've responded and so it is no longer waiting for me (but for CI and a new review pass)?
Btw: Not sure about the current status of Also,
@diwic Retagged :) If this PR is no longer a work-in-progress, please also remove the "WIP" from the title.
Why should
src/librustc_mir/build/mod.rs
Outdated
@@ -383,6 +405,11 @@ fn construct_fn<'a, 'gcx, 'tcx, A>(hir: Cx<'a, 'gcx, 'tcx>,
let source_info = builder.source_info(span);
let call_site_s = (call_site_scope, source_info);
unpack!(block = builder.in_scope(call_site_s, LintLevel::Inherited, block, |builder| {
|
stray newline
src/librustc_mir/build/mod.rs
Outdated
@@ -353,6 +354,27 @@ macro_rules! unpack {
};
}
fn needs_abort_block<'a, 'gcx, 'tcx>(tcx: TyCtxt<'a, 'gcx, 'tcx>,
Could you rename this to `should_abort_on_panic` instead?
r=me with nits addressed
As suggested by arielb1. Closes rust-lang#18510 Signed-off-by: David Henningsson <diwic@ubuntu.com>
Good point. FFI can call Rust functions too with the wrong calling convention, this is no worse really. Nits addressed.
@bors r+
📌 Commit 4910ed2 has been approved by
[beta] temporarily disable #46833 due to #48251 see also #48378 r? @Mark-Simulacrum
rustc: Tweak funclet cleanups of ffi functions

This commit is targeted at addressing rust-lang#48251 by specifically fixing a case where a longjmp over Rust frames on MSVC runs cleanups, accidentally running the "abort the program" cleanup as well. Added in rust-lang#46833, `extern` ABI functions in Rust will abort the process if Rust panics, and currently this is modeled as a normal cleanup like all other destructors.

Unfortunately it turns out that `longjmp` on MSVC is implemented with SEH, the same mechanism used to implement panics in Rust. This means that `longjmp` over Rust frames will run Rust cleanups (even though we don't necessarily want it to). Notably this means that if you `longjmp` over a Rust stack frame then that probably means you'll abort the program because one of the cleanups will abort the process.

After some discussion on IRC it turns out that `longjmp` doesn't run cleanups for *caught* exceptions, it only runs cleanups for cleanup pads. Using this information this commit tweaks the codegen for an `extern` function to a catch-all clause for exceptions instead of a cleanup block. This catch-all is equivalent to the C++ code:

    try {
        foo();
    } catch (...) {
        bar();
    }

and in fact our codegen here is designed to match exactly what clang emits for that C++ code!

With this tweak a longjmp over Rust code will no longer abort the process. A longjmp will continue to "accidentally" run Rust cleanups (destructors) on MSVC. Other non-MSVC platforms will not run destructors with a longjmp, so we'll probably still recommend "don't have destructors on the stack", but in any case this is a more surgical fix than rust-lang#48567 and should help us stick to standard personality functions a bit longer.
Rust 1.24 made it so that a panic!() won't unwind across FFI boundaries: we won't unwind if we panic inside a Rust function declared extern "C". See rust-lang/rust#46833. However, gnome-class does not formally mandate a particular Rust version; we've been running on the assumption that we are running on nightly, or "recent enough". For now, just suppress the deprecation warnings from glib-rs, until we formalize our requirements for the rustc version.
Second attempt to write a patch to solve this.
r? @nikomatsakis
So, my biggest issue with this patch is the way the patch determines what functions should have an abort landing pad (in `construct_fn`). I would ideally have this code match src/librustc_trans/callee.rs::get_fn but couldn't find an id that returns true for `is_foreign_item`. Also tried `tcx.has_attr("unwind")` with no luck. FIXED
Other issues:
- llvm.trap is a SIGILL on amd64. Ideally we could use panic-abort's version of aborting which is nicer but we don't want to depend on that library...
- Mir inlining is a stub currently. FIXED (no-op)
Also, when reviewing please take into account that I'm new to the code and only partially know what I'm doing... and that I've mostly made matches on `TerminatorKind::Abort` match either `TerminatorKind::Resume` or `TerminatorKind::Unreachable` based on what looked best.