-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for full RELRO #43170
Add support for full RELRO #43170
Conversation
This commit adds support for full RELRO, and enables it for the platforms I know have support for it. Full RELRO makes the PLT+GOT data read-only on startup, preventing it from being overwritten. http://tk-blog.blogspot.com/2009/02/relro-not-so-well-known-memory.html Fixes rust-lang#29877. Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @eddyb (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
As I noted in #29877 (comment), I do add these flags on Fedora and EPEL7 (for RHEL7). However, I encountered bizarre linking bugs with this on RHEL6 for a few architectures -- I think that was on powerpc64 specifically. So that makes me really hesitate whether this should be the default behavior of |
Thanks for the PR @kyrias! This seems reasonable to me to try to set these by default, but @cuviper's concerns worry me! @cuviper you wouldn't happen still have any of those link error logs on hand, would you? @kyrias do you know when these linker flags were introduced? One concern on the related issue is that not all linkers may support this, but my guess is that all recent-ish linkers should probably support it, just curious how far back it goes! |
I don't have those old logs anymore. I'm trying to reproduce it anew... |
GNU binutils seems to have added them in 2.16 which was released in 2005, and I'm not sure if it'd be worth it to support 12+ year old linkers. All Linux and BSD targets we want to support should be fine with this as it is. It's not enabled on e.g. Solaris though, since they use their own linker which doesn't seem to have support for RELRO. |
For comparison glibc 2.5 (the oldest one we support) was released on 2006-09-29, so I think we're probably covered here! Let's just wait and see if @cuviper's reproduction brings up something new, otherwise this seems fine to merge to me. |
Is there any worry about increased program load times with this? |
So, with full RELRO there is a slight performance impact when it resolves all the symbols, but in practice it's pretty much not noticeable unless the program has a very large number of symbols. |
It would possibly make sense to add a compiler flag to disable full BIND_NOW for specific packages though, so you'd end up with partial RELRO without the performance impact of BIND_NOW. |
Since Rust's kernel/glibc baseline is basically RHEL5, we can note that has binutils-2.17 too. Also notable, since 2.27 I'm still working on the reproducing my earlier issues. I couldn't get it manually, but I just found the git history of my package scripts where I turned it off, so I'm trying to build right before that again now. |
OK, not a link error, "just" a segfault:
I don't remember how I connected that to RELRO/BIND_NOW, and this attempt now was in the build farm so I don't actually have access to those failed artifacts. I'll start another build where I actually have a shell, using current master. |
@cuviper oh thanks for that! I wonder, does that mean that we should perhaps pass |
I finally found my notes from before! We determined this to be a bug in the dynamic loader (ld.so), but sadly that bugzilla got marked private. I'll see if we can get a public bug opened... I can still reproduce the issue while building 1.18, but 1.19-beta and later seem to run OK. I'm not especially confident in that though, given that the bug wasn't really rust's fault, so that "fix" may be a fluke. |
I think it would make sense to enable both for everything other than the ones binutils don't have it enabled for, and ppc64. @cuviper Does it reproduce with partial RELRO as well, or just full RELRO? |
My bug just got closed as a duplicate of this one which is public: And indeed in rustc's case, it's a similar
Using |
Any chance you could backport that commit to the glibc version you're using and seeing if that actually fixes it or not? If it does in fact fix it, I'm not sure what the best way forward would be. It would sort of suck to not be able to enable it due to a bug that got fixed 6 years ago, but if rustc wants to support RHEL5 as a baseline, which is 10 years old by now, I'm sort of uncertain. I think it might be a good idea to have it on by default, and then have a built-time option for very out-of-date operating systems to disable it in that case. Anyway, I could probably start by reworking this PR so that it treats partial and full RELRO separately, so that it would be easy to selectively disable full RELRO. |
I have tested it, but I can't predict how soon an update may be released. Of course we have to be careful that it doesn't cause regressions elsewhere...
FWIW, I don't know if this affects RHEL5 too. Comment 25 indicates that this might have regressed from 6.6 to 6.7, but that could have been some happenstance change in load order. Also, RHEL5 is in the extended life phase, so I doubt this would be considered severe enough for an update.
You'd have to disable it on I think |
Actually, rust's dist-powerpc64-linux is only configured for ~RHEL6, so it would have problems with glibc symbol versions if you tried to run on RHEL5-ppc64 at all. I think only the x86 builders are older, and they're running on native CentOS 5 images, so they would probably fail directly while bootstrapping if that glibc had BIND_NOW problems. |
If this ends up being an powerpc64-specific bug then I think it's also fine to just disable it there with a comment indicating why. It also seems like the risk of breakage from this is pretty low, so we could always land it and see what happens? Backing out the change would be pretty easy as well, so I think it may not cause too much of a splash to implement. If we end up having problems during the x86 bootstrap within CentOS 5 the bors will discover that anyway as this won't be able to land! @cuviper what do you think of the "land and let's see" approach? Should we disable full relro on powerpc64 for now but leave partial relro enabled? I think regardless we'll hold off on this for a few days to land this after the beta branching on this coming Monday just to make sure we have a maximal amount of time to detect regressions. |
I'm not positive that bootstrap will exercise this in the CentOS 5 builders. In stage0 we'll build a stage1 without relro, then stage1 will build stage2 with relro. But then I think we don't actually execute stage2 for anything unless you run tests, right? (Even just Anyway, if we want to "land and let's see", I can make a point of trying the first relro nightly on the various arches that Red Hat and Fedora support. I would prefer we avoid BIND_NOW on powerpc64 though, since we already know that's problematic. |
Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
Hokay, so I've now:
|
|
||
pub fn target() -> TargetResult { | ||
let mut base = super::linux_base::opts(); | ||
base.cpu = "ppc64le".to_string(); | ||
base.pre_link_args.get_mut(&LinkerFlavor::Gcc).unwrap().push("-m64".to_string()); | ||
base.max_atomic_width = Some(64); | ||
|
||
// ld.so in at least RHEL6 on ppc64 has a bug related to BIND_NOW, so only enable partial RELRO | ||
// for now. https://github.com/rust-lang/rust/pull/43170#issuecomment-315411474 | ||
base.relro_level = RelroLevel::Partial; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ppc64le didn't exist until RHEL7, so I think we can leave that full.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, will undo that part, thanks.
On at least RHEL6 there is a segfault caused by the older ld.so version when BIND_NOW is used, so use partial RELRO by default on ppc64 architectures for now. Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
📌 Commit 2161fb2 has been approved by |
Add support for full RELRO This commit adds support for full RELRO, and enables it for the platforms I know have support for it. Full RELRO makes the PLT+GOT data read-only on startup, preventing it from being overwritten. http://tk-blog.blogspot.com/2009/02/relro-not-so-well-known-memory.html Fixes #29877. --- I'm not entirely certain if this is the best way to do it, but I figured mimicking the way it's done for PIE seemed like a good start at least. I'm not sure whether we want to have it enabled by default globally and then disabling it explicitly for targets that don't support it though. I'm also not sure whether the `full_relro` function should call `bug!()` or something like it for linkers that don't support it rather than no-opping.
☀️ Test successful - status-appveyor, status-travis |
@kyrias I try to use this and it doesn't seem to work for me. rustc version: |
@lilianmoraru It's currently under |
Add relro-level tests The `relro-level` debugging flag was added in #43170 which was merged in July 2017. This PR moves this flag to be a proper codegen flag.
[WIP] Support for disabling PLT for better function call performance This PR gives `rustc` the ability to skip the PLT when generating function calls into shared libraries. This can improve performance by reducing branch indirection. AFAIK, the only advantage of using the PLT is to allow for ELF lazy binding. However, since Rust already [enables full relro for security](#43170), lazy binding was disabled anyway. This is a little known feature which is supported by [GCC](https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html) and [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fplt) as `-fno-plt` (some Linux distros [enable it by default](https://git.archlinux.org/svntogit/packages.git/tree/trunk/makepkg.conf?h=packages/pacman#n40) for all builds). Implementation inspired by [this patch](https://reviews.llvm.org/D39079#change-YvkpNDlMs_LT) which adds `-fno-plt` support to Clang. ## Performance I didn't run a lot of benchmarks, but these are the results on my machine for a `clap` [benchmark](https://github.com/clap-rs/clap/blob/master/benches/05_ripgrep.rs): ``` name control ns/iter no-plt ns/iter diff ns/iter diff % speedup build_app_long 11,097 10,733 -364 -3.28% x 1.03 build_app_short 11,089 10,742 -347 -3.13% x 1.03 build_help_long 186,835 182,713 -4,122 -2.21% x 1.02 build_help_short 80,949 78,455 -2,494 -3.08% x 1.03 parse_clean 12,385 12,044 -341 -2.75% x 1.03 parse_complex 19,438 19,017 -421 -2.17% x 1.02 parse_lots 431,493 421,421 -10,072 -2.33% x 1.02 ``` A small performance improvement across the board, with no downsides. It's likely binaries which make a lot of function calls into dynamic libraries could see even more improvements. [This comment](https://patchwork.ozlabs.org/patch/468993/#1028255) suggests that, in some cases, `-fno-plt` could improve PIC/PIE code performance by 10%. ## To do - [ ] Do a perf run to see the effect this has on the compiler (cc @michaelwoerister), and possibly run benchmarks on some more crates - [ ] Add a code gen test - [ ] Should this be always enabled or should it be behind a command line option? If so, what should it be called? `-Z no-plt`? `-Z plt=no`?
Support for disabling PLT for better function call performance This PR gives `rustc` the ability to skip the PLT when generating function calls into shared libraries. This can improve performance by reducing branch indirection. AFAIK, the only advantage of using the PLT is to allow for ELF lazy binding. However, since Rust already [enables full relro for security](#43170), lazy binding was disabled anyway. This is a little known feature which is supported by [GCC](https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html) and [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fplt) as `-fno-plt` (some Linux distros [enable it by default](https://git.archlinux.org/svntogit/packages.git/tree/trunk/makepkg.conf?h=packages/pacman#n40) for all builds). Implementation inspired by [this patch](https://reviews.llvm.org/D39079#change-YvkpNDlMs_LT) which adds `-fno-plt` support to Clang. ## Performance I didn't run a lot of benchmarks, but these are the results on my machine for a `clap` [benchmark](https://github.com/clap-rs/clap/blob/master/benches/05_ripgrep.rs): ``` name control ns/iter no-plt ns/iter diff ns/iter diff % speedup build_app_long 11,097 10,733 -364 -3.28% x 1.03 build_app_short 11,089 10,742 -347 -3.13% x 1.03 build_help_long 186,835 182,713 -4,122 -2.21% x 1.02 build_help_short 80,949 78,455 -2,494 -3.08% x 1.03 parse_clean 12,385 12,044 -341 -2.75% x 1.03 parse_complex 19,438 19,017 -421 -2.17% x 1.02 parse_lots 431,493 421,421 -10,072 -2.33% x 1.02 ``` A small performance improvement across the board, with no downsides. It's likely binaries which make a lot of function calls into dynamic libraries could see even more improvements. [This comment](https://patchwork.ozlabs.org/patch/468993/#1028255) suggests that, in some cases, `-fno-plt` could improve PIC/PIE code performance by 10%. ## Security benefits **Bonus**: some of the speculative execution attacks rely on the PLT, by disabling it we reduce a big attack surface and reduce the need for [`retpoline`](https://reviews.llvm.org/D41723). ## Remaining PLT calls The compiled binaries still have plenty of PLT calls, coming from C/C++ libraries. Building dependencies with `CFLAGS=-fno-plt CXXFLAGS=-fno-plt` removes them.
Support for disabling PLT for better function call performance This PR gives `rustc` the ability to skip the PLT when generating function calls into shared libraries. This can improve performance by reducing branch indirection. AFAIK, the only advantage of using the PLT is to allow for ELF lazy binding. However, since Rust already [enables full relro for security](#43170), lazy binding was disabled anyway. This is a little known feature which is supported by [GCC](https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html) and [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fplt) as `-fno-plt` (some Linux distros [enable it by default](https://git.archlinux.org/svntogit/packages.git/tree/trunk/makepkg.conf?h=packages/pacman#n40) for all builds). Implementation inspired by [this patch](https://reviews.llvm.org/D39079#change-YvkpNDlMs_LT) which adds `-fno-plt` support to Clang. ## Performance I didn't run a lot of benchmarks, but these are the results on my machine for a `clap` [benchmark](https://github.com/clap-rs/clap/blob/master/benches/05_ripgrep.rs): ``` name control ns/iter no-plt ns/iter diff ns/iter diff % speedup build_app_long 11,097 10,733 -364 -3.28% x 1.03 build_app_short 11,089 10,742 -347 -3.13% x 1.03 build_help_long 186,835 182,713 -4,122 -2.21% x 1.02 build_help_short 80,949 78,455 -2,494 -3.08% x 1.03 parse_clean 12,385 12,044 -341 -2.75% x 1.03 parse_complex 19,438 19,017 -421 -2.17% x 1.02 parse_lots 431,493 421,421 -10,072 -2.33% x 1.02 ``` A small performance improvement across the board, with no downsides. It's likely binaries which make a lot of function calls into dynamic libraries could see even more improvements. [This comment](https://patchwork.ozlabs.org/patch/468993/#1028255) suggests that, in some cases, `-fno-plt` could improve PIC/PIE code performance by 10%. ## Security benefits **Bonus**: some of the speculative execution attacks rely on the PLT, by disabling it we reduce a big attack surface and reduce the need for [`retpoline`](https://reviews.llvm.org/D41723). ## Remaining PLT calls The compiled binaries still have plenty of PLT calls, coming from C/C++ libraries. Building dependencies with `CFLAGS=-fno-plt CXXFLAGS=-fno-plt` removes them.
This commit adds support for full RELRO, and enables it for the
platforms I know have support for it.
Full RELRO makes the PLT+GOT data read-only on startup, preventing it
from being overwritten.
http://tk-blog.blogspot.com/2009/02/relro-not-so-well-known-memory.html
Fixes #29877.
I'm not entirely certain if this is the best way to do it, but I figured mimicking the way it's done for PIE seemed like a good start at least. I'm not sure whether we want to have it enabled by default globally and then disabling it explicitly for targets that don't support it though. I'm also not sure whether the
full_relro
function should callbug!()
or something like it for linkers that don't support it rather than no-opping.