-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
illumos should put libc last in library search order #84254
Conversation
(rust-highfive has picked a reviewer for you, use r? to override) |
Some additional notes on the symbols in question. You can see that some symbols are implemented by both
Of the symbols that don't overlap, these are presently used within Rust:
With this change, I have been able to implement backtrace-rs support for illumos systems (all examples now work and all tests now pass) via: jclulow/backtrace-rs@c9072be |
This comment has been minimized.
This comment has been minimized.
9a10b5f
to
13bc380
Compare
r? @petrochenkov The preferred way to link native libraries is via attributes in https://github.com/rust-lang/rust/blob/master/library/unwind/src/lib.rs if it's an unwinding library, attributes in https://github.com/rust-lang/libc/blob/master/src/unix/mod.rs if it's a part of libc, or attributes in https://github.com/rust-lang/rust/blob/master/library/std/src/sys/unix/mod.rs if it's just some system library used by libstd. |
Fair enough. It's a challenge, because we have an ordering constraint that needs to hold here (as I described above) that isn't really, as far as I can tell, something that is correctly handled by all of the other machinery today. As for nostd mode, on an illumos system even nostd binaries will link against
As far as I can tell, we are always correctly including both When this happens, we lazily resolve each symbol at runtime using the first entry in the NEEDED list that contains it. If I think I have found another place to perform this surgery, though. Looking at rust/compiler/rustc_codegen_ssa/src/back/link.rs Lines 1486 to 1533 in b0c818c
... I can see that we're going to call rust/compiler/rustc_codegen_ssa/src/back/link.rs Lines 2129 to 2161 in b0c818c
It's perhaps a bit unfortunate that additional user link flags, via rust/compiler/rustc_codegen_ssa/src/back/link.rs Lines 1737 to 1765 in b0c818c
Does that seem more appropriate? Can you think of a better way to enforce this ordering constraint in the linker arguments? |
So, the order of the One way to get this to happen could be to, on illumos targets, to have a strategically placed |
Thanks for looking @nagisa!
Are you saying we would remove this Lines 35 to 36 in b0c818c
And instead, at the end of of https://github.com/rust-lang/rust/blob/b0c818c5e0fa37251d9fda2f656bf1041a2e1e1d/library/unwind/src/lib.rs we would add something like... #[cfg(target_os = "illumos")]
#[link(name = "gcc_s")]
#[link(name = "c")]
extern "C" {} Is my understanding correct? |
You can try a similarly strategically placed invocation of
in the |
I forgot mentioning #73319 ("Linked native libraries are sometimes deduplicated and sometimes not deduplicated"), which may require using build scripts instead of (preferred) attributes here.
If the compiler requires special casing for such a regular task like linking ordering, then it is doing something wrong and doesn't provide some necessary user-visible functionality, and we need to add such functionality instead of special cases. (Which probably amounts to fixing #73319 in this specific case, but there's fortunately a workaround with build scripts as I said above.) |
@nagisa Thanks for your help so far! I gave your suggestion a try, with the following patch in place of this PR: diff --git a/library/unwind/build.rs b/library/unwind/build.rs
index d8bf152e4d6..3bb9682a615 100644
--- a/library/unwind/build.rs
+++ b/library/unwind/build.rs
@@ -33,7 +33,15 @@ fn main() {
} else if target.contains("solaris") {
println!("cargo:rustc-link-lib=gcc_s");
} else if target.contains("illumos") {
+ // Both libgcc_s and libc contain an unwinder implementation, but on
+ // some systems the libc unwinder does not contain all of the extension
+ // symbols in libgcc_s that are used by Rust. We include libc here
+ // explicitly in an attempt to ensure it will always appear after
+ // libgcc_s in the library search order in the output object, so that we
+ // do not end up using part of an unwinder implementation from two
+ // different libraries.
println!("cargo:rustc-link-lib=gcc_s");
+ println!("cargo:rustc-link-lib=c");
} else if target.contains("dragonfly") {
println!("cargo:rustc-link-lib=gcc_pic");
} else if target.contains("pc-windows-gnu") { This did appear to improve the situation with the
The compiler itself doesn't require a special case, and isn't really doing anything wrong. In a traditional Makefile-driven build system where the consumer has control over what flags are passed to the compiler, one would always arrange to place The runtime linker will (in most simple cases like this) always allow implicit symbol interposition based on the order that libraries appear in the NEEDED list. On our platform in particular, the C library is the gateway to many base facilities like system calls and signal handling, but it also includes other base OS facilities like a minimal unwinder implementation and some stack protector functions. It will likely include other symbols in the future. Whomever (or in the case of Rust, whatever) is deciding what to pass to the linker needs to be aware of this. I think that adding
Because the Rust toolchain seeks to stand in for the combination of the erstwhile developer assembling manually a set of Does that make sense? |
This makes sense, but I still disagree with some points.
You can always use syscalls directly (at cost of less portability), and sometimes you have to, for example when writing some binary instrumentation code that shouldn't interfere with the application code (including libc) or writing a dynamic linker. Cases like this are exactly why
In case of Rust it is libstd, libunwind (rust crate) and libc (rust crate), not the compiler. UPD: I'd be ok with the PR changing only the target spec (at least as a temporary step), it is the |
That may well be true for some other platforms, but isn't really true on illumos. There isn't really any reason I can think of that would make direct system calls an attractive solution. Indeed, necessary parts of the signal handling machinery (amongst other things) are part kernel and part If you want to instrument a process, we have many better tools for doing just that; e.g., process control through proc(4), or static or dynamic instrumentation through DTrace. You may be interested in our crate for adding USDT probe sites to a Rust program. There is one case I can think of where
I'm trying to make the point that, aside from this specific instance (
I don't believe that would have any effect, as the problem is not that Rust may well grow, in future, a more complete system for allowing targets to express these sorts of platform constraints; e.g., by presenting the full abstract link order to the target code (before it becomes linker arguments) so that it may be post-processed, or by accepting a per-target list of partial ordering rules such as " In the interim, though, this solution has the benefit of being explicit in enacting the constraint, easy to understand and to verify, and doesn't have a blast radius that affects any other platform. |
The compiler, ideally, should have as little target specific logic outside the target definitions as possible. As this PR is implemented right now, a pretty core portion of the backend now has a illumos-specific workaround. We had similar work-arounds for other targets and it has been and is a source of hard to address issues. Libraries and users start relying on such workarounds and eventually prevent a proper fix. Most obvious in the case of this PR – crate authors might neglect a I don't think it is useful here to discuss whether linking to I can see that the requirement is to have |
Deduplication happens in favor of the last use (assuming it happens). Also, we can make the linking order of libraries from So |
Do you mean by swapping the order of these two Lines 354 to 361 in 1a6c98e
|
Yeah, those two. |
I've built a toolchain with the following delta: diff --git a/library/std/src/lib.rs b/library/std/src/lib.rs
index 90603cd9836..1f15af0c58b 100644
--- a/library/std/src/lib.rs
+++ b/library/std/src/lib.rs
@@ -348,15 +348,19 @@
#[allow(unused_imports)] // macros from `alloc` are not used on all platforms
#[macro_use]
extern crate alloc as alloc_crate;
-#[doc(masked)]
-#[allow(unused_extern_crates)]
-extern crate libc;
// We always need an unwinder currently for backtraces
#[doc(masked)]
#[allow(unused_extern_crates)]
extern crate unwind;
+// On some platforms an unwinder is present in both libc and another library;
+// e.g., in libgcc_s on illumos. Depend on libc after unwind in order to arrive
+// at the correct library search order in the output object; i.e., libc last.
+#[doc(masked)]
+#[allow(unused_extern_crates)]
+extern crate libc;
+
// During testing, this crate is not actually the "real" std library, but rather
// it links to the real std library, which was compiled from this same source
// code. So any lang items std defines are conditionally excluded (or else they
diff --git a/library/unwind/build.rs b/library/unwind/build.rs
index d8bf152e4d6..3bb9682a615 100644
--- a/library/unwind/build.rs
+++ b/library/unwind/build.rs
@@ -33,7 +33,15 @@ fn main() {
} else if target.contains("solaris") {
println!("cargo:rustc-link-lib=gcc_s");
} else if target.contains("illumos") {
+ // Both libgcc_s and libc contain an unwinder implementation, but on
+ // some systems the libc unwinder does not contain all of the extension
+ // symbols in libgcc_s that are used by Rust. We include libc here
+ // explicitly in an attempt to ensure it will always appear after
+ // libgcc_s in the library search order in the output object, so that we
+ // do not end up using part of an unwinder implementation from two
+ // different libraries.
println!("cargo:rustc-link-lib=gcc_s");
+ println!("cargo:rustc-link-lib=c");
} else if target.contains("dragonfly") {
println!("cargo:rustc-link-lib=gcc_pic");
} else if target.contains("pc-windows-gnu") { This appears to be sufficient to get the correct NEEDED list in all of the binaries shipped with the toolchain. It still isn't working for the backtrace crate tests, though. I have isolated one of the tests that doesn't work, and used our #!/bin/bash
(
date
pargs $$
echo
unset LD_ALTEXEC
/usr/bin/ld -Dfiles,unused "$@"
rc=$?
echo
echo exit: $rc
echo
) >>/var/tmp/capld.args.txt 2>&1
exit $rc I then did
From the debugging output of
The full output is very long so I have placed it in a gist: https://gist.github.com/jclulow/462c838eb717acae7bc87db6165359ec The final NEEDED list from the above linker invocation is:
I'm a bit of a loss as to what do next to fix this, and would appreciate some guidance. Thanks! |
I would like to avoid a workaround that would require us to go and revisit the many crates that we've already put platform support work into. PRs can take weeks or months to get merged sometimes, and there's nothing stopping any crate in the ecosystem from including a dependency on the native Can we please take another look at the original patch I proposed, with the full understanding that when whatever larger project to improve linker interaction and library deduplication is completed, this can be a temporary fix that we can remove from the code? |
The root problem is not even linking ordering, but that two different libraries in the dependency tree have public symbols that have same names but are not incompatible. If you have any two such conflicting libraries |
This isn't really true, though, because both |
Nothing like this is required, the only thing we need to update by publishing a new patch version is the |
OK, what about: I can add a new field to In this way, we can get what we need, and the |
Status update: #83507 will land soon and for #38460/#73319 we'll likely end up with an aggressive deduplication strategy by default and an opt-out with a modifier. So, yeah, let's land some temporary hack that can be easily reverted later without touching other repos like libc. |
This comment has been minimized.
This comment has been minimized.
Under some conditions, the toolchain will produce a sequence of linker arguments that result in a NEEDED list that puts libc before libgcc_s; e.g., [0] NEEDED 0x2046ba libc.so.1 [1] NEEDED 0x204723 libm.so.2 [2] NEEDED 0x204736 libsocket.so.1 [3] NEEDED 0x20478b libumem.so.1 [4] NEEDED 0x204763 libgcc_s.so.1 Both libc and libgcc_s provide an unwinder implementation, but libgcc_s provides some extra symbols upon which Rust directly depends. If libc is first in the NEEDED list we will find some of those symbols in libc but others in libgcc_s, resulting in undefined behaviour as the two implementations do not use compatible interior data structures. This solution is not perfect, but is the simplest way to produce correct binaries on illumos for now.
13bc380
to
31c2ad0
Compare
@bors r+ rollup |
📌 Commit 31c2ad0 has been approved by |
…chenkov illumos should put libc last in library search order Under some conditions, the toolchain will produce a sequence of linker arguments that result in a NEEDED list that puts libc before libgcc_s; e.g., [0] NEEDED 0x2046ba libc.so.1 [1] NEEDED 0x204723 libm.so.2 [2] NEEDED 0x204736 libsocket.so.1 [3] NEEDED 0x20478b libumem.so.1 [4] NEEDED 0x204763 libgcc_s.so.1 Both libc and libgcc_s provide an unwinder implementation, but libgcc_s provides some extra symbols upon which Rust directly depends. If libc is first in the NEEDED list we will find some of those symbols in libc but others in libgcc_s, resulting in undefined behaviour as the two implementations do not use compatible interior data structures. This solution is not perfect, but is the simplest way to produce correct binaries on illumos for now.
…chenkov illumos should put libc last in library search order Under some conditions, the toolchain will produce a sequence of linker arguments that result in a NEEDED list that puts libc before libgcc_s; e.g., [0] NEEDED 0x2046ba libc.so.1 [1] NEEDED 0x204723 libm.so.2 [2] NEEDED 0x204736 libsocket.so.1 [3] NEEDED 0x20478b libumem.so.1 [4] NEEDED 0x204763 libgcc_s.so.1 Both libc and libgcc_s provide an unwinder implementation, but libgcc_s provides some extra symbols upon which Rust directly depends. If libc is first in the NEEDED list we will find some of those symbols in libc but others in libgcc_s, resulting in undefined behaviour as the two implementations do not use compatible interior data structures. This solution is not perfect, but is the simplest way to produce correct binaries on illumos for now.
Rollup of 9 pull requests Successful merges: - rust-lang#84254 (illumos should put libc last in library search order) - rust-lang#84442 (Unify rustc and rustdoc parsing of `cfg()`) - rust-lang#84655 (Cleanup of `wasm`) - rust-lang#84866 (linker: Avoid library duplication with `/WHOLEARCHIVE`) - rust-lang#84930 (rename LLVM target for RustyHermit) - rust-lang#84991 (rustc: Support Rust-specific features in -Ctarget-feature) - rust-lang#85029 (SGX mutex is movable) - rust-lang#85030 (Rearrange SGX split module files) - rust-lang#85033 (some further small cleanups) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
Under some conditions, the toolchain will produce a sequence of linker
arguments that result in a NEEDED list that puts libc before libgcc_s;
e.g.,
Both libc and libgcc_s provide an unwinder implementation, but libgcc_s
provides some extra symbols upon which Rust directly depends. If libc
is first in the NEEDED list we will find some of those symbols in libc
but others in libgcc_s, resulting in undefined behaviour as the two
implementations do not use compatible interior data structures.
This solution is not perfect, but is the simplest way to produce correct
binaries on illumos for now.