-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move compiler-rt build into a crate dependency of libcore #34400
Comments
This can probably be done one platform at a time. Get it working on x86, then add a flag to the target spec that compiler rt isn't required. While I want to do this, one thing that bugs me about it is that it ruins the 'purity' of core. Right now core is a pure-Rust library, really easy to build, although you can't do anything with it in practice without compiler-rt. Maybe there's nothing to do about it and I just need to accept that Rust isn't completely pure. For now, at least. Once we've got the build system transitioned and isolated just the parts of compiler-rt that core needs, we can start thinking about rewriting compiler-rt in Rust. In light of those considerations, I think I'd prefer it if compiler-rt was its own crate that core depends on. Just seems like a cleaner division of responsibility. |
I was going to open a similar issue because I have already started implementing this idea here (but compiler-rt in its own crate rather than under core) 😄. Which is I have been using to cross compile compiler-rt for ARM Cortex-M processors. 👍 from me. I pretty much agree with @alexcrichton assessment:
nods the user won't need to have
Agree. I might add that updating our hacks on top of their build system across compiler-rt updates is going to be equally difficult anyway. re: steps
I think we can start with all the "generic" C routines. IME, those (cross) compile fine to all the architectures I have tried (x86, arm, mips). The arch-specific assembly (
I'm not quite sure what this means. Surely you don' mind stop passing the IMO OTOH, compiler-rt is mandatory when you build programs that depend on |
Can confirm; I only use libcore, not compiler-rt. |
True yeah, I kinda would prefer to build nothing and then only add in intrinsics as we actually run into them. This may not be a great test, however, as we may accidentally pick up routines from libgcc all the time instead.
Oh yeah I just mean the compiler injecting
Yeah it's true that most programs don't, but for us to be robust we actually need to deal with this. Functions like 64-bit multiplication with overflow on 32-bit end up lowering down to these intrinsics I believe, and a few others here and there. Basically I don't think we can get by with not shipping a compiler-rt, so for tier 1 platforms we need to at least have some form of it somewhere. This should in theory then extend very nicely to embedded platforms as they should all be able to compile C anyway, so we just need to be sure the compiles actually succeed. |
Also thinking far off in the future if we go down this road:
|
I would personally expect sanitizers to be present in separate crates, not dependencies of libcore (or part of libcore). I'd like to support them eventually but I'd also like to build them separately. Also yeah I think that the fallback to |
I would expect this. Let's package it up the same as any other crate, so the compiler doesn't specifically know about it.
I think @alexcrichton's idea was that no consumer of Rust should ever have to think about compiler-rt. It just comes with core as appropriate for the target, and everybody already depends on core. If a particular kernel project doesn't actually depend on the symbols defined in compiler-rt then that's fine - the linker will just throw them away, just like every other unused part of core. If a particular platform doesn't need compiler-rt at all, then core just won't link to it. |
@brson I see what @alexcrichton meant now. Thanks for elaborating!
So, start with nothing, run the full Rust test suite and crater, add missing symbols and repeat.
You mean that libgcc also provides intrinsics with the same name? And those get picked up instead of compiler-rt's?
I see what you mean now. Yeah that would be better. It would let us drop (as in "stop caring about") the no-compiler-rt field. I think I tried something like this before but the linker dropped the libcompiler-rt.rlib symbols when linking a static no_std binary; it could have just been a linker argument ordering issue though.
I meant building the sanitizers with gcc-rs instead of with llvm's cmake build system. I agree they should not be part of libcore. They would probably end up in their own crate below the std crate. |
Crater may not be too much help here but if we can get past the auto bots and compile all necessary symbols for all platforms I suspect that'd be all we need. We tend to be pretty good about coverage of the standard library in general, although not bullet proof of course.
Yeah I believe that at least some intrinsic names overlap, although perhaps not all of them. Most of the time you end up linking to libgcc in one way or another if you start including some C code, so we may require some intrinsic names that end up getting pulled in from libgcc instead of from compiler-rt as before. In general though this is just a vague worry. If things compile and link just fine then it doesn't really matter too much where it comes from, presumably we'll solve it at some point.
Ah for these we'd probably want to use cmake. I know far less about the sanitizers and they seem like they would be more complicated. That and the sanitizers don't need to work on every platform rustc supports so we may not need to cut corners. |
A crazy idea: could we translate all the builtins that we need into Rust? The .c code in compiler-rt doesn't look too complicated. Then we'd depend only on assembler (and we could even bundle llvm-as with Rust). Of course, this will increase the cost of updating, but as @brson said, the builtins subset of compiler-rt seems to be pretty stable. Edit: nevermind, brson has already proposed that. Serves me right for reading on a phone! |
Yeah at this point I'm about ready to just throw in the towel on using vanilla upstream compiler-rt, it's too much of a pain so far and doesn't seem to have enough gain. If we can figure out what LLVM actually needs (in terms of intrinsics), I'd be totally ok adding those to libcore (or some other lib) and just going from there. |
I think this command yields most of (if not all) the intrinsics LLVM uses:
The output looks like this (Spoilers there are lots of them). |
I think these are just platform-specific overrides. The full list is here. |
Nice find! I think not all of those intrinsics are used though. For example, |
Could we simply pre-build all possible versions of compiler-rt and distribute it as a binary? I think the set of (CPU targets * calling conventions) is fixed by LLVM implementation, so it should be feasible if we used clang? |
We tend to always run into trouble with distributions of binaries, and with things like target-specific flags and such I'd suspect that we can't really ship an end-all-be-all build of compiler-rt. I'd personally prefer to get it working building from source so we can just build everything from source eventually. |
This should definitely be part of the long term plan. |
The 128-bit integer RFC requires rewriting some of the compiler-rt builtins in Rust. |
Actually, that list doesn't contain the
I wrote a script to get a list of the intrinsics that overlap; the result is in this gist. Note that the LLVM intrinsics listed there are a (probably overestimated) guess; we haven't found a definite list of intrinsics that LLVM may emit on all platforms. |
The list I pointed to is a default initialization. The target backends then override some of the entries with platform-specific names, so all _aeabi stuff is in ARMISelLowering.cpp. |
I tagged this with help wanted assuming no one is otherwise going to jump on this. Someone a bit familiar with how rustc and std interpenetrate could get started creating a compiler-rt crate and linking it in somewhere. |
It may be a tangent, but would it be possible for this to use a system compiler-rt? Fedora has a compiler-rt package, and I verified its |
@cuviper perhaps yeah, if we follow the route in this issue it may be relatively trivial with Cargo's library override support depending on how we do it. |
We're upgrading compiler-rt in #34743 as part of an LLVM upgrade, and it's again unfortunately quite an ordeal due to the sheer number of patches we need to apply (many of which don't apply cleanly any more). |
@alexcrichton Can you give a brief overview of what sort of compiler-rt patches there are? That will help me decide the feasibility of using the distro package. (FWIW, I already have makefile patches semi-working to allow external compiler-rt, and hope to send a PR soon.) |
OK, sounds good, thanks. |
I'm working on this. I think I got everything but the Makefiles done. I'm smoke testing right now by cross compiling std to |
@alexcrichton Is there an open issue for this somewhere? I'd like to know what this means in more detail. |
@bstrie it's this issue! We just need to be sure to lazily compile compiler-rt as part of |
If compiler-rt were rewritten in Rust, then would non-Rust code be able to use the Rust-coded version? For example, let's say I've linked a program using 4 LLVM-based programming languages together. Would I have multiple copies of the compiler-rt logic linked into my program? That would suck. A much simpler solution to rewriting compiler-rt would be to just include llvm-as and clang in the Rust distribution, so that we can cross-compile the asm and C code when (re)building std. Besides eliminating the need to rewrite compiler-rt, this would benefit lots of users of Rust as they could then rely on llvm-as and clang being available. |
Presumably yes, the intrinsics in the Rust version will be exposed in the same way as the intrinsics in libcompiler-rt.a. (Side note: On failure, the compiler-rt intrinsics call an "abort" function defined in compiler-rt; I'm not sure if that should become
No, the linker will use the version that "encounters first" and ignore the rest. Here first means: from all the objects passed to the linker as arguments, the symbol found in the first object (argument) that provides it. Note that this happens today for all |
libcompiler-rt.a is dead, long live libcompiler-builtins.rlib This commit moves the logic that used to build libcompiler-rt.a into a compiler-builtins crate on top of the core crate and below the std crate. This new crate still compiles the compiler-rt instrinsics using gcc-rs but produces an .rlib instead of a static library. Also, with this commit rustc no longer passes -lcompiler-rt to the linker. This effectively makes the "no-compiler-rt" field of target specifications a no-op. Users of `no_std` will have to explicitly add the compiler-builtins crate to their crate dependency graph *if* they need the compiler-rt intrinsics. Users of the `std` have to do nothing extra as the std crate depends on compiler-builtins. Finally, this a step towards lazy compilation of std with Cargo as the compiler-rt intrinsics can now be built by Cargo instead of having to be supplied by the user by some other method. closes rust-lang#34400
crate-ify compiler-rt into compiler-builtins libcompiler-rt.a is dead, long live libcompiler-builtins.rlib This commit moves the logic that used to build libcompiler-rt.a into a compiler-builtins crate on top of the core crate and below the std crate. This new crate still compiles the compiler-rt instrinsics using gcc-rs but produces an .rlib instead of a static library. Also, with this commit rustc no longer passes -lcompiler-rt to the linker. This effectively makes the "no-compiler-rt" field of target specifications a no-op. Users of `no_std` will have to explicitly add the compiler-builtins crate to their crate dependency graph *if* they need the compiler-rt intrinsics - this is a [breaking-change]. Users of the `std` have to do nothing extra as the std crate depends on compiler-builtins. Finally, this a step towards lazy compilation of std with Cargo as the compiler-rt intrinsics can now be built by Cargo instead of having to be supplied by the user by some other method. closes #34400 --- r? @alexcrichton
One of the major blockers of our dream to "lazily compile std" is to ensure that we have the ability to compile compiler-rt on-demand. This is a repository maintained by LLVM which contains a large set of intrinsics which LLVM lowers function calls down to on some platforms.
Unfortunately the build system of compiler-rt is a bit of a nightmare. We, at the time of the writing, have a large pile of hacks on its makefile-based build system to get things working, and it appears that LLVM has deprecated this build system anyway. We're trying to move to cmake but it's still unfortunately a nightmare compiling compiler-rt.
To solve both these problems in one fell swoop, @brson and I were chatting this morning and had the idea of moving the build entirely to a build script of libcore, and basically just using gcc-rs to compile compiler-rt instead of using compiler-rt's build system. This means we don't have to have LLVM installed (why does compiler-rt need llvm-config?) and cross-compiling should be much more robust/easy as we're driving the compiles, not working around an opaque build system.
To make matters worse in compiler-rt as well it contains code for a massive number of intrinsics we'll probably never use. And even worse these bits and pieces of code often cause compile failures which don't end up mattering in the end. To solve this problem we should just whitelist a set of intrinsics to build and ignore all others. This may be a bit of a rocky road as we discover some we should have compiled but forgot, but in theory we should be able to select a subset to compile and be done with it.
This may make updating compiler-rt difficult, but we've already only done it once in like the past year or two years, so we don't seem to need to do this too urgently. This is a worry to keep in mind, however.
Basically here's what I think we should do:
Staging this is still a bit up in the air, but I'm curious what others think about this as well.
cc @rust-lang/tools
cc @brson
cc @japaric
The text was updated successfully, but these errors were encountered: