Consider adding Linux targets that don't depend on libc #2610
Comments
This sounds really cool! However, I think you may be underestimating how much work this is. There would also need to be collaboration with kernel developers about system calls and stuff. |
@mark-i-m I agree that it is quite a bit of work, but there is no need for collaborating with kernel devs. The syscall interface is considered stable and they treat any change that'll break userspace as a bug. Go has successfully implemented this strategy on all OSs they support. So I'd argue it is pretty doable and in my opinion desirable. |
Most of the stuff was already in place, as you can see in @japaric's comment. :) |
@lorenz Yes, you are right, but what I mean is that if we want to influence the syscall interfaces to make them more Rust-friendly we would need to interact more with the kernel devs... |
@mark-i-m What do you have in mind that you want changed? The syscall interface is totally usable from Rust and is in various ways a better fit than libc itself. |
Admittedly, I haven't yet gotten to thinking extensively about this, but mainly my thinking is that syscalls often don't even attempt to be safe, so adding good Rust wrappers around them is either painful or inefficient. For example, it is hard to write a good safe wrapper for |
But this is not about implementing all possible safe uses of (for example) |
No, but they do talk with e.g. libc maintainers to discuss interfaces, libc support, etc.
That's true, but this is also an opportunity to expose the kernel ABI in a safer way, although that doesn't seem to be the intent of this issue... |
Any newly-proposed syscall will at best be present in the next kernel version, but it’ll be a number of years before you can usefully ship a program that just assumes it is present. |
Yes, but Rust can take advantage of them right away by falling back to something else when it receives |
Sure but if a fallback is needed anyway there is no point in doing all this only "to make them more Rust-friendly", is there? (As opposed to, say, a performance optimization.) |
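The fallback pattern debated in this exchange can be sketched as follows. This is a minimal illustration, assuming the error in question is ENOSYS (the code Linux returns when the running kernel does not implement a syscall number). The helper name `with_enosys_fallback` and its closure-based shape are hypothetical, not an API from std or from this proposal.

```rust
/// ENOSYS as defined on x86_64 Linux (an assumption for this sketch;
/// some architectures use a different value).
const ENOSYS: i32 = 38;

/// Try the preferred (newer) operation first; if the kernel reports that the
/// syscall does not exist, run the fallback instead. Any other error from the
/// preferred path is returned unchanged.
fn with_enosys_fallback<T>(
    preferred: impl FnOnce() -> Result<T, i32>,
    fallback: impl FnOnce() -> Result<T, i32>,
) -> Result<T, i32> {
    match preferred() {
        Err(e) if e == ENOSYS => fallback(),
        other => other,
    }
}
```

Note that the fallback path still has to exist and be maintained, which is the point made in the reply above: the newer syscall only pays for itself if it is faster or otherwise better on kernels that have it.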
@mark-i-m It sounds like you're interested in something different than this is proposing. @japaric is proposing a standard library that's equivalent to the current If someone wanted to add new syscalls to Linux, for whatever reason, that would be outside the scope of this proposal. |
As a separate thought, for a target like this, we could theoretically have a libc-compatible C library implemented in Rust, and then link C programs to that library. We don't need that, but it would make for an interesting future addition to this target. |
https://gitlab.redox-os.org/redox-os/relibc#relibc "relibc is a portable POSIX C standard library written in Rust."
Edit: |
Looking forward to playing with this libc-free target! I wonder if it will be possible to later remove |
IIUC |
There is some news: a libc in LLVM |
Does adding new targets require an RFC? Or were large changes required to the codebase? |
Hi, my goal is to push Rust's safety guarantees as far as possible. Currently they stop at the entrance to libc, so I'm writing a Rust libc implementation. I've already seen parts of libc that have UB potential if you misuse them, something that Rust can easily solve. But I want this to be a usable target, so I propose making a new target that will still link to libc but will slowly move away from using it. Currently I see 3 areas where libc is actually needed:
For the first one, I personally think it's fine, because then LTO should remove libc for users who don't use that part of libstd. For the second one, I think the right path forward is a rewrite in Rust, but I understand that this can take a while. For the third one, this should be a public discussion: do we want to just copy the pthreads implementation, or do we want to invent our own threads that make more sense for Rust? If so, how will that affect FFI to C? Until these questions are answered, I still think it's a big win to have a target that in the short run tries to minimize calls to libc but in the long run plans to remove it completely. Would love people's feedback on this. To emphasize: I don't want just a musl rewrite in Rust with big blobs of unsafe. My whole point is to extend the safety because: |
To me “libc implementation” means something specific. It provides C APIs as defined in the C standard and POSIX, and tries to be compatible with other implementations in the same way that musl and glibc are compatible with each other. relibc is a libc implementation written in Rust. It seems that you don’t mean that, but rather something that “spiritually” fills a role similar to that of libc in communicating with the kernel through syscalls and providing higher-level abstractions?
Or do you mean something used not instead of libc, but an abstraction on top of it?
As far as I can tell, |
I'm talking about a Rust replacement for libc. I'm not planning on exposing FFI functions to C. I hope that in the long run it will prove itself safer and overall better; then we'd want to use it in libstd by default even if you still link to C code that uses libc. |
If you’re changing signatures anyway, it doesn’t have to stay close to “the same functions” at all. Do you need any well-defined abstraction between libstd’s public API and syscalls? Why shouldn’t |
It could, although open is used directly also in and the point is levels of abstraction: you can have a function that directly calls 4 syscalls and handles all their errors etc., and then say that function is the safer API. But for good abstractions it's better to split each syscall and the handling of its data/errors into its own file; that way it's also much easier to review carefully. |
There is also a lot of stuff you can't do with Rust's stdlib at the moment; some things I've needed recently are:
While there are libraries doing these things, they use the libc crate and are usually plastered with unsafe code. |
Every programming problem I have, when I track it down to its source, seems to originate with C/C++. It wasn't until a few years ago that I realized how much everything I do somehow, some way, has C/C++ as a foundation. Basically every zero-day exploit in my cyber security class is because of something stupid in C/C++. And it goes well beyond security: the more I dive into C++, the more terrible stuff I find. When I found out even Rust needed libc, it was like seeing an iron-clad fortress only to look closer and see it was being held up by sticks, duct tape, and prayers. A perfect example: two days ago I was trying to convert an RGB array to a PNG ... in Python. All the Python libraries for PNGs have messy C/C++ dependencies. I wanted to see how the image conversion was implemented, so I went to libpng's website. And whatdyahknow, there's a warning about a major security flaw right there at the top of the page. But I don't even care about security; I just care about correctness and clarity. Libpng was your typical big, messy, confusing C codebase. So what do I do? I search for a Rust PNG lib. (Again, I'm coding Python, yet my workflow is pulling me into lower levels of code.) Not only was there a pure Rust implementation, not only does it avoid ever using unsafe, but most importantly to me: I can fricken understand the code at a high level. I have about 5 years of frequent C++ experience and about 3 months of Rust experience, yet when I want to understand an implementation I go to the Rust codebase. TLDR: Musl is great and all, but I actually want to be able to read, understand, and trust the implementation, not just wrap up my code into a static binary. And if there was one thing that SHOULDN'T be C, in my opinion, it would be the very foundation of Rust itself. |
@sunfishcode |
Thanks for the feedback! I appreciate hearing people's thoughts :-). A few updates: @nivkner implemented I've also started a branch to explore another side of the space, porting Rust's std to rustix. See here for the current status. |
From my PoV, most of the important projects for which I use Rust will have C/C++ linked in process for the foreseeable future. I know I'm not alone in that. I can certainly understand the desire for something like mustang, but I think the concern boils down to: is the benefit of adding this new target worth the maintenance overhead? A somewhat related concern I have is proliferation and duplication of low-level crates. Most in the community seem OK with the way crates.io works versus having a larger standard library. But there are tradeoffs with this. For example, my company's software needs to be multi-arch beyond just aarch64 and x86_64, also covering ppc64le and s390x. Historically, they mostly needed to adapt to changes in the Linux kernel and glibc. Now of course in the Linux ecosystem there's also musl and bionic, but rustix is adding another one. That's not without cost. rustix heavily overlaps with nix, which is a dependency of tokio for example. Related to this, the distributed crates.io ecosystem also means that there's much less uniformity in things like continuous integration systems. For example, while it's great that rustix's CI flow tests many platforms, it does not cover all of the tier 2 targets (including the two I need to support mentioned above, ppc64le and s390x). If rustix is a dependency of std, this type of thing would need to change. Yet another related issue here, and this is my personal opinion, is lack of uniformity in peer review. Changes to rust-lang/rust require peer review, as do changes to glibc. Now, you are clearly an amazingly prolific code writer. But as far as I can tell, most code changes to rustix merge without any peer review. I would hope that as part of having Rust std depend on rustix, this could be improved; that seems like a natural evolution from "PoC" to production.
All these concerns aside, from my PoV all the code and design in the whole dependency chain you've created from cap-std down to rustix, the safe IO bits etc. looks quite good! I still have on my TODO list trying out switching from the |
This is safer and may actually fix a race condition I've seen sometimes in CI runs. Part of investigating using rustix (and cap-std) in our section of the ecosystem (xref rust-lang/rfcs#2610).
Surely if Rustix became a dependency of std, it would cease being an external crate and merge onto the main rust tree? |
Another related thing here is basically that the libc developers are saying "you either use libc or you don't": best encapsulated by this thread: rust-lang/rust#89522 (comment) Right now, rustix seems to have a clean design where it can be configured in either way, which is good. But I am sure the temptation will be high in some cases to start making direct syscalls in the libc backend when e.g. libc is lagging on binding a particular syscall. And that brings the issues in the above thread. Again though I'll say rustix is cool (as is the mustang experiment), and from my PoV there is a clear need for it to exist given the fact that we need a high level interface in But, perhaps instead it makes sense to try to push for using rustix instead of nix in the ecosystem ahead of using it in std? I previously looked at this a bit in this discussion, and I can try some more (edit: OK did some over here). |
I very much agree. These projects are at a "let's talk about what we might want this to look like" phase, rather than any kind of official proposal, or even a pre-RFC. They definitely have growing and maturing to do, both as codebases and as projects. There are certainly costs in duplication and churn. But there are also costs in stagnation and not exploring. A theme I've seen come up in several places recently is that if one looks at the projects as being about just one goal, like "a libc-free Rust (on the one popular OS where that even makes sense)", or "micro-optimizing syscall wrappers (which aren't the part of syscalls which take most of the time)", it looks less interesting. The key is to look at the combined goals, which also include factoring out As for starting with the ecosystem ahead of std, I've started a few things in this space too, and am interested in doing more. And I agree, it makes sense to do more of this before making any official proposals. At the same time, it seems useful to start the conversations about std too, because if we need to make major changes to support std, it's easier if we can get those changes started sooner rather than later. |
Mustang is now using the new futex-based Mutex/RwLock/Condvar implementations that @m-ou-se recently developed for std instead of parking_lot, which avoids the need to do global allocation from within the locking primitives. This makes Mustang's global allocator setup much simpler. The 32-bit x86 port is now back online. And origin no longer needs a git dependency on a parking_lot fork so it's now updated on crates.io so we can read the docs on docs.rs. |
I'm now placing the project to work toward Mustang becoming an official Rust target on hold. I'll continue to maintain it for the foreseeable future. If anyone has ideas about what they would like to see Mustang become, please reach out. |
@elichai, one thing we've released as part of https://github.com/facebookexperimental/reverie is the "reverie-syscalls" crate, that provides one type-safe view of how to construct Syscall calls themselves (e.g. with proper argument types). Not sure how this compares with your effort or the nc crate, which seems to be receiving recent development. It's designed for use with ptracers, so it would require some adaptation to be used to directly invoke syscalls, and maybe @jasonwhite can comment on whether that would make sense. It probably wouldn't be zero cost (without significant compiler optimization) because you do construct structs to represent the syscalls. But the macros used to derive these structs could probably also derive a direct function interface for invoking the syscalls, while retaining one source-of-truth for what are the syscalls, their numbers, and their argument types. |
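To make the "one source of truth" idea above concrete, here is a minimal sketch of describing each syscall as a typed struct that can be lowered to a number and raw arguments. It is not the reverie-syscalls API: the `Syscall` trait, the `Close` and `Write` structs, and `lower` are hypothetical names, and the syscall numbers are the x86_64 Linux ones.

```rust
/// Hypothetical trait: one typed description per syscall.
trait Syscall {
    /// Syscall number (x86_64 Linux values are assumed in this sketch).
    const NUMBER: usize;
    /// Flatten the typed arguments into the six raw argument slots.
    fn into_args(self) -> [usize; 6];
}

/// Typed view of close(fd).
struct Close {
    fd: i32,
}

impl Syscall for Close {
    const NUMBER: usize = 3;
    fn into_args(self) -> [usize; 6] {
        [self.fd as usize, 0, 0, 0, 0, 0]
    }
}

/// Typed view of write(fd, buf, len); the slice carries pointer and length together.
struct Write<'a> {
    fd: i32,
    buf: &'a [u8],
}

impl<'a> Syscall for Write<'a> {
    const NUMBER: usize = 1;
    fn into_args(self) -> [usize; 6] {
        [self.fd as usize, self.buf.as_ptr() as usize, self.buf.len(), 0, 0, 0]
    }
}

/// Lower a typed syscall to (number, raw args). A ptracer could inspect this
/// pair, and a direct invoker could hand it to the `syscall` instruction.
fn lower<S: Syscall>(call: S) -> (usize, [usize; 6]) {
    (S::NUMBER, call.into_args())
}
```

In a design like this, both the inspecting side and a direct-call side would be generated from the same struct definitions, which is the benefit the comment describes. The cost it mentions, constructing a struct per call, is visible in `lower`.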
I'm wondering if someone here can redirect me to the best place for building Rust no_std programs without libc (musl or gnu). That's a much less lofty goal than reimplementing libc, as described throughout this issue. It seems like Rust OS kernel projects have their own setup, but I'm not seeing any ergonomic way to just build a static binary that doesn't link libc (the way you can with |
If you don't want libc then you don't want to use linux either, so you probably want a binary target that ends with "none". |
libc != linux. Why should everyone want libc? |
Mustang is able to build Rust programs on Linux without linking in any libc today. It's complete enough to run |
I could not be more interested in this. My particular use case is: at $work we distribute software which links to a shared library in a dizzying array of exotic environments. It is a constant annoyance supporting these toolchains and system libraries. Why not produce a library with musl statically linked you say? One such environment only supports dynamically linking native libraries. ☕ 👀 We temporarily solved this by compiling in a similarly ancient build environment and a now-aging rust toolchain which supports glibc 2.11 (https://blog.rust-lang.org/2022/08/01/Increasing-glibc-kernel-requirements.html) For less exotic target platforms we've now started using cargo-zigbuild which has allowed us to move some build infra to newer toolchains without precluding some backward glibc compatibility. There are a number of reasons this situation is less than ideal:
TLDR; Removing even one term from the exponential expansion of build targets is worth the effort. Aaand, in this case it's actually two terms the |
@sunfishcode it seems like you are working on eyra, which appears to be a version of Mustang that drops in place of Have you considered writing an RFC to bridge rustc to this work? Not sure if this would be a target string or some other form of bridge, but it would be awesome to get some community feedback on the idea. Even if very incomplete, the entry level for a tier 3 target is very low. Even if it turns out Mustang / Eyra won't be brought in-tree, it would be great to open up discussion about how we could do libc-free targets (as opposed to here, where discussion is mostly about whether we may want to do this). |
@tgross35 It's my impression that |
@sunfishcode to answer your question: personally, what led me to explore a libc-free std was an attempt to reduce the number of targets my team's binary distribution had to support. We distribute binaries for pre libc 2.17 and I was hoping we could avoid abandoning libc 2.11 targets. As I'm sure you can guess, in exploring the practicality I realized:
Reasons I can think that folks would still want a libc-free std (aside from just "sounding cool")
FWIW I'd be much more interested in cargo-zigbuild getting pulled in than adding a libc-free target. |
@estk It sounds like the topic here isn't addressing what you need. Which is fine; I just want to be clear about it. Concerning your reasons folks might want a libc-free std:
|
I think a big part of that is because there isn't an actual plan; nobody really knows what the result would look like. What gets in-tree vs. external libraries, what is the entrypoint, do we do new target strings, how does this interact with LLVM, etc., are all questions that need to be answered. But it is tough to provide useful technical feedback about tradeoffs and expected work when there isn't something concrete to discuss. Which is what I think an RFC would help solve. That would propose a meaningful direction, get feedback about feasibility, and wind up with something that team members need to give an opinionated response to. It will give a better idea about what amount of interest comes from coolness vs. actually solving technical problems. I really don't think anybody would say no to an experiment. The experiment just needs a definitive shape, as opposed to the handful of one-off proposals in this thread. Maybe a pre-RFC on IRLO could be a good starting point? |
Starting a big public complex technical discussion without any goals isn't really my cup of tea, but don't let me stop anyone 😉. |
Maybe other people think that way but those are not at all why I want a libc-free binary.
goal: be able to downgrade, upgrade, or screw up the libc of a VM and have a binary still work as intended I personally don't care all that much if it's integrated into rust official or not, I just want the capability. |
I'll be unsubscribing from this thread for the foreseeable future after this comment. One of the ways I imagine that I'll be able to tell that Eyra is ready to be proposed as an official Rust target is that there will be a different kind of conversation. It won't be about people trying to convince me of something. It will be about people using Eyra but maybe wanting it to be something more. Or, it'll be people who have tried Eyra and ran into trouble. Or they found it doesn't do something they need, and so on. Or maybe people who have a bigger idea of what Eyra could grow into. Or they have a real-world use case that they think Eyra could work for, and they have some questions. When those kinds of comments shape the conversation, that's when I'll know that this project makes sense. |
:( okay
I appreciate the clarity. In terms of focus, I think it makes sense. I'm subscribed to wasi stuff and see your name pop up all the time, so I know you do a lot of other really important work. |
e.g. x86_64-linux (or x86_64-unknown-linux-rust).
These would be the spiritual successor of steed: a standard library, free of C dependencies, for Linux systems. Steed implemented (or planned to) the Rust standard library using raw Linux system calls instead of libc's API. These libc-less targets would also implement std using raw system calls.
Even though steed has been inactive for over a year, people continue to star it on GitHub, and it currently has over 500 stars, so it seems there's still interest in something like it.
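For readers unfamiliar with what "raw system calls" means in practice, here is a minimal sketch, not steed's actual code. It assumes x86_64 Linux (syscall numbers 1 for write and 39 for getpid, and the `syscall` instruction's register convention). The module name `raw` and its functions are illustrative, and the second `raw` module is only a placeholder so the sketch compiles on other platforms.

```rust
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
mod raw {
    use std::arch::asm;

    const SYS_WRITE: usize = 1; // x86_64 Linux syscall numbers
    const SYS_GETPID: usize = 39;

    /// Issue a syscall with up to three arguments and return the raw result.
    /// Unsafe because the kernel may read or write memory named by the arguments.
    unsafe fn syscall3(nr: usize, a1: usize, a2: usize, a3: usize) -> isize {
        let ret: usize;
        unsafe {
            asm!(
                "syscall",
                inlateout("rax") nr => ret, // number in, result out
                in("rdi") a1,
                in("rsi") a2,
                in("rdx") a3,
                lateout("rcx") _, // clobbered by the `syscall` instruction
                lateout("r11") _,
                options(nostack),
            );
        }
        ret as isize
    }

    /// write(2) without libc: Ok(bytes written) or Err(errno).
    pub fn write(fd: usize, buf: &[u8]) -> Result<usize, i32> {
        // The slice guarantees a valid pointer/length pair for the kernel to read.
        let ret = unsafe { syscall3(SYS_WRITE, fd, buf.as_ptr() as usize, buf.len()) };
        if ret < 0 { Err(-ret as i32) } else { Ok(ret as usize) }
    }

    /// getpid(2) without libc; this call cannot fail.
    pub fn getpid() -> u32 {
        unsafe { syscall3(SYS_GETPID, 0, 0, 0) as u32 }
    }
}

#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
mod raw {
    // Placeholder for other platforms: goes through std, so it is NOT libc-free.
    use std::io::Write;

    pub fn write(_fd: usize, buf: &[u8]) -> Result<usize, i32> {
        std::io::stdout().write(buf).map_err(|e| e.raw_os_error().unwrap_or(0))
    }

    pub fn getpid() -> u32 {
        std::process::id()
    }
}
```

For example, `raw::write(1, b"hello\n")` writes to standard output. A libc-less std would build its I/O, filesystem, and networking modules on top of wrappers like these, one set per supported architecture.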
What we learned during development is that maintaining a standard library out of tree is a lot of work because things quickly get out of sync, so if there's still interest in something like steed I would recommend writing an RFC to add support for libc-less targets (e.g. x86_64-linux) to rust-lang/rust; this would be equivalent to importing and developing steed as part of rust-lang/rust.
An RFC is also a good way to (re-)evaluate the ROI of making a libc-less standard library. One of the goals of steed was hassle-free cross compilation, but today that's solved by the Linux/MUSL targets + rust-lld (works on stable). The other goal was better optimizations (plain LTO doesn't optimize across FFI), but cross-language LTO is now a thing (-Z cross-lang-lto). There may be less costly ways to achieve the same results.
The rest of this post describes the status of steed as of 2017-10-20 (date of last commit).
What we had working:
- std::io::{stdin,stderr,stdout}
- std::fs
- std::collections (but see (a) below)
- std::sync::Mutex (courtesy of parking_lot)
- std::env
- std::net (but see (d) below)
- #[test] support (but see (c) below)
You can check the examples directory to get an idea of what you could write with it.
What was missing:
a. a proper allocator. steed used a bump pointer allocator that never freed memory
b. all the math stuff (e.g. f32::sin). These days one can use libm.
c. unwinding, all the steed targets used -C panic=abort
d. hostname lookup
e. errno. It was unclear whether we should implement it or not. It seems to only be required by std::io::Error::last_os_error. Linux system calls have a Result-like API so errno is not required to propagate errors from syscalls to the std API.
and much more stuff; you can check the issue tracker for the full list.
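Point (e) can be illustrated with a short sketch of how a raw syscall return value maps onto a Rust Result without any errno variable. It relies on the Linux convention that return values from -4095 to -1 encode a negated error code. The function name `decode_syscall_ret` is hypothetical and this is not steed's code.

```rust
use std::io;

/// Largest error number the kernel encodes in a syscall return value.
const MAX_ERRNO: isize = 4095;

/// Turn a raw Linux syscall return value into an io::Result.
/// Values in -4095..=-1 are negated error codes; anything else is a success
/// value (including large values such as addresses returned by mmap).
fn decode_syscall_ret(ret: isize) -> io::Result<usize> {
    if (-MAX_ERRNO..0).contains(&ret) {
        Err(io::Error::from_raw_os_error((-ret) as i32))
    } else {
        Ok(ret as usize)
    }
}
```

Because the error travels in the return value, nothing thread-local has to be set or read on this path, which is why errno appeared to be needed only for std::io::Error::last_os_error.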
cc @tbu- @briansmith