Re-enable atomic loads and stores for all RISC-V targets #98333
This roughly reverts PR rust-lang#66548.

Atomic "CAS" are still disabled for targets without the *“A” Standard Extension for Atomic Instructions*. However, this extension only adds instructions for operations more complex than simple loads and stores, which are always atomic when aligned.

In the [Unprivileged Spec v. 20191213](https://riscv.org/technical/specifications/), section 2.6 *Load and Store Instructions* of chapter 2 *RV32I Base Integer Instruction Set* (emphasis mine):

> Even when misaligned loads and stores complete successfully,
> these accesses might run extremely slowly depending on the implementation
> (e.g., when implemented via an invisible trap). Furthermore, whereas
> **naturally aligned loads and stores are guaranteed to execute atomically**,
> misaligned loads and stores might not, and hence require
> additional synchronization to ensure atomicity.

Unfortunately, PR rust-lang#66548 did not provide many details on the bug that motivated it, but rust-lang#66240 and rust-lang#85736 appear related and happen with targets that do have the A extension.
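To illustrate the distinction the description draws, here is a minimal sketch of code that relies only on atomic loads and stores, with the read-modify-write operation gated separately. The names `READY`, `publish`, `observe` and `increment` are hypothetical and do not come from the PR; the gating uses the stable `target_has_atomic` cfg.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical shared word, e.g. written by an interrupt handler
/// and read by the main loop.
static READY: AtomicU32 = AtomicU32::new(0);

/// Publish a value using only an atomic store (no read-modify-write).
pub fn publish(value: u32) {
    READY.store(value, Ordering::Release);
}

/// Observe the value using only an atomic load.
pub fn observe() -> u32 {
    READY.load(Ordering::Acquire)
}

/// Read-modify-write operations need CAS support, so this function is
/// only compiled where `target_has_atomic = "32"` is set.
#[cfg(target_has_atomic = "32")]
pub fn increment() -> u32 {
    // Returns the previous value.
    READY.fetch_add(1, Ordering::AcqRel)
}
```

On a target without the A extension, `increment` would not be compiled, while `publish` and `observe` only need the load/store support this PR is about.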
r? @nagisa (rust-highfive has picked a reviewer for you, use r? to override)
I don’t know if this is relevant or what LLVM does there, but the A extension has a limitation in that single-instruction atomic swap/add/and/or/xor/min/max only exists for 32-bit or (on RV64) 64-bit values. Similar operations on 8-bit or 16-bit values would have to be a loop based on 32-bit load-reserved and store-conditional. I think this is the kind of loop |
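For illustration only, the kind of loop described above can be sketched portably, with `compare_exchange_weak` on a 32-bit word standing in for the load-reserved/store-conditional pair. This is an assumption-laden sketch, not what LLVM actually emits; the function name `fetch_add_u8_in_word` and the byte-lane numbering are made up for the example.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Atomically add `delta` to byte lane `index` (0 = least significant)
/// of a 32-bit word, returning the previous value of that byte.
/// Hypothetical helper: the other three bytes are left unchanged.
pub fn fetch_add_u8_in_word(word: &AtomicU32, index: u32, delta: u8) -> u8 {
    assert!(index < 4);
    let shift = index * 8;
    let mask = 0xFFu32 << shift;
    let mut current = word.load(Ordering::Relaxed);
    loop {
        // Extract the target byte, update it, and splice it back in.
        let old_byte = ((current & mask) >> shift) as u8;
        let new_byte = old_byte.wrapping_add(delta);
        let new = (current & !mask) | ((new_byte as u32) << shift);
        // Retry if another writer changed the word in the meantime
        // (or if the weak exchange failed spuriously).
        match word.compare_exchange_weak(current, new, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(_) => return old_byte,
            Err(actual) => current = actual,
        }
    }
}
```

The point of the sketch is that an 8-bit or 16-bit atomic operation built this way still depends on a 32-bit read-modify-write primitive, which the base ISA without the A extension does not provide.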
I don't think changing the More details in here: #81752 (comment). |
As an analogy, this should be the same status as It seems as though this may be handled differently upstream in LLVM for RV32I targets, though that seems like something that should be addressed upstream for the sake of consistency. |
Trying this on a somewhat(?) simple project fails:

```
error: linking with `rust-lld` failed: exit status: 1
  |
  = note: "rust-lld" "-flavor" "gnu" "/tmp/rustcZzSyCr/symbols.o" […]
  = note: rust-lld: error: undefined symbol: __atomic_load_4
          >>> referenced by compiler_builtins.5931568b-cgu.83
          >>>               compiler_builtins-35b1260b00e1afde.compiler_builtins.5931568b-cgu.83.rcgu.o:(compiler_builtins::mem::memcpy::h4f55b8ec9004b8fa) in archive /home/simon/projects/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/riscv32i-unknown-none-elf/lib/libcompiler_builtins-35b1260b00e1afde.rlib
```

Which looks like #85736
I think that is the intended behaviour, even though we're setting the Along with this PR, we need to convince upstream LLVM that adding atomic load/stores for RISCVI targets is safe. The only way we can do that is if we can convey to LLVM that mixing true atomic load/stores with locking atomic CAS implementations is not possible. I think this is only possible under two conditions:
or the cas implementation is known to be lock-free The second seems quite difficult to prove, but the first one is easy. We just need a way of exposing this information to LLVM. |
I suppose another option is to implement the libcalls in software, inside https://github.com/rust-lang/compiler-builtins, but that would require convincing them to change their current stance which is
We would also have to abide by the rules stated above (not mixing real atomics with locking atomic implementations) but perhaps this is easier to verify on the Rust side? @Amanieu any thoughts on this? |
The As a work around, this generates on both

```rust
use core::cell::UnsafeCell;
use core::sync::atomic::{fence, Ordering};

pub struct MyAtomicU8(UnsafeCell<u8>);

unsafe impl Sync for MyAtomicU8 {}

impl MyAtomicU8 {
    pub fn store(&self, value: u8, ordering: Ordering) {
        fence(ordering);
        unsafe { self.0.get().write(value) }
    }

    pub fn load(&self, ordering: Ordering) -> u8 {
        let value = unsafe { self.0.get().read() };
        fence(ordering);
        value
    }
}
```

Does it look correct?
It does not look correct:
AFAIK, the only sound way currently available in pure Rust is to use inline assembly, like my portable-atomic crate does. (In this particular case, volatile read/write + fence would probably work too, but that is also not sound because volatile read/write are not guaranteed to be atomic. Actually, it seems LLVM uses instructions that are not guaranteed to be atomic in aarch64's volatile read/write.)
This is very helpful, thanks @taiki-e! |
Overall I'm in favor of this change, but it requires either one of the following to happen:
I'm happy with either approach.
No, atomic fences also enforce ordering on non-atomic operations. |
Thanks for weighing in @Amanieu! The LLVM changes are a bit out of my depth, but I could PR some To expand on my earlier comment about mixing lock-free load/stores with locking cas implementations, see this LLVM discussion thread: https://reviews.llvm.org/D47553. How do you think we should avoid that? Only provide libcall implementations if |
Looks like libcore uses |
The fundamental issue is summarized in the last comment here: this is unsound when mixing lock-based CAS operations with lock-free load/store operations. While this doesn't apply to Rust since we don't expose CAS operations on this target, we still shouldn't override the A better solution might be to explicitly lower atomic load/store to a volatile load/store + a fence in rustc or in the standard library.
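As an illustration of the unsoundness described above, the following sketch replays one bad interleaving deterministically on a single thread. Everything in it is hypothetical: `LockedCas` models a lock-based CAS split into its two steps (the lock itself is omitted, since a lock-free store would not take it anyway), and `replay_lost_store` is a made-up name for the scenario.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical lock-based CAS, split into two steps so that an
/// interleaving can be replayed deterministically. A real implementation
/// would hold a lock across both steps.
pub struct LockedCas<'a> {
    cell: &'a AtomicU32,
    matched: bool,
    new: u32,
}

impl<'a> LockedCas<'a> {
    /// Step 1 (under the lock): read the current value and compare.
    pub fn begin(cell: &'a AtomicU32, expected: u32, new: u32) -> Self {
        let matched = cell.load(Ordering::Relaxed) == expected;
        LockedCas { cell, matched, new }
    }

    /// Step 2 (still under the lock): write if the comparison matched.
    pub fn finish(self) -> bool {
        if self.matched {
            self.cell.store(self.new, Ordering::Relaxed);
        }
        self.matched
    }
}

/// Replays: CAS reads 0, a lock-free store of 42 lands, CAS writes 1.
/// Returns (cas_succeeded, final_value).
pub fn replay_lost_store() -> (bool, u32) {
    let cell = AtomicU32::new(0);
    let cas = LockedCas::begin(&cell, 0, 1);
    // Lock-free store: it does not take the CAS lock, so nothing stops it
    // from landing between the two CAS steps.
    cell.store(42, Ordering::Relaxed);
    let succeeded = cas.finish();
    (succeeded, cell.load(Ordering::Relaxed))
}
```

In this replay the CAS reports success and the final value is 1, so the store of 42 is lost. With a truly atomic CAS that outcome is impossible: if the store came first the CAS would fail (42 is not the expected 0), and if the store came second the final value would be 42.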
r? @Amanieu |
LLVM seems to be moving away from supporting atomic load/store independently of full atomic support. We need to make a decision of what to support in Rust: #99595 (comment) |
Hey @Amanieu, what would be the best way to open up a meta issue and get the embedded-wg involved? It seems like this is a change to llvm, and it might impact many embedded targets, potentially in a breaking way if atomics with just load/stores are removed for stable targets. |
☀️ Test successful - checks-actions |
@Amanieu I believe we need to also enable And, |
Finished benchmarking commit (90f0b24): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression

Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.

Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles: This benchmark run did not return any relevant results for this metric.

Binary size: This benchmark run did not return any relevant results for this metric.

Bootstrap: 650.501s -> 650.303s (-0.03%)
While this PR may be reverted, the performance results are noise, since it only modified RISC-V target specs, so: @rustbot label: +perf-regression-triaged |
OK, I can confirm this PR breaks the build for these targets with an "undefined symbol" error.
rust-lang/rust#98333 broke RISC-V targets without A-extension. This will be fixed by rust-lang/rust#114497 or rust-lang/rust#114499. ``` = note: rust-lld: error: undefined symbol: __atomic_load_4 >>> referenced by mod.rs:1242 (/rustc/eb088b8b9d98f1af1b0e61bbdcd8686e1b0db7b6/library/core/src/num/mod.rs:1242) >>> compiler_builtins-d066fd6ed508b6b5.compiler_builtins.b1b28d926042a9f7-cgu.004.rcgu.o:(compiler_builtins::mem::memcpy::he6d5500b219c1d3d) in archive /home/runner/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib/libcompiler_builtins-d066fd6ed508b6b5.rlib ```
rust-lang/rust#98333 broke RISC-V targets without A-extension. This will be fixed by rust-lang/rust#114497 or rust-lang/rust#114499. ``` = note: rust-lld: error: undefined symbol: __atomic_load_4 >>> referenced by uint_macros.rs:1230 (/home/runner/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/num/uint_macros.rs:1230) >>> compiler_builtins-a15f77f0f647aa99.compiler_builtins.eedcbccd0d1b9b88-cgu.1.rcgu.o:(compiler_builtins::mem::memcpy::hedd00e0c59d2a943) in archive /home/runner/work/portable-atomic/portable-atomic/target/riscv32im-unknown-none-elf/debug/deps/libcompiler_builtins-a15f77f0f647aa99.rlib ```
…nieu Revert rust-lang#98333 "Re-enable atomic loads and stores for all RISC-V targets" This reverts rust-lang#98333. As said in rust-lang#98333 (comment), `forced-atomics` target feature is also needed to enable atomic load/store on these targets (otherwise, libcalls are generated): https://godbolt.org/z/433qeG7vd However, `forced-atomics` target feature is currently broken (rust-lang#114153), so AFAIK, there is currently no way to enable atomic load/store (via core::intrinsics) on these targets properly. r? `@Amanieu`
…iaskrgr Rollup of 7 pull requests Successful merges: - rust-lang#114376 (Avoid exporting __rust_alloc_error_handler_should_panic more than once.) - rust-lang#114413 (Warn when #[macro_export] is applied on decl macros) - rust-lang#114497 (Revert rust-lang#98333 "Re-enable atomic loads and stores for all RISC-V targets") - rust-lang#114500 (Remove arm crypto target feature) - rust-lang#114566 (Store the laziness of type aliases in their `DefKind`) - rust-lang#114594 (Structurally normalize weak and inherent in new solver) - rust-lang#114596 (Rename method in `opt-dist`) r? `@ghost` `@rustbot` modify labels: rollup
Pass +forced-atomics feature for riscv32{i,im,imc}-unknown-none-elf As said in rust-lang#98333 (comment), `forced-atomics` target feature is also needed to enable atomic load/store on these targets (otherwise, libcalls are generated): https://godbolt.org/z/433qeG7vd ~~This PR is currently marked as a draft because:~~ - ~~`forced-atomics` target feature is currently broken (rust-lang#114153 EDIT: Fixed - ~~`forced-atomics` target feature has been added in LLVM 16 (llvm/llvm-project@f5ed0cb), but the current minimum LLVM version [is 15](https://github.com/rust-lang/rust/blob/90f0b24ad3e7fc0dc0e419c9da30d74629cd5736/src/bootstrap/llvm.rs#L557). In LLVM 15, the atomic load/store of these targets generates libcalls anyway.~~ EDIT: LLVM 15 has been dropped Depending on the policy on the minimum LLVM version for these targets, this may be blocked until the minimum LLVM version is increased to 16. r? `@Amanieu`