Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement LLVM's elementwise unordered atomic memory intrinsics #311

Merged
merged 1 commit into from
Aug 22, 2019

Conversation

Ralith
Copy link
Contributor

@Ralith Ralith commented Aug 20, 2019

Allows uses of intrinsics of the form llvm.(memcpy|memmove|memset).element.unordered.atomic.* to be linked. Unblocks rust-lang/rust#59155, rust-lang/rust#58599.

These functions are not based on an existing implementation in compiler-rt, because one does not appear to exist. Reviewing their introduction, this appears to be because existing users of these intrinsics are not compiler-rt users. Fortunately the semantics are simple, so I went ahead and implemented them as the trivial unordered-atomic-ification of the corresponding non-atomic operations.

LLVM calls these functions with intptrs rather, so I'm taking a minor liberty by representing them with accurate pointer types. If that's incorrect, it's easy to switch to explicit casts. I'm also assuming that rustc will always generate calls with usize for the third argument, which is true for rust-lang/rust#59155.

CC @joshlf

@Ralith
Copy link
Contributor Author

Ralith commented Aug 20, 2019

One thing I'm unclear on is whether this needs to account for targets that don't support a particular width of atomic operation by e.g. cfging things out. I don't see any evidence of that being done elsewhere, but there's not many examples to reference.

CI seems to be failing with linker errors like unresolved external symbol _ZN4core9panicking5panic17h6e86172dee32b6faE referenced in function _ZN4core6option15Option$LT$T$GT$6unwrap17hb3f9b8d564ddbf0cE, which is odd, as I don't think I've introduced anything more able to panic than existing code, or involving Option at all.

src/mem.rs Outdated Show resolved Hide resolved
@alexcrichton
Copy link
Member

The implementations here all look good to me, although I'm presuming this is all pretty experimental so I'm not necessarily reviewing too too closely. As for the CI errors it looks like this is related to possible panics in debug mode, although I also don't really know how they'd arise. I'd recommend doing what CI is doing (running the build then looking at the rlib) and trying to trace back from there to see why panics are included.

@Centril
Copy link

Centril commented Aug 21, 2019

cc also @RalfJung

@Ralith Ralith force-pushed the elementwise-unordered-atomics branch from 5b5bc58 to dbcc979 Compare August 21, 2019 17:53
@Ralith
Copy link
Contributor Author

Ralith commented Aug 21, 2019

Fixed CI by replacing division with intrinsics::unchecked_div and a for loop with a while loop.

src/mem.rs Outdated Show resolved Hide resolved
@alexcrichton
Copy link
Member

Interesting! Was both unchecked_div and the for loop required or just the for loop?

Allows uses of intrinsics of the form
llvm.(memcpy|memmove|memset).element.unordered.atomic.* to be linked.
@Ralith Ralith force-pushed the elementwise-unordered-atomics branch from dbcc979 to 997b86d Compare August 22, 2019 01:23
@Ralith
Copy link
Contributor Author

Ralith commented Aug 22, 2019

Both are definitely necessary.

@alexcrichton alexcrichton merged commit bc57f53 into rust-lang:master Aug 22, 2019
@alexcrichton
Copy link
Member

Thanks!

@Ralith Ralith deleted the elementwise-unordered-atomics branch August 22, 2019 23:02

fn memset_element_unordered_atomic<T>(s: *mut T, c: u8, bytes: usize)
where
T: Copy + From<u8> + Shl<u32, Output = T> + BitOr<T, Output = T>,
Copy link
Member

@RalfJung RalfJung Aug 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks odd, seems worth adding a comment explaining how exactly the value x being written is computed? Seems like you are making assumptions about the trait implementations here.

What is the reason this does not match the normal memset, which works on *mut u8?

Copy link
Contributor Author

@Ralith Ralith Aug 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is that we must atomically store an entire T at a time, which would be impossible if we worked byte-at-a-time. Instead, we build up a T made out of repeated cs, and atomically store that instead. What I'd really have liked here is T: PrimitiveInteger, but of course we don't have that.

Fortunately since this is not exported we don't need to worry too much about someone passing something gratuitously weird.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the spec says that e.g. the setting happens in 4-byte-atomic chunks?

LLVM is allowed to transform adjacent subsequent atomic operations into a single one. Basically what you are doing here is realizing that optimization by hand.

What I'd really have liked here is T: PrimitiveInteger, but of course we don't have that.

Fair. Could you add a comment explaining that?
I also realized this function is private, so we actually know it will be called only with unsigned primitive integer types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the spec says that e.g. the setting happens in 4-byte-atomic chunks?

Yes, writing each complete T atomically is what distinguishes this from a regular memset.

LLVM is allowed to transform adjacent subsequent atomic operations into a single one. Basically what you are doing here is realizing that optimization by hand.

Right, but we need to guarantee this happens for correctness, so relying on an optimization wouldn't be appropriate.


fn memcpy_element_unordered_atomic<T: Copy>(dest: *mut T, src: *const T, bytes: usize) {
unsafe {
let n = unchecked_div(bytes, mem::size_of::<T>());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we assume that bytes is a multiple of size_of::<T>? If not, what happens when there is rounding going on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you could use https://doc.rust-lang.org/nightly/std/intrinsics/fn.exact_div.html then. Also IMO this is a precondition worth stating in a comment here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, didn't see that one. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Resolved in #312.)

@eddyb
Copy link
Member

eddyb commented Apr 10, 2020

Would it be possible to gate this on LLVM version? System LLVM 7 and LLVM 8 versions don't seem to allow building Rust's standard library for a i686-unknown-linux-gnu target, e.g. LLVM 8 fails with:

 cmpxchg instructions must be atomic.
  %10 = cmpxchg i64* %9, i64 0, i64 0 unordered not_atomic
in function __llvm_memcpy_element_unordered_atomic_8
LLVM ERROR: Broken function found, compilation aborted!
error: could not compile `compiler_builtins`.

And LLVM 7 just segfaults, which made rust-lang/rust#70989 confusing to debug (I initially assumed rustc was getting miscompiled, but that doesn't seem likely and the LLVM 8 failure explains it).

I will likely attempt to find a target that doesn't have this problem, since it seems easier if possible.

@eddyb
Copy link
Member

eddyb commented Apr 10, 2020

i686-unknown-linux-gnu has:

    base.max_atomic_width = Some(64);

But i586-unknown-linux-gnu doesn't have that, so there's hope still.

EDIT: nope, that's not enough... rust-lang/rust#70989 (comment) still segfaulted.

EDIT2: yeah okay I completely missed this part (so it inherits 64-bit atomics):

let mut base = super::i686_unknown_linux_gnu::target()?;

EDIT3: armv5te-unknown-linux-gnueabi has:

            max_atomic_width: Some(32),

and I've found through testing that building libstd for it works with LLVM 7.

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 13, 2020
…crum

 ci: run mir-opt tests on PR CI also as 32-bit (for `EMIT_MIR_FOR_EACH_BIT_WIDTH`).

Background: rust-lang#69916 and [`src/test/mir-opt/README.md`](https://github.com/rust-lang/rust/blob/master/src/test/mir-opt/README.md):
> By default 32 bit and 64 bit targets use the same dump files, which can be problematic in the
presence of pointers in constants or other bit width dependent things. In that case you can add
>
> ```
> // EMIT_MIR_FOR_EACH_BIT_WIDTH
> ```
>
> to your test, causing separate files to be generated for 32bit and 64bit systems.

However, if you change the output of such a test (intentionally or not), or if you add a test and it varies between 32-bit and 64-bit platforms, you have to run this command (for a x64 linux host):
`./x.py test --stage 1 --target x86_64-unknown-linux-gnu --target i686-unknown-linux-gnu --bless  src/test/mir-opt`

Otherwise, bors trying to merge the PR will fail, since we test 32-bit targets there.
But we don't on PR CI, which means there's no way the PR author would know (unless they were burnt by this already and know what to look for).

This PR resolves that by running `mir-opt` tests for ~~`i686-unknown-linux-gnu`~~, on PR CI.
**EDIT**: switched to `armv5te-unknown-linux-gnueabi` to work around LLVM 7 crashes (see rust-lang/compiler-builtins#311 (comment)), found during testing.

cc @rust-lang/wg-mir-opt @rust-lang/infra
@RalfJung
Copy link
Member

RalfJung commented May 9, 2020

System LLVM 7 and LLVM 8 versions don't seem to allow building Rust's standard library for a i686-unknown-linux-gnu target

Even if we have a work-around, isn't that still a pretty serious bug?

@RalfJung
Copy link
Member

Opened issue: #356

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants