Use -0.0 in intrinsics::simd::reduce_add_unordered #130325

Merged
merged 1 commit into from
Sep 16, 2024
4 changes: 2 additions & 2 deletions compiler/rustc_codegen_llvm/src/intrinsic.rs
@@ -2090,14 +2090,14 @@ fn generic_simd_intrinsic<'ll, 'tcx>(
};
}

- arith_red!(simd_reduce_add_ordered: vector_reduce_add, vector_reduce_fadd, true, add, 0.0);
+ arith_red!(simd_reduce_add_ordered: vector_reduce_add, vector_reduce_fadd, true, add, -0.0);
arith_red!(simd_reduce_mul_ordered: vector_reduce_mul, vector_reduce_fmul, true, mul, 1.0);
arith_red!(
simd_reduce_add_unordered: vector_reduce_add,
vector_reduce_fadd_reassoc,
false,
add,
- 0.0
+ -0.0
);
arith_red!(
simd_reduce_mul_unordered: vector_reduce_mul,
29 changes: 29 additions & 0 deletions tests/assembly/simd/reduce-fadd-unordered.rs
@@ -0,0 +1,29 @@
//@ revisions: x86_64 aarch64
//@ assembly-output: emit-asm
//@ compile-flags: --crate-type=lib -O
//@[aarch64] only-aarch64
//@[x86_64] only-x86_64
//@[x86_64] compile-flags: -Ctarget-feature=+sse3
#![feature(portable_simd)]
#![feature(core_intrinsics)]
use std::intrinsics::simd as intrinsics;
use std::simd::*;
// Regression test for https://github.com/rust-lang/rust/issues/130028
// This intrinsic produces much worse code if you use +0.0 instead of -0.0 because
// +0.0 isn't as easy to algebraically reassociate, even using LLVM's reassoc attribute!
// It would emit about an extra fadd, depending on the architecture.

// CHECK-LABEL: reduce_fadd_negative_zero
pub unsafe fn reduce_fadd_negative_zero(v: f32x4) -> f32 {
// x86_64: addps
// x86_64-NEXT: movshdup
// x86_64-NEXT: addss
// x86_64-NOT: xorps

// aarch64: faddp
// aarch64-NEXT: faddp

// CHECK-NOT: {{f?}}add{{p?s*}}
// CHECK: ret
Contributor

In order to mitigate LVI vulnerabilities, `ret` instructions are rewritten as `popq %rax; lfence; jmpq *rax` on the `x86_64-fortanix-unknown-sgx` target, so this test currently fails on that platform. Can this test be ignored for the SGX target, please?

Member Author

This sort of thing is a constant issue with the SGX target. Please PR compiletest with an appropriate modification that handles this issue globally without having to modify each and every single test with "oh yeah, and SGX is special, as usual".

Member Author

I realize I have probably indulged you significantly in the past and I wish to be clear, I do appreciate that Fortanix actually runs the tests in their CI, unlike some, but I must at this point refer to the target tier policy:

Tier 2 targets must not impose burden on the authors of pull requests, or other developers in the community, to ensure that tests pass for the target. In particular, do not post comments (automated or manual) on a PR that derail or suggest a block on the PR based on tests failing for the target. Do not send automated messages or notifications (via any medium, including via @) to a PR author or others involved with a PR regarding the PR breaking tests on a tier 2 target, unless they have opted into such messages.

Like we really need proper turnkey cross-compile testing support per #130375 or even just an SGX exception built in to compiletest or something, twiddling every single test isn't really sustainable for you or for me.

Contributor

I don't really understand what the appropriate compiletest modification here would be -- unless you are suggesting to skip all assembly tests on SGX?

Assembly tests in general are very finicky, you're lucky if they merge in less than 3 cycles, for one reason or another.

Member Author

A flat skip wouldn't work for them, a nonzero number of assembly tests are specifically for SGX-related codegen.

Member Author

Probably, yeah, and that requirement has to go away, which is why I opened #130375

Member Author

Did you not dereference the pointer the first time I linked it?

Member

Maybe only-x86_64 should exclude SGX, and we have a separate only-sgx for tests that want to run on SGX?

Contributor

Having only-x86_64 exclude SGX is odd, as SGX is only present on x86_64 platforms. Hence the platform is x86_64-fortanix-unknown-sgx

Contributor

#130375 is an interesting approach, but I don't see how this would avoid issues like the SGX special case. There isn't a special flag you need to add for the test to succeed on SGX. It's the test itself that causes issues. For most of the exceptions we currently have for SGX, they're there because of the test using CHECK: ret just to denote that they want to find the end of the function. This usually works pretty well because most assembly languages across platforms have identical instructions.

intrinsics::simd_reduce_add_unordered(v)
}
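
For context on why the start value matters: under IEEE 754 default rounding, `x + (-0.0)` returns `x` for every non-NaN `x`, while `(-0.0) + (+0.0)` is `+0.0`. So `+0.0` is not a true additive identity, which is consistent with the test comment's remark that it is harder to fold away. The following is a minimal scalar sketch of that arithmetic only, not the intrinsic or the LLVM lowering; `reduce_add_with_init` and `is_negative_zero` are hypothetical helper names introduced for illustration.

```rust
/// Sequentially fold the lanes starting from `init`.
/// This is a scalar model of an ordered float add reduction (an assumption
/// for illustration; it is not how the intrinsic is implemented).
fn reduce_add_with_init(init: f32, lanes: &[f32]) -> f32 {
    lanes.iter().fold(init, |acc, &x| acc + x)
}

/// True if `x` has exactly the bit pattern of negative zero.
fn is_negative_zero(x: f32) -> bool {
    x.to_bits() == (-0.0f32).to_bits()
}

fn main() {
    // A vector whose lanes are all -0.0 is the case that distinguishes
    // the two start values.
    let lanes = [-0.0f32; 4];

    let from_pos = reduce_add_with_init(0.0, &lanes);
    let from_neg = reduce_add_with_init(-0.0, &lanes);

    // Starting from +0.0 loses the sign of the result; starting from -0.0
    // preserves it, so the initial add can be treated as a no-op.
    println!("start +0.0 -> negative zero: {}", is_negative_zero(from_pos));
    println!("start -0.0 -> negative zero: {}", is_negative_zero(from_neg));
}
```

For inputs that do not sum to a zero, both start values give bit-identical results; the difference only shows up in the sign of a zero result.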