Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ACP: add bool::select #468

Closed
orlp opened this issue Oct 22, 2024 · 23 comments
Closed

ACP: add bool::select #468

orlp opened this issue Oct 22, 2024 · 23 comments
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@orlp
Copy link

orlp commented Oct 22, 2024

Proposal

Problem statement

It is a common theme in performance-oriented programming to want to select between two outcomes, but without changing which code runs as this invokes branch prediction. Branch prediction is great when it works but has two downsides:

  1. It can be significantly slower than branchless code if the branch is unpredictable, like in sorting random data.
  2. It can have inconsistent performance (even if faster on average), which can be a deal-breaker in latency-sensitive applications.

Currently in Rust if you want to have a good chance of something being branchless you have to write your code in a particular way, and even then it regularly fails to compile to branchless instructions, depending on the mood of the compiler.

Motivating examples or use cases

There are tons of examples, I don't think I have to convince anyone familiar with performance-sensitive code of its use. Binary searches, sorting, filtering of data based on a condition, the applications are nearly endless. In performance-sensitive code, if you write an if where both sides of the if are cheap to evaluate and the condition is not 99%+ consistent there's a good chance a branchless select statement would be faster.

And of course there is the consistency argument even if the branch is 99%+ consistent in most cases, for example audio DSP code might want to ensure a kernel is always consistent in its speed to prevent audio crackling when a rare sequence of unpredictable samples come along, or high-frequency trading code may want to ensure it always has a response in a certain amount of nanoseconds.

Solution sketch

I propose we add the following method to the bool primitive type:

/// Selects one of two values based on the value of self.
///
/// Unlike an `if` statement both arguments are always evaluated. The compiler
/// will try to emit machine code that avoids control flow for this selection,
/// taking the same amount of time regardless of whether self is true or false.
const fn select<T>(self, true_val: T, false_val: T) -> T;

This interface is similar to the std::simd::Mask::select API, just for one value.

Alternatives

There is the select_unpredictable intrinsic. However, it is an intrinsic so it should not really be used. There is no viable alternative, even writing things such as [false_val, true_val][cond as usize] can and do get rewritten by the compiler to branches, or just generate plain terrible code. One can reach for inline assembly but this pessimizes surrounding code and is not portable.

As for bikeshedding about the name, I considered the following alternatives, in order of preference:

  • select_predicated (see predication)
  • select_eager
  • select_branchless
  • select_unpredictable (as the intrinsic was named)

However, I prefer just the simple select. I think the fact it always eagerly evaluates both arguments (as functions do) is already self-documenting enough to readers to show it is not a control flow structure, but rather a selection operator between two pre-computed values. I think it is fairly natural that this then in turn compiles to a branchless select, even when simply reading cond.select(a, b).

My reason for not favoring the current name of the intrinsic select_unpredictable is that (in my opinion) it confuses a problem with the solution. This can lead to confusing scenarios where the solution is applicable but the problem is not, e.g. someone could make an informed choice to use select_unpredictable while the boolean condition is in fact predictable - they just wanted consistency or avoid conditional jumps for other reasons.

Links and related work

@orlp orlp added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Oct 22, 2024
@the8472
Copy link
Member

the8472 commented Oct 22, 2024

This shouldn't go on the bool type itself. It's more something fit for the std::hint module

@orlp
Copy link
Author

orlp commented Oct 22, 2024

@the8472 It's not a hint, it's an actual computation - selecting between two values. In fact if you read my ACP a bit more carefully you'll notice I'm trying to move away from the concept of 'unpredictability' in the API completely, which is the only thing that could be hinted about in the first place.

@workingjubilee
Copy link
Member

workingjubilee commented Oct 22, 2024

std::hint is not std::intrinsics by a different name, and this is not quite proposing to expose the LLVMIR metadata that select_unpredictable emits directly.

@chorman0773
Copy link

I'd assume this would be allowed to generate a branch if no other codegen option exists.

@ChrisDenton
Copy link
Member

It does seem a bit weird to have this as a method on bool. Maybe some code examples would change my mind but it feels too niche to not be a separate function or a wrapper type.

@orlp
Copy link
Author

orlp commented Oct 22, 2024

I'd assume this would be allowed to generate a branch if no other codegen option exists.

Yes, the description should probably be expanded to explicitly mention this isn't a hard guarantee (even if compiling on a platform that has, e.g. cmov) and that it should not be relied upon for e.g. constant-time cryptography code.

@the8472
Copy link
Member

the8472 commented Oct 22, 2024

Which then sounds like merely a hint to codegen...

@workingjubilee
Copy link
Member

in the same sense that unchecked_add is, I suppose?

@the8472
Copy link
Member

the8472 commented Oct 22, 2024

You have a point. But I'm not convinced yet. Unlike in SIMD llvm does have optimizations that turns cmovs into branches. So there's a difference between computing two values, picking one and then maybe the compiler transforms it depending on what it thinks is profitable and explicitly telling the compiler that you know better.
simd Mask::select doesn't have that because simd generally doesn't do branch prediction (maybe some stuff inside GPUs does?)

@chorman0773
Copy link

None of the functions in core::hint have any actual runtime behaviour (unreachable_unchecked and assert_unchecked are either UB or no-ops, black_box is an identity function, and spin_loop is a no-op). bool::select would have actual behaviour, so it definitely doesn't belong in core::hint.

@kennytm
Copy link
Member

kennytm commented Oct 23, 2024

The proposed API does not fit into std::hint. However one could introduce it actually like a hint as

if std::hint::unpredictable(boolean) {
    true_value
} else {
    false_value
}

This should be "optimized" to core::intrinsics::select_unpredicatable if both true_value and false_value can be inlined and side-effect-free so that evaluating both branches is beneficial1. This mirrors clang's __builtin_unpredictable intrinsic. Additionally, like __builtin_unpredictable it can be extended to support

match std::hint::unpredictable(integer) {
    0 => a,
    1 => b,
    3 => c,
    9 => d,
    27 => e,
    _ => etc,
}

OTOH in rust-lang/rust#120370 core::intrinsics::{likely, unlikely} are being reimplemented in terms of cold_path() because the intrinsics were broken, departing from hinting the condition and instead hinting the branches.

According to the LLVM docs the !unpredictable metadata are supposed to "be attached to any branch or switch instruction", so it may be better aligned with LLVM to spell it as

#[unpredictable]
if boolean { true_value } else { false_value }

#[unpredictable]
match integer { _ => etc }

Footnotes

  1. unfortunately LLVM will lose the !unpredictable metadata if we simply define std::hint::unpredictable(boolean) as core::intrinsics::select_unpredicatable(boolean, true, false), but this may be just an LLVM bug.

@orlp
Copy link
Author

orlp commented Oct 23, 2024

@kennytm If true_value or false_value have side-effects such as panics (which could be implicit, e.g. writing v[0]) or have invisible potential side-effects since it contains a non-inlined function call, you'd need to write

let true_value = true_expr;
let false_value = false_expr;
#[unpredictable]
if boolean { true_value } else { false_value }

otherwise the Rust compiler is not allowed to write it using eager evaluation + a cmov/csel.

I'd really much rather just write

boolean.select(true_expr, false_expr)

which much closer matches my intent. The whole point of select and branchless programming in general is that there is no control flow (if possible), and a construct using if is very much the wrong way to express this in my opinion.

The fact that the API bool::select always eagerly evaluates both arguments is by-design, and crucial. That is the fundamental property of control-flow free programming: evaluate both options and use indexing, arithmetic, cmov, csel, etc to select the result you want. It's not just a hint, it's a programming paradigm.

@workingjubilee
Copy link
Member

workingjubilee commented Oct 23, 2024

@kennytm

According to the LLVM docs the !unpredictable metadata are supposed to "be attached to any branch or switch instruction", so it may be better aligned with LLVM to spell it as

That documentation is incorrect and outdated. It does in fact work on select. There is considerable code to propagate it and numerous tests to assure it survives the infamous "cmov-to-branch" pass. It is possible the test coverage is not comprehensive enough, but that's compiler bugs for you.

@kennytm
Copy link
Member

kennytm commented Oct 23, 2024

@workingjubilee I didn't mean !unpredictable did not work on select, I mean !unpredictable is applied on an IR instruction rather than the operator, so #[unpredictable] if a { b } else { c } and a.select(b, c) are closer to what LLVM emits than if unpredictable(a) { b } else { c }.

The bug (?) I mentioned in the footnote is that

#![feature(core_intrinsics)]
pub fn select_unpredictable(b: bool, t: u32, f: u32) -> u32 {
    if std::intrinsics::select_unpredictable(b, true, false) {
        t
    } else {
        f
    }
}

produces

  %t.f = select i1 %b, i32 %t, i32 %f
  ret i32 %t.f

instead of this when you use the intrinsic directly

  %0 = select i1 %b, i32 %t, i32 %f, !unpredictable !3
  ret i32 %0

@workingjubilee
Copy link
Member

I meant the select LLVMIR instruction.

Did you verify !unpredictable disappears using Godbolt's LLVMIR view of Rust code?

I have found evidence to suggest it is simply incorrect.

@workingjubilee
Copy link
Member

workingjubilee commented Oct 23, 2024

Oh. If you're just introducing the knowledge that trying to reiterate clang's implementation of __builtin_unpredictable (which resembles the intrinsics::unlikely API) is horribly broken, that is true. I am just confused as to why you would mention the incorrect documentation, then.

@kennytm
Copy link
Member

kennytm commented Oct 23, 2024

@workingjubilee if https://llvm.org/docs/LangRef.html is not correct then why it is not fixed 😅, the title literally said "LLVM 20.0.0git documentation" and reachable from https://llvm.org/docs/Reference.html#llvm-irhttps://llvm.org/docs/#documentationhttps://llvm.org/. Also I don't see how that phrase I've quoted implies attaching !unpredictable to select is prohibited.

godbolt for reference.

@Voultapher
Copy link

For what it's worth, this API gets a +1 from me. Efficient programs need ways to avoid expensive branch mispredicts. Generally branchless - or less confusingly named jumpless - programming is achieved via a small toolkit of building blocks:

  • Conditional increment e.g. num_x += cond as usize, generates something like inc or adc.
  • Perform both sides of the branch unconditionally and mask or ignore unwanted side.
  • Value selection e.g. let x = if cond { a } else { b } generates something like cmov or setb with luck.
  • Numeric logic, xor, bit and etc. only applicable for offsets or if the data handling isn't generic.
  • SIMD masking.

Out of this non-exhaustive list of options one relies particularly heavy on compiler co-operation, branchless value selection.

@scottmcm
Copy link
Member

scottmcm commented Nov 5, 2024

I wonder if it'd be worth adding both select and select_unpredictable, the same way that LLVM doesn't say that all selects are necessarily unpredictable.

I wouldn't mind using select to emphasize that I'm intentionally calculating the values in both paths, even if I'm fine letting LLVM (and PGO) decide whether it's really best as jumps or as conditional moves.

And the version with the most constraints on codegen having a longer name is probably good.

(And I'm also tempted to add it to MIR, because needing 4 basic blocks for every ?:-equivalent is wasteful, and a version that doesn't have different implications from the branching would be nice so that optimizations could optimize let z = if b { x } else { y }; to a select without forcing !unpredictable. That doesn't need a public API, though, of course.)

@Voultapher
Copy link

Voultapher commented Nov 5, 2024

@scottmcm I had a similar thought as well. There are different scenarios and and either select or select_unpredictable by themselves aren't necessarily a good fit for all of them. In theory we could add only select_unpredictable, or called select with the LLVM flags for unpredictable, and have if cond { a } else { b} favor generating branchy code, but I suspect that would break a substantial amount of existing optimizations, so probably best to go with a new opt-in API and leave the current code-gen untouched.

When a programmer writes if ... it's not always clear what the intent is, and without PGO the compiler has to guess the mispredict penalty. LLVM tends to prefer branchless code while gcc is less aggressive about making such optimization. For example here https://cpp.godbolt.org/z/esxsdnvxf clang optimizes both functions to use cmov while gcc decides to use cmov in one case and a branch in the other. So giving developers a chance to express their intent in both cases seems like a good idea to me.

@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/libs-api meeting.

We'd like to see this called select_unpredictable, with the justification that this would be a pessimization that you shouldn't use if your condition is predictable.

Otherwise: 👍 🚢

@joshtriplett joshtriplett added the ACP-accepted API Change Proposal is accepted (seconded with no objections) label Nov 5, 2024
@Amanieu
Copy link
Member

Amanieu commented Nov 5, 2024

Right, you don't want to be using this unless you specifically want to tell the compiler that the condition is unpredictable by a branch predictor. Otherwise it is better to let LLVM's heuristics decide whether to use a branch or conditional move.

We also rejected other names such as select_branchless because we don't actually guarantee any particular compiler output (e.g. platforms without conditional move instructions). Instead, we only give a hint that the condition is likely to be unpredictable to a branch predictor.

@joboet
Copy link
Member

joboet commented Dec 6, 2024

I've implemented this in rust-lang/rust#133964.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 4, 2025
…oss35

core: implement `bool::select_unpredictable`

Tracking issue: rust-lang#133962
ACP: rust-lang/libs-team#468
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jan 4, 2025
Rollup merge of rust-lang#133964 - joboet:select_unpredictable, r=tgross35

core: implement `bool::select_unpredictable`

Tracking issue: rust-lang#133962
ACP: rust-lang/libs-team#468
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests