ACP: add bool::select #468

orlp · 2024-10-22T23:00:10Z

Proposal

Problem statement

It is a common theme in performance-oriented programming to want to select between two outcomes, but without changing which code runs as this invokes branch prediction. Branch prediction is great when it works but has two downsides:

It can be significantly slower than branchless code if the branch is unpredictable, like in sorting random data.
It can have inconsistent performance (even if faster on average), which can be a deal-breaker in latency-sensitive applications.

Currently in Rust if you want to have a good chance of something being branchless you have to write your code in a particular way, and even then it regularly fails to compile to branchless instructions, depending on the mood of the compiler.

Motivating examples or use cases

There are tons of examples, I don't think I have to convince anyone familiar with performance-sensitive code of its use. Binary searches, sorting, filtering of data based on a condition, the applications are nearly endless. In performance-sensitive code, if you write an if where both sides of the if are cheap to evaluate and the condition is not 99%+ consistent there's a good chance a branchless select statement would be faster.

And of course there is the consistency argument even if the branch is 99%+ consistent in most cases, for example audio DSP code might want to ensure a kernel is always consistent in its speed to prevent audio crackling when a rare sequence of unpredictable samples come along, or high-frequency trading code may want to ensure it always has a response in a certain amount of nanoseconds.

Solution sketch

I propose we add the following method to the bool primitive type:

/// Selects one of two values based on the value of self.
///
/// Unlike an `if` statement both arguments are always evaluated. The compiler
/// will try to emit machine code that avoids control flow for this selection,
/// taking the same amount of time regardless of whether self is true or false.
const fn select<T>(self, true_val: T, false_val: T) -> T;

This interface is similar to the std::simd::Mask::select API, just for one value.

Alternatives

There is the select_unpredictable intrinsic. However, it is an intrinsic so it should not really be used. There is no viable alternative, even writing things such as [false_val, true_val][cond as usize] can and do get rewritten by the compiler to branches, or just generate plain terrible code. One can reach for inline assembly but this pessimizes surrounding code and is not portable.

As for bikeshedding about the name, I considered the following alternatives, in order of preference:

select_predicated (see predication)
select_eager
select_branchless
select_unpredictable (as the intrinsic was named)

However, I prefer just the simple select. I think the fact it always eagerly evaluates both arguments (as functions do) is already self-documenting enough to readers to show it is not a control flow structure, but rather a selection operator between two pre-computed values. I think it is fairly natural that this then in turn compiles to a branchless select, even when simply reading cond.select(a, b).

My reason for not favoring the current name of the intrinsic select_unpredictable is that (in my opinion) it confuses a problem with the solution. This can lead to confusing scenarios where the solution is applicable but the problem is not, e.g. someone could make an informed choice to use select_unpredictable while the boolean condition is in fact predictable - they just wanted consistency or avoid conditional jumps for other reasons.

Links and related work

select_unpredictable

The text was updated successfully, but these errors were encountered:

the8472 · 2024-10-22T23:03:26Z

This shouldn't go on the bool type itself. It's more something fit for the std::hint module

orlp · 2024-10-22T23:03:52Z

@the8472 It's not a hint, it's an actual computation - selecting between two values. In fact if you read my ACP a bit more carefully you'll notice I'm trying to move away from the concept of 'unpredictability' in the API completely, which is the only thing that could be hinted about in the first place.

workingjubilee · 2024-10-22T23:04:33Z

std::hint is not std::intrinsics by a different name, and this is not quite proposing to expose the LLVMIR metadata that select_unpredictable emits directly.

chorman0773 · 2024-10-22T23:18:49Z

I'd assume this would be allowed to generate a branch if no other codegen option exists.

ChrisDenton · 2024-10-22T23:19:59Z

It does seem a bit weird to have this as a method on bool. Maybe some code examples would change my mind but it feels too niche to not be a separate function or a wrapper type.

orlp · 2024-10-22T23:21:25Z

I'd assume this would be allowed to generate a branch if no other codegen option exists.

Yes, the description should probably be expanded to explicitly mention this isn't a hard guarantee (even if compiling on a platform that has, e.g. cmov) and that it should not be relied upon for e.g. constant-time cryptography code.

the8472 · 2024-10-22T23:21:51Z

Which then sounds like merely a hint to codegen...

workingjubilee · 2024-10-22T23:25:32Z

in the same sense that unchecked_add is, I suppose?

the8472 · 2024-10-22T23:44:30Z

You have a point. But I'm not convinced yet. Unlike in SIMD llvm does have optimizations that turns cmovs into branches. So there's a difference between computing two values, picking one and then maybe the compiler transforms it depending on what it thinks is profitable and explicitly telling the compiler that you know better.
simd Mask::select doesn't have that because simd generally doesn't do branch prediction (maybe some stuff inside GPUs does?)

chorman0773 · 2024-10-23T00:23:12Z

None of the functions in core::hint have any actual runtime behaviour (unreachable_unchecked and assert_unchecked are either UB or no-ops, black_box is an identity function, and spin_loop is a no-op). bool::select would have actual behaviour, so it definitely doesn't belong in core::hint.

kennytm · 2024-10-23T05:43:13Z

The proposed API does not fit into std::hint. However one could introduce it actually like a hint as

if std::hint::unpredictable(boolean) {
    true_value
} else {
    false_value
}

This should be "optimized" to core::intrinsics::select_unpredicatable if both true_value and false_value can be inlined and side-effect-free so that evaluating both branches is beneficial¹. This mirrors clang's __builtin_unpredictable intrinsic. Additionally, like __builtin_unpredictable it can be extended to support

match std::hint::unpredictable(integer) {
    0 => a,
    1 => b,
    3 => c,
    9 => d,
    27 => e,
    _ => etc,
}

OTOH in rust-lang/rust#120370 core::intrinsics::{likely, unlikely} are being reimplemented in terms of cold_path() because the intrinsics were broken, departing from hinting the condition and instead hinting the branches.

According to the LLVM docs the !unpredictable metadata are supposed to "be attached to any branch or switch instruction", so it may be better aligned with LLVM to spell it as

#[unpredictable]
if boolean { true_value } else { false_value }

#[unpredictable]
match integer { _ => etc }

unfortunately LLVM will lose the !unpredictable metadata if we simply define std::hint::unpredictable(boolean) as core::intrinsics::select_unpredicatable(boolean, true, false), but this may be just an LLVM bug. ↩

orlp · 2024-10-23T09:07:26Z

@kennytm If true_value or false_value have side-effects such as panics (which could be implicit, e.g. writing v[0]) or have invisible potential side-effects since it contains a non-inlined function call, you'd need to write

let true_value = true_expr;
let false_value = false_expr;
#[unpredictable]
if boolean { true_value } else { false_value }

otherwise the Rust compiler is not allowed to write it using eager evaluation + a cmov/csel.

I'd really much rather just write

boolean.select(true_expr, false_expr)

which much closer matches my intent. The whole point of select and branchless programming in general is that there is no control flow (if possible), and a construct using if is very much the wrong way to express this in my opinion.

The fact that the API bool::select always eagerly evaluates both arguments is by-design, and crucial. That is the fundamental property of control-flow free programming: evaluate both options and use indexing, arithmetic, cmov, csel, etc to select the result you want. It's not just a hint, it's a programming paradigm.

workingjubilee · 2024-10-23T09:56:06Z

@kennytm

According to the LLVM docs the !unpredictable metadata are supposed to "be attached to any branch or switch instruction", so it may be better aligned with LLVM to spell it as

That documentation is incorrect and outdated. It does in fact work on select. There is considerable code to propagate it and numerous tests to assure it survives the infamous "cmov-to-branch" pass. It is possible the test coverage is not comprehensive enough, but that's compiler bugs for you.

kennytm · 2024-10-23T12:52:47Z

@workingjubilee I didn't mean !unpredictable did not work on select, I mean !unpredictable is applied on an IR instruction rather than the operator, so #[unpredictable] if a { b } else { c } and a.select(b, c) are closer to what LLVM emits than if unpredictable(a) { b } else { c }.

The bug (?) I mentioned in the footnote is that

#![feature(core_intrinsics)]
pub fn select_unpredictable(b: bool, t: u32, f: u32) -> u32 {
    if std::intrinsics::select_unpredictable(b, true, false) {
        t
    } else {
        f
    }
}

produces

  %t.f = select i1 %b, i32 %t, i32 %f
  ret i32 %t.f

instead of this when you use the intrinsic directly

  %0 = select i1 %b, i32 %t, i32 %f, !unpredictable !3
  ret i32 %0

workingjubilee · 2024-10-23T17:26:58Z

I meant the select LLVMIR instruction.

Did you verify !unpredictable disappears using Godbolt's LLVMIR view of Rust code?

I have found evidence to suggest it is simply incorrect.

workingjubilee · 2024-10-23T17:31:04Z

Oh. If you're just introducing the knowledge that trying to reiterate clang's implementation of __builtin_unpredictable (which resembles the intrinsics::unlikely API) is horribly broken, that is true. I am just confused as to why you would mention the incorrect documentation, then.

kennytm · 2024-10-23T18:32:29Z

@workingjubilee if https://llvm.org/docs/LangRef.html is not correct then why it is not fixed 😅, the title literally said "LLVM 20.0.0git documentation" and reachable from https://llvm.org/docs/Reference.html#llvm-ir ← https://llvm.org/docs/#documentation ← https://llvm.org/. Also I don't see how that phrase I've quoted implies attaching !unpredictable to select is prohibited.

godbolt for reference.

Voultapher · 2024-11-04T09:02:56Z

For what it's worth, this API gets a +1 from me. Efficient programs need ways to avoid expensive branch mispredicts. Generally branchless - or less confusingly named jumpless - programming is achieved via a small toolkit of building blocks:

Conditional increment e.g. num_x += cond as usize, generates something like inc or adc.
Perform both sides of the branch unconditionally and mask or ignore unwanted side.
Value selection e.g. let x = if cond { a } else { b } generates something like cmov or setb with luck.
Numeric logic, xor, bit and etc. only applicable for offsets or if the data handling isn't generic.
SIMD masking.

Out of this non-exhaustive list of options one relies particularly heavy on compiler co-operation, branchless value selection.

scottmcm · 2024-11-05T08:04:53Z

I wonder if it'd be worth adding both select and select_unpredictable, the same way that LLVM doesn't say that all selects are necessarily unpredictable.

I wouldn't mind using select to emphasize that I'm intentionally calculating the values in both paths, even if I'm fine letting LLVM (and PGO) decide whether it's really best as jumps or as conditional moves.

And the version with the most constraints on codegen having a longer name is probably good.

(And I'm also tempted to add it to MIR, because needing 4 basic blocks for every ?:-equivalent is wasteful, and a version that doesn't have different implications from the branching would be nice so that optimizations could optimize let z = if b { x } else { y }; to a select without forcing !unpredictable. That doesn't need a public API, though, of course.)

Voultapher · 2024-11-05T09:45:50Z

@scottmcm I had a similar thought as well. There are different scenarios and and either select or select_unpredictable by themselves aren't necessarily a good fit for all of them. In theory we could add only select_unpredictable, or called select with the LLVM flags for unpredictable, and have if cond { a } else { b} favor generating branchy code, but I suspect that would break a substantial amount of existing optimizations, so probably best to go with a new opt-in API and leave the current code-gen untouched.

When a programmer writes if ... it's not always clear what the intent is, and without PGO the compiler has to guess the mispredict penalty. LLVM tends to prefer branchless code while gcc is less aggressive about making such optimization. For example here https://cpp.godbolt.org/z/esxsdnvxf clang optimizes both functions to use cmov while gcc decides to use cmov in one case and a branch in the other. So giving developers a chance to express their intent in both cases seems like a good idea to me.

joshtriplett · 2024-11-05T18:56:31Z

We discussed this in today's @rust-lang/libs-api meeting.

We'd like to see this called select_unpredictable, with the justification that this would be a pessimization that you shouldn't use if your condition is predictable.

Otherwise: 👍 🚢

Amanieu · 2024-11-05T19:02:58Z

Right, you don't want to be using this unless you specifically want to tell the compiler that the condition is unpredictable by a branch predictor. Otherwise it is better to let LLVM's heuristics decide whether to use a branch or conditional move.

We also rejected other names such as select_branchless because we don't actually guarantee any particular compiler output (e.g. platforms without conditional move instructions). Instead, we only give a hint that the condition is likely to be unpredictable to a branch predictor.

joboet · 2024-12-06T14:26:46Z

I've implemented this in rust-lang/rust#133964.

…oss35 core: implement `bool::select_unpredictable` Tracking issue: rust-lang#133962 ACP: rust-lang/libs-team#468

Rollup merge of rust-lang#133964 - joboet:select_unpredictable, r=tgross35 core: implement `bool::select_unpredictable` Tracking issue: rust-lang#133962 ACP: rust-lang/libs-team#468

orlp added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Oct 22, 2024

joshtriplett added the ACP-accepted API Change Proposal is accepted (seconded with no objections) label Nov 5, 2024

Amanieu closed this as completed Nov 5, 2024

This was referenced Dec 6, 2024

Tracking Issue for bool::select_unpredictable rust-lang/rust#133962

Open

core: implement bool::select_unpredictable rust-lang/rust#133964

Merged

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 4, 2025

Rollup merge of rust-lang#133964 - joboet:select_unpredictable, r=tgr…

695da5b

…oss35 core: implement `bool::select_unpredictable` Tracking issue: rust-lang#133962 ACP: rust-lang/libs-team#468

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ACP: add bool::select #468

ACP: add bool::select #468

orlp commented Oct 22, 2024 •

edited

Loading

the8472 commented Oct 22, 2024

orlp commented Oct 22, 2024 •

edited

Loading

workingjubilee commented Oct 22, 2024 •

edited

Loading

chorman0773 commented Oct 22, 2024

ChrisDenton commented Oct 22, 2024

orlp commented Oct 22, 2024

the8472 commented Oct 22, 2024

workingjubilee commented Oct 22, 2024

the8472 commented Oct 22, 2024

chorman0773 commented Oct 23, 2024

kennytm commented Oct 23, 2024 •

edited

Loading

orlp commented Oct 23, 2024 •

edited

Loading

workingjubilee commented Oct 23, 2024 •

edited

Loading

kennytm commented Oct 23, 2024

workingjubilee commented Oct 23, 2024

workingjubilee commented Oct 23, 2024 •

edited

Loading

kennytm commented Oct 23, 2024

Voultapher commented Nov 4, 2024

scottmcm commented Nov 5, 2024 •

edited

Loading

Voultapher commented Nov 5, 2024 •

edited

Loading

joshtriplett commented Nov 5, 2024

Amanieu commented Nov 5, 2024

joboet commented Dec 6, 2024

ACP: add bool::select #468

ACP: add bool::select #468

Comments

orlp commented Oct 22, 2024 • edited Loading

Proposal

Problem statement

Motivating examples or use cases

Solution sketch

Alternatives

Links and related work

the8472 commented Oct 22, 2024

orlp commented Oct 22, 2024 • edited Loading

workingjubilee commented Oct 22, 2024 • edited Loading

chorman0773 commented Oct 22, 2024

ChrisDenton commented Oct 22, 2024

orlp commented Oct 22, 2024

the8472 commented Oct 22, 2024

workingjubilee commented Oct 22, 2024

the8472 commented Oct 22, 2024

chorman0773 commented Oct 23, 2024

kennytm commented Oct 23, 2024 • edited Loading

Footnotes

orlp commented Oct 23, 2024 • edited Loading

workingjubilee commented Oct 23, 2024 • edited Loading

kennytm commented Oct 23, 2024

workingjubilee commented Oct 23, 2024

workingjubilee commented Oct 23, 2024 • edited Loading

kennytm commented Oct 23, 2024

Voultapher commented Nov 4, 2024

scottmcm commented Nov 5, 2024 • edited Loading

Voultapher commented Nov 5, 2024 • edited Loading

joshtriplett commented Nov 5, 2024

Amanieu commented Nov 5, 2024

joboet commented Dec 6, 2024

orlp commented Oct 22, 2024 •

edited

Loading

orlp commented Oct 22, 2024 •

edited

Loading

workingjubilee commented Oct 22, 2024 •

edited

Loading

kennytm commented Oct 23, 2024 •

edited

Loading

orlp commented Oct 23, 2024 •

edited

Loading

workingjubilee commented Oct 23, 2024 •

edited

Loading

workingjubilee commented Oct 23, 2024 •

edited

Loading

scottmcm commented Nov 5, 2024 •

edited

Loading

Voultapher commented Nov 5, 2024 •

edited

Loading