Optimize common path of Once::doit #14174

stepancheg · 2014-05-13T10:25:57Z

Submitting this PR again because I cannot reopen #13349, and GitHub does not attach the new patch to that PR.

Optimize Once::doit: perform an optimistic check that initialization is
already completed. load is much cheaper than fetch_add, at least
on x86_64.
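To make the idea concrete, here is a minimal sketch in present-day Rust rather than the 2014 dialect used in this PR. `SketchOnce`, its fields, and `run_demo` are hypothetical names invented for illustration; the real `Once` of that era also managed a native mutex and its cleanup, which is left out here. The only point is the ordering: a plain atomic load is tried first, and the `fetch_add` read-modify-write is reached only while initialization may still be pending.

```rust
use std::sync::atomic::{AtomicIsize, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the 2014 `Once`; not the real type.
pub struct SketchOnce {
    /// Goes negative once initialization has finished.
    cnt: AtomicIsize,
    /// Serializes the callers that reach the slow path.
    lock: Mutex<()>,
}

impl SketchOnce {
    pub fn new() -> Self {
        SketchOnce { cnt: AtomicIsize::new(0), lock: Mutex::new(()) }
    }

    pub fn doit<F: FnOnce()>(&self, f: F) {
        // Optimistic check: a plain atomic load, no read-modify-write.
        if self.cnt.load(Ordering::SeqCst) < 0 {
            return;
        }
        // Without the check above, every call would pay for this fetch_add.
        let prev = self.cnt.fetch_add(1, Ordering::SeqCst);
        if prev < 0 {
            // Initialization finished in the meantime; reset the counter so
            // repeated increments can never bring it back up to zero.
            self.cnt.store(isize::MIN, Ordering::SeqCst);
            return;
        }
        let _guard = self.lock.lock().unwrap();
        // Only the first caller to take the lock still sees a positive count.
        if self.cnt.load(Ordering::SeqCst) > 0 {
            f();
            self.cnt.store(isize::MIN, Ordering::SeqCst);
        }
    }
}

/// Calls `doit` from four threads and returns how often the closure ran.
pub fn run_demo() -> usize {
    let once = Arc::new(SketchOnce::new());
    let runs = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let once = Arc::clone(&once);
            let runs = Arc::clone(&runs);
            thread::spawn(move || {
                for _ in 0..1000 {
                    once.doit(|| {
                        runs.fetch_add(1, Ordering::SeqCst);
                    });
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    runs.load(Ordering::SeqCst)
}
```

After the first successful call, every later `doit` returns from the load alone, which is the common path this PR targets.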

The patch was verified with this benchmark:

static mut o: one::Once = one::ONCE_INIT;
unsafe {
    loop {
        let start = time::precise_time_ns();
        let iters = 50000000u64;
        for _ in range(0, iters) {
            o.doit(|| { println!("once!"); });
        }
        let end = time::precise_time_ns();
        let ps_per_iter = 1000 * (end - start) / iters;
        println!("{} ps per iter", ps_per_iter);

        // confuse the optimizer
        o.doit(|| { println!("once!"); });
    }
}

Test executed on a Mac with a 2GHz Intel Core i7. Results:

20ns per iteration without patch
4ns per iteration with this patch applied
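For readers who want to repeat the measurement today, the following is a rough equivalent of the benchmark above in current Rust. It assumes `std::sync::Once::call_once` and `std::time::Instant` as stand-ins for `one::Once::doit` and `time::precise_time_ns`; `bench_call_once` and `runs` are hypothetical helper names, and the figures it produces are not comparable to the 2014 numbers. `iters` is assumed to be greater than zero.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;
use std::time::Instant;

static ONCE: Once = Once::new();
static RUNS: AtomicUsize = AtomicUsize::new(0);

/// Times `iters` calls to `call_once` and returns picoseconds per call.
pub fn bench_call_once(iters: u64) -> u128 {
    let start = Instant::now();
    for _ in 0..iters {
        // After the first call this only exercises the already-initialized path.
        ONCE.call_once(|| {
            RUNS.fetch_add(1, Ordering::Relaxed);
        });
    }
    let elapsed_ns = start.elapsed().as_nanos();
    // Same arithmetic as the original test: nanoseconds to picoseconds per iteration.
    1000 * elapsed_ns / u128::from(iters)
}

/// Number of times the initialization closure has actually run.
pub fn runs() -> usize {
    RUNS.load(Ordering::Relaxed)
}
```

A call such as `println!("{} ps per iter", bench_call_once(50_000_000));` mirrors the original loop. An optimizing build may simplify the loop, so very small results should be read with suspicion.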

Once::doit could be even faster (800ps per iteration) if the doit function
were split into a doit/doit_slow pair, with doit marked as
#[inline], like this:

#[inline(always)]
pub fn doit(&self, f: ||) {
    if self.cnt.load(atomics::SeqCst) < 0 {
        return
    }

    self.doit_slow(f);
}

fn doit_slow(&self, f: ||) { ... }
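The same split can be sketched in current Rust. This is an illustration under assumptions, not the code from this PR: `SplitOnce` and `count_runs` are hypothetical names, an `AtomicBool` with acquire/release ordering replaces the `cnt < 0` test with `SeqCst` shown above, and `#[cold]` plus `#[inline(never)]` are added to keep the slow path out of callers.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical type illustrating the fast/slow split; not the real `Once`.
pub struct SplitOnce {
    done: AtomicBool,
    lock: Mutex<()>,
}

impl SplitOnce {
    pub fn new() -> Self {
        SplitOnce { done: AtomicBool::new(false), lock: Mutex::new(()) }
    }

    /// Fast path: small enough to be inlined at each call site.
    #[inline]
    pub fn doit<F: FnOnce()>(&self, f: F) {
        if self.done.load(Ordering::Acquire) {
            return;
        }
        self.doit_slow(f);
    }

    /// Slow path: kept out of line so it does not bloat callers.
    #[cold]
    #[inline(never)]
    fn doit_slow<F: FnOnce()>(&self, f: F) {
        let _guard = self.lock.lock().unwrap();
        // Re-check under the lock: another caller may have finished first.
        if !self.done.load(Ordering::Relaxed) {
            f();
            self.done.store(true, Ordering::Release);
        }
    }
}

/// Calls `doit` `calls` times and returns how often the closure ran.
pub fn count_runs(calls: usize) -> usize {
    let once = SplitOnce::new();
    let mut runs = 0;
    for _ in 0..calls {
        once.doit(|| runs += 1);
    }
    runs
}
```

With this shape, a call made after initialization compiles down to one load and a branch at the call site, which is the kind of saving the 800ps figure above refers to.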

lilyball · 2014-05-13T18:16:20Z

The message on this PR doesn't describe what's actually being landed (both the PR description and the commit message). Can you please rewrite it to actually describe what's going on? You can put the extra information back in a comment for posterity's sake, but I'd rather not have the message suggest that it's being marked as #[inline] or being split.

stepancheg · 2014-05-14T10:33:08Z

Updated the patch and PR description to make it clear that doit is not split and not inlined.

Closes rust-lang#14210 (Make Vec.truncate() resilient against failure in Drop)
Closes rust-lang#14206 (Register new snapshots)
Closes rust-lang#14205 (use sched_yield on linux and freebsd)
Closes rust-lang#14204 (Add a crate for missing stubs from libcore)
Closes rust-lang#14201 (Render not_found with an absolute path to the rust stylesheet)
Closes rust-lang#14198 (update valgrind headers)
Closes rust-lang#14174 (Optimize common path of Once::doit)
Closes rust-lang#14162 (Print 'rustc' and 'rustdoc' as the command name for --version)
Closes rust-lang#14145 (Better strict version hash (SVH) computation)

Use sync::one::Once to fetch the mach_timebase_info only once when running precise_time_ns(). This helps because mach_timebase_info() is surprisingly inefficient. Also fix the order of operations when applying the timebase to the mach absolute time value.

This improves the time on my machine from

```
test tests::bench_precise_time_ns ... bench: 157 ns/iter (+/- 4)
```

to

```
test tests::bench_precise_time_ns ... bench: 38 ns/iter (+/- 3)
```

and it will get even faster once rust-lang#14174 lands.

lilyball mentioned this pull request May 15, 2014

Optimize and fix time::precise_time_ns() on macos #14216

Merged

bors closed this May 15, 2014

bors merged commit f853cf7 into rust-lang:master May 15, 2014

stepancheg deleted the once branch May 15, 2014 13:44
