
replace with_iterations with with_test_time #206

Merged
merged 2 commits into camshaft:master from Ekleog:with-test-time on Feb 26, 2024

Conversation

@Ekleog (Contributor) commented on Feb 24, 2024

I've noticed that some of my bolero tests are unbelievably slow to run. This is because bolero runs through the whole local corpus on each run, and my tests are fundamentally slow (~10 execs/second, with a multi-thousand-entry corpus).

My solution up to now has been to keep a separate corpus directory, but that means I cannot use the standard bolero commands, etc.

Instead, here is an idea: what if we replaced with_iterations with with_test_time? What matters to the user is the time the test takes, considering that the tests are random anyway, and computers can have vastly different speeds.

In addition, doing things this way allows setting a reasonable default test duration (1s) that's independent of how slow the tests themselves are, and that duration will stop the run even if it's still working through the corpus.
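
To make the proposal concrete, here is a minimal usage sketch (hedged: `check!`, `with_iterations`, and `for_each` are bolero's existing API, while `with_test_time` is the method this PR proposes, and the exact signature may differ from what gets merged):

```rust
use std::time::Duration;

#[test]
fn fuzz_target() {
    bolero::check!()
        // today: .with_iterations(1000) bounds the run by (valid) input count
        .with_test_time(Duration::from_secs(1)) // proposed: bound by wall-clock time
        .for_each(|input: &[u8]| {
            // exercise the code under test with `input`
            let _ = input.len();
        });
}
```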

WDYT? :)

@camshaft (Owner) left a comment:

I like the idea, though I think we should preserve the iterations option as well

```rust
}
Ok(false) => {
    invalid += 1;
    if invalid > self.iterations * 2 {
```
@camshaft (Owner):

Removing this is a regression. This check is here to prevent users from implementing unsatisfiable generators without knowing they're unsatisfiable:

```rust
let g = gen::<u64>().filter(|_| false);
check!().with_generator(g).for_each(|_| {});
```

Currently, this harness will panic with this message. After this change, it'll silently accept it.

@Ekleog (Contributor, author):

I tried to understand the check and missed that this was its goal; in practice it seems to validate that at least 1/3 (amortized) of the inputs are valid. Also, the fact that bolero counts only valid iterations, which I discovered while making this change, explains some weird behavior I had seen while trying to debug why some of my generators were taking so long to run 😅

Anyway, I reintroduced this check with the same (1/3) requirement, along with reintroducing the with_iterations parameter :)
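
For readers following along, here is a minimal, self-contained sketch of what such a guard can look like (an illustration under assumptions, not bolero's actual driver; the `run_tests`, `generate`, and `test` names are hypothetical, and only the 1/3 amortized threshold is taken from the snippet above):

```rust
use std::time::{Duration, Instant};

/// Hypothetical driver loop: run until the requested number of valid inputs
/// or the time budget is reached, and panic if fewer than ~1/3 of the
/// generated inputs are valid (amortized).
fn run_tests<G, T>(mut generate: G, mut test: T, iterations: usize, test_time: Duration)
where
    G: FnMut() -> Option<Vec<u8>>, // None models an input rejected by the generator
    T: FnMut(&[u8]),
{
    let start_time = Instant::now();
    let (mut valid, mut invalid) = (0usize, 0usize);

    while valid < iterations && start_time.elapsed() < test_time {
        match generate() {
            Some(input) => {
                test(&input);
                valid += 1;
            }
            None => {
                invalid += 1;
                // Same 1/3 requirement as the original check: more than twice as
                // many invalid inputs as requested iterations means the generator
                // is most likely unsatisfiable, so fail loudly instead of
                // silently timing out.
                if invalid > iterations * 2 {
                    panic!("generator appears to be unsatisfiable ({invalid} rejected inputs)");
                }
            }
        }
    }
}
```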

(Resolved review thread on lib/bolero/src/lib.rs)
@Ekleog force-pushed the with-test-time branch 2 times, most recently from a442768 to c937af8 on February 25, 2024 at 04:53
@Ekleog (Contributor, author) commented on Feb 25, 2024

Hmm, it seems like the tests are failing due to Kani on macOS…? Unfortunately I know literally nothing about Kani; do you have any idea what this could be due to?

error: internal compiler error: Kani unexpectedly panicked at panicked at cprover_bindings/src/goto_program/stmt.rs:170:9:
thread 'rustc' panicked at cprover_bindings/src/goto_program/stmt.rs:170:9:
                                assertion `left == right` failed: Error: assign statement with unequal types lhs Pointer { typ: StructTag("tag-_98053125857340468998566196306834697786") } rhs CInteger(SSizeT)
assertion `left == right` failed: Error: assign statement with unequal types lhs Pointer { typ: StructTag("tag-_98053125857340468998566196306834697786") } rhs CInteger(SSizeT)
  left: Pointer { typ: StructTag("tag-_98053125857340468998566196306834697786") }
 right: CInteger(SSizeT)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Kani unexpectedly panicked during compilation.
Please file an issue here: https://github.com/model-checking/kani/issues/new?labels=bug&template=bug_report.md

[Kani] current codegen item: codegen_function: std::sync::atomic::atomic_sub::<*mut std::sys::pal::unix::locks::queue_rwlock::Node>
_ZN4core4sync6atomic10atomic_sub17h64afa658122eff2eE
[Kani] current codegen location: Loc { file: "/Users/runner/.rustup/toolchains/nightly-2024-02-14-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/sync/atomic.rs", function: None, start_line: 3352, start_col: Some(1), end_line: 3352, end_col: Some(73) }
                                  left: Pointer { typ: StructTag("tag-_98053125857340468998566196306834697786") }
                                 right: CInteger(SSizeT).

error: could not compile `basic` (lib test)
error: Failed to compile `basic` due to an internal compiler error.: error: internal compiler error: Kani unexpectedly panicked at panicked at cprover_bindings/src/goto_program/stmt.rs:170:9:
                                assertion `left == right` failed: Error: assign statement with unequal types lhs Pointer { typ: StructTag("tag-_98053125857340468998566196306834697786") } rhs CInteger(SSizeT)
                                  left: Pointer { typ: StructTag("tag-_98053125857340468998566196306834697786") }
                                 right: CInteger(SSizeT).


error: getting status result from command "cargo" "kani" "--tests" "--harness" "tests::add_test": process exited with status 1

@camshaft (Owner):

Hmm, none of the changes you've made here should interact with Kani, so I'm assuming it's failing even outside of these changes. I'll open an issue up with them; it should never encounter an internal compiler error.

(Outdated, resolved review thread on lib/bolero-engine/src/rng.rs)
```diff
 let mut valid = 0;
 let mut invalid = 0;
-while valid < self.iterations {
+while valid < self.iterations && start_time.elapsed() < self.test_time {
```
@camshaft (Owner):

Maybe as an optimization we can avoid querying the clock if test_time == Duration::MAX.

I don't know if you've had time to profile this change, but I'd imagine the elapsed call isn't free. This ends up mattering for harnesses that are super cheap to run. I wonder if it might be better to spawn a background thread that just sleeps for this amount of time and sets an AtomicBool when it wakes up? Or maybe it's as simple as only querying every N iterations (where N is like 2-5)?
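
As a rough illustration of the background-thread idea (a sketch under assumptions, not bolero code; the `spawn_deadline` helper is hypothetical):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Spawn a helper thread that sleeps for the test-time budget and then flips
/// an AtomicBool, so the hot loop only pays for a relaxed atomic load instead
/// of querying the clock on every iteration.
fn spawn_deadline(test_time: Duration) -> Arc<AtomicBool> {
    let expired = Arc::new(AtomicBool::new(false));
    let flag = expired.clone();
    thread::spawn(move || {
        thread::sleep(test_time);
        flag.store(true, Ordering::Relaxed);
    });
    expired
}

fn main() {
    let expired = spawn_deadline(Duration::from_secs(1));
    let mut iterations = 0u64;
    while !expired.load(Ordering::Relaxed) {
        // run one test input here
        iterations += 1;
    }
    println!("ran {iterations} iterations within the time budget");
}
```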

@Ekleog (Contributor, author):

TL;DR: I would be very surprised if any real-world fuzzer were slowed down in any noticeable way by this change, and I think optimizing this early would bring more complexity than it's worth.

A few performance notes:

  • Back in 2010, Solaris executed clock_gettime in 0.6μs. This was probably before all the optimizations I discuss below.
  • On Linux, rustc uses clock_gettime. It is implemented here (this is the version that does not take advantage of any architecture-specific optimization). For most clocks, this implementation is based on a shared memory page that Linux updates regularly; then, if high-precision counters are required, it uses rdtsc to refine the result. This means that reading the current time does not require a syscall and can be very fast.
  • Internet browsers, at some point (at least that's what I was told at Mozilla in 2016), spent a significant amount of time in gettimeofday, for reasons the specification made impossible to avoid. This pushed for lots of performance improvements in time handling, and in particular this VDSO-only computation of time that does not require a syscall.
  • With the default 1000 iterations, even at the 2010 Solaris speed, the total overhead would be 0.6ms, which is not human-noticeable.
  • For random fuzzing of functions that take less than 10μs to run (100k execs/s), the overhead would have been noticeable (6%) back in 2010, but do such fuzzers actually exist? I don't think I have any that run at more than 1k execs/s, except for a few trivial ones that reached essentially full feature coverage in a few minutes and so are not worth optimizing for. In any case I'd expect random fuzzing to be a marginal use case, since it's so much worse than coverage-based fuzzing :) (the arithmetic behind these two estimates is spelled out after this list)
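
A quick back-of-the-envelope check of those two estimates (assuming the 0.6μs clock_gettime figure from the list; real VDSO paths today are cheaper):

```rust
use std::time::Duration;

fn main() {
    let clock_call = Duration::from_nanos(600); // ~0.6μs per clock_gettime (2010 Solaris figure)

    // Default of 1000 iterations: total clock overhead per harness run.
    let total = clock_call * 1000;
    println!("overhead for 1000 iterations: {total:?}"); // 600μs, not human-noticeable

    // Relative overhead for a hypothetical 10μs (100k execs/s) test body.
    let test_body = Duration::from_micros(10);
    let percent = clock_call.as_nanos() as f64 / test_body.as_nanos() as f64 * 100.0;
    println!("relative overhead for a 10μs test: {percent:.0}%"); // ~6%
}
```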

@camshaft (Owner):

All good points 🙂

```rust
for test in tests {
    if start_time.elapsed() > test_time {
```
@camshaft (Owner):

Same thing here

```diff
@@ -90,7 +95,7 @@ impl TestEngine {
             .map(move |seed| input::RngTest { seed, max_len })
     }

-    fn tests(&self) -> Vec<NamedTest> {
+    fn tests(&self) -> impl Iterator<Item = NamedTest> {
```
@camshaft (Owner):

Hmm I guess this should be a nice performance improvement, especially when tests fail on the first few iterations.
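
For anyone curious why: returning a lazy iterator means test descriptions are only constructed as the runner consumes them (a toy illustration under assumptions; `NamedTest` here is a stand-in struct, not bolero's actual type):

```rust
// Stand-in for bolero's NamedTest, just to make the example self-contained.
struct NamedTest {
    name: String,
}

// Lazily yields tests instead of collecting them all into a Vec up front.
fn tests_lazy(count: usize) -> impl Iterator<Item = NamedTest> {
    (0..count).map(|i| NamedTest { name: format!("test_{i}") })
}

fn main() {
    // Only the first 3 NamedTest values are ever constructed here, so an early
    // failure never pays for building the description of the whole corpus.
    for test in tests_lazy(1_000_000).take(3) {
        println!("{}", test.name);
    }
}
```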

@camshaft merged commit af7191e into camshaft:master on Feb 26, 2024 (6 of 9 checks passed)
@Ekleog (Contributor, author) commented on Feb 27, 2024

Thank you! :D When do you think you could release? I'm thinking this would be a very useful change to pull in on my end (in a soon-to-be-published crate, so it's hard to just patch), so that I could replace my current batch of with_iterations(20) calls that take hours because they run through the whole corpus anyway :)

@camshaft (Owner):

I won't be able to release it until next Monday (I'm on vacation with no computer 😁). But I can add you to the crate owners if you're wanting it sooner. Just let me know which GitHub handle you'd like me to add, if so.

@Ekleog (Contributor, author) commented on Feb 28, 2024

Of course, no hurry, and have a good vacation! I'll be mostly away from the computer until next Tuesday anyway; I was just asking to find out whether there's anything else you wanted to do before the release, like last time :)

As for the multiple GitHub handles: basically, Ekleog-NEAR is the account I use on my work machine and Ekleog the one on my personal machine, hence the two accounts. (But soon we're going to switch back to individually-administered work machines, so I'll probably just ditch Ekleog-NEAR then.)

@Ekleog deleted the with-test-time branch on February 28, 2024 at 14:44