Run performance regression checks in TravisCI #38
Conversation
LGTM
I didn't merge yet as there are some failures. I haven't used Travis much before, but I suspect that performance might vary depending on the VM that the Travis backend schedules the work on. That might make it difficult to get reliable performance results.
I tried to mitigate this by having perfcheck collect benchmarks for the current revision and then for the reference revision within the same test run, so at least the benchmarks are collected and compared on the same VM at around the same time (roughly the approach sketched below). It looks like that's not quite enough, though, probably because the underlying VM host has highly variable load. Maybe there need to be more benchmark runs, with the minimums for each benchmark being compared, but even that might not be enough. I could also look at increasing the maximum allowed variance (currently 10%).
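For context, the single-pass collection amounts to something like this (a rough sketch only; the file names, flags, and benchcheck invocation are assumptions, not the actual perfcheck contents):

```sh
# Collect benchmarks for both revisions on the same VM in one test run.
# $CUR and $REF stand in for the current and reference revisions.
git checkout -q "$REF"
go test -run=NONE -bench=. ./... > ref.txt
git checkout -q "$CUR"
go test -run=NONE -bench=. ./... > cur.txt

# benchcheck (the hacked benchcmp) compares the two files and fails
# if any benchmark regresses beyond the allowed variance.
benchcheck ref.txt cur.txt
```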
Adding some more iterations to the small test runs (perhaps they need to be called "medium" at that point) might help reduce the effects of jitter. If necessary, I don't mind increasing the variance to, say, 30%. That should help us catch the most serious regressions. For more fine-grained performance testing we can use our own dedicated boxes.
I've been experimenting with Russ Cox's benchstat tool. It analyses the output of multiple benchmark runs for two revisions and reports whether the change is statistically significant, which is somewhat smarter than what benchcmp/benchcheck does. It looks promising on my machine. I'm now gathering data from a run on TravisCI with a modified perfcheck script which runs the benchmarks 5 times for the current and reference revisions (interleaved to minimise the effect of other load on the host) to see how it would work there. If it works out we can get rid of the benchcheck tool (which is a hacked version of benchcmp).
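For reference, the basic benchstat workflow looks roughly like this (a sketch; the install command, paths, and package patterns are assumptions):

```sh
go get golang.org/x/perf/cmd/benchstat   # install the tool

# Collect several runs per revision so benchstat has enough samples.
go test -run=NONE -bench=. -count=5 ./... > ref.txt   # on the reference revision
go test -run=NONE -bench=. -count=5 ./... > cur.txt   # on the current revision

# benchstat prints per-benchmark deltas with p-values; a "~" delta
# means the difference is not statistically significant.
benchstat ref.txt cur.txt
```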
Instead of being based on benchcmp, benchcheck now uses benchstat. benchstat analyses multiple benchmark runs and only reports problems when the difference is statistically significant. This helps to avoid false positives when looking for performance regressions. Side benefit: the new benchcheck involves much less code.
To avoid false positives, reference and current benchmarks are now run 5 times, interleaved to minimise the effects of host system load. The outputs from each run are concatenated, resulting in one file of benchmark output for the reference revision and one for the current revision. These are then passed to the new benchstat-based benchcheck tool. This approach is much more reliable than the previous one.
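The interleaved collection might look something like the sketch below (illustrative only; the revision names and the benchcheck invocation are assumptions, and the real perfcheck script will differ in detail):

```sh
CUR=$(git rev-parse HEAD)
REF=master   # assumption: the reference revision being compared against

# Alternate between revisions so both see roughly the same host load,
# appending each run's output to a per-revision file.
for i in 1 2 3 4 5; do
    git checkout -q "$REF"
    go test -run=NONE -bench=. ./... >> ref.txt
    git checkout -q "$CUR"
    go test -run=NONE -bench=. ./... >> cur.txt
done
git checkout -q "$CUR"

# The concatenated files give benchstat 5 samples per benchmark.
benchcheck ref.txt cur.txt
```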
I've just pushed the new approach here, and I'm anticipating that the perfcheck run should now pass on TravisCI. There ended up being quite a bit of churn, so another review from @oplehto would be helpful.
Looks good! Tested it successfully on my system. Very useful to learn about the existence of the benchcheck utility. To speed up testing, I think we can drop Go 1.8.x at this time. What do you think?
+1 for dropping 1.8
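For reference, dropping Go 1.8.x would be a one-line change in the Travis build matrix, assuming a .travis.yml along these lines (the exact version list is an assumption):

```yaml
language: go
go:
  # - 1.8.x   # dropped to speed up builds
  - 1.9.x
  - tip
```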
Closes #33.