Skip to content
This repository has been archived by the owner on Mar 9, 2022. It is now read-only.

Run performance regression checks in TravisCI #38

Merged
merged 4 commits into from
Feb 27, 2018
Merged

Run performance regression checks in TravisCI #38

merged 4 commits into from
Feb 27, 2018

Conversation

mjs
Copy link
Contributor

@mjs mjs commented Feb 26, 2018

Closes #33.

@mjs mjs requested review from amnonbc and oplehto February 26, 2018 21:20
Copy link
Contributor

@oplehto oplehto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@oplehto
Copy link
Contributor

oplehto commented Feb 26, 2018

I didn't merge yet as there are some failures. I haven't used Travis much before but I suspect that the performance might vary depending on the VM that the Travis backend is scheduling the work on. Thus it might be difficult to get reliable performance results?

@mjs
Copy link
Contributor Author

mjs commented Feb 26, 2018

I tried to mitigate against this by have perfcheck collect benchmarks for the current revision and then collect benchmarks for the reference revision within the same test run so at least the benchmarks are collected and compared on the same VM at around the same time.

It's looking like that's not quite enough though. This is likely because the underlying VM host has highly variable load.

Maybe there needs to be more runs of benchmarks with the minimums for each benchmark being compared, but that might still not be enough.

I could also look at increasing the maximum allowed variance (currently 10%).

@oplehto
Copy link
Contributor

oplehto commented Feb 27, 2018

Adding some more iterations to the small test runs (perhaps they need to be called "medium" at that point) might help reduce effects caused by jitter.

If necessary I don't mind increasing the variance to, say 30%. That should help us catch the most serious regressions. For the more fine-grained performance testing we can use our own dedicated boxes.

@mjs
Copy link
Contributor Author

mjs commented Feb 27, 2018

I've been experiementing with Russ Cox's benchstat tool. It analyses the output of multiple benchmark runs for 2 revisions to compare and reports if the change is statistically significant. This is somewhat smarter than what benchcmp/benchcheck does.

It looks promising on my machine. I'm now gathering data from a run on TravisCI with a modified perfcheck script which runs the benchmarks 5 times the current and reference revision (interleaved to minimise the effect of other load on the host) to see how it would work there.

If it works out we can get rid of the benchcheck tool (which a hacked version of benchcmp.

mjs added 2 commits February 27, 2018 17:58
Instead of being based on benchcmp benchcmp now uses
benchstat. benchstat analyses multiple benchmark runs and only reports
problems when statistically signficant. This helps to avoid false
positives when looking for performance regressions.

Side benefit: The new benchcheck involves much less code.
To avoid false positives, reference and current benchmarks are now run
5 times - interleaved to minimise the effects of host system load. The
outputs from each run are concatenated together resulting in a file of
benchmark output for the reference revision & the current
revision. These are then passed to the new benchstat-based benchcheck
tool.

This is approach is much more reliable than the previous approach.
@mjs
Copy link
Contributor Author

mjs commented Feb 27, 2018

I've just pushed the new approach here. I'm anticipating that the perfcheck run should now pass on TravisCI.

There ended up being quite a bit of churn so another review from @oplehto would be helpful.

@oplehto
Copy link
Contributor

oplehto commented Feb 27, 2018

Looks good! Tested it successfully on my system. Very useful to learn about the existence of the benchcheck utility.

To speed up testing I think we can drop Go 1.8.x at this time. What do you think?

@oplehto oplehto merged commit 5249beb into jumptrading:master Feb 27, 2018
@amnonbc
Copy link
Contributor

amnonbc commented Feb 27, 2018

+1 for dropping 1.8

@mjs mjs deleted the enable-perfcheck branch February 27, 2018 08:52
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants