We automatically track changes in mypy performance over time (#14187). Currently we can detect changes of at least 1.5% pretty reliably, but smaller changes are hard to detect. #14187 has some relevant discussion, such as this comment: #14187 (comment)
I'd estimate that a cumulative performance regression of around 15% in 2022 was due to changes that were below the 1.5% noise floor. Getting the detection threshold down to 0.5% or below could be quite helpful in finding and fixing regressions.
I looked at individual measurements, and it seems possible that measurements slowly fluctuate over time. I'm not entirely sure what might be causing this. Just increasing the number of iterations we measure probably won't help much, since different batches of runs will cluster around different averages.
Here are some things that could help:
1. Interleave executions of current/previous builds and measure the delta. Instead of only collecting absolute performance values, interleave runs of the previous commit and the target commit and compute the average delta. If performance gradually fluctuates over time, this should cancel out most of the drift.
2. Collect samples over a long period of time (say, one sample every hour over 12 hours).
3. Collect detailed profiling data for each commit and highlight differences in the time spent in different parts of the mypy implementation. If a single function gets 2x slower, that could be easy to detect this way, even if the change in overall performance is well below the noise floor. This could be quite noisy due to functions being renamed or split, etc. (see the sketch after this list).
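As a very rough sketch of what the profile comparison in idea 3 could look like, the snippet below diffs per-function cumulative times from two cProfile dumps and prints the largest regressions. The .prof filenames and the profiled workload are made-up placeholders (not part of the existing benchmarking setup), and it reads pstats' internal stats table:

```python
# Compare per-function cumulative times from two profiling runs and report the
# functions whose time grew the most. The .prof files are assumed to have been
# produced separately, e.g. with something like
#   python -m cProfile -o old.prof -m mypy --strict some_project/
# (paths and workload are placeholders).
import pstats

def load_times(path: str) -> dict[str, float]:
    """Map 'file:line(function)' -> cumulative seconds from a cProfile dump."""
    stats = pstats.Stats(path)
    times = {}
    # stats.stats is pstats' internal table: key is (filename, lineno, funcname),
    # value is (call count, primitive calls, total time, cumulative time, callers).
    for (filename, lineno, func), (cc, nc, tt, ct, callers) in stats.stats.items():
        times[f"{filename}:{lineno}({func})"] = ct
    return times

old = load_times("old.prof")
new = load_times("new.prof")

# Functions that were renamed or split between the two commits will show up as
# spurious additions/removals -- this is the noise mentioned above.
regressions = sorted(
    ((new[k] - old.get(k, 0.0), k) for k in new),
    reverse=True,
)
for growth, name in regressions[:20]:
    print(f"{growth:+8.3f}s  {name}")
```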
I'm going to start by investigating whether idea 1 is feasible.
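For reference, here is a minimal sketch of how the interleaved delta measurement of idea 1 might work. The venv paths, the checked target, and the number of pairs are hypothetical placeholders rather than the actual benchmark configuration:

```python
# Alternate runs of the previous-commit build and the target-commit build and
# measure the relative delta per pair, so slow drift in machine performance
# affects both builds roughly equally.
import statistics
import subprocess
import time

OLD_MYPY = "venv-old/bin/mypy"            # build of the previous commit (placeholder path)
NEW_MYPY = "venv-new/bin/mypy"            # build of the target commit (placeholder path)
TARGET = ["--strict", "some_project/"]    # fixed workload to type check (placeholder)
PAIRS = 20

def run_once(mypy_bin: str) -> float:
    """Run one mypy invocation and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run([mypy_bin, *TARGET], capture_output=True, check=False)
    return time.perf_counter() - start

deltas = []
for _ in range(PAIRS):
    old = run_once(OLD_MYPY)
    new = run_once(NEW_MYPY)
    deltas.append((new - old) / old)

mean = statistics.mean(deltas)
stdev = statistics.stdev(deltas)
print(f"relative delta: {mean:+.3%} (stdev {stdev:.3%}, n={PAIRS})")
```

If the drift hypothesis is right, the per-pair deltas should cluster much more tightly than the absolute timings do.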