Run performance tests on all PR against base branch #1630

peaBerberian · 2025-01-14T12:42:24Z

This PR makes several updates to our performance tests (which check for performance regressions):

I simplified the writing of performance tests by adding a lib.js file handling test management, and I added some multithread tests.
Performance tests now run in CI for all PRs.
Performance tests in CI now run against the base branch (instead of against the last RxPlayer release).
A comment is left on the GitHub PR if performance issues are detected.

Florent-Bouisset · 2025-01-21T16:25:36Z

Automated performance checks have found a sensible difference on commit 4896468988a863f1dc290bc065d5f9f3377e2c64 with the base branch dev:

Performance tests 1st run output:

No significative change in performance for tests:

loading mean: 19.28ms -> 19.41ms (-0.129ms, -0.666%, z: 1.47799) / median: 26.70ms -> 26.70ms

seeking mean: 12.04ms -> 14.77ms (-2.738ms, -18.532%, z: 1.16664) / median: 11.25ms -> 11.40ms

audio-track-reload mean: 25.96ms -> 26.08ms (-0.125ms, -0.478%, z: 1.08775) / median: 38.10ms -> 38.10ms

cold loading multithread mean: 48.07ms -> 47.49ms (0.589ms, 1.241%, z: 9.61539) / median: 70.20ms -> 69.15ms

seeking multithread mean: 12.62ms -> 23.30ms (-10.675ms, -45.815%, z: 0.25505) / median: 10.50ms -> 10.35ms

audio-track-reload multithread mean: 25.98ms -> 25.79ms (0.190ms, 0.736%, z: 2.12765) / median: 38.25ms -> 38.10ms

hot loading multithread mean: 15.30ms -> 15.24ms (0.059ms, 0.387%, z: 3.20104) / median: 22.20ms -> 22.05ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

Maybe we can have it displayed as a table so it's more readable:

| Action | Mean | Median |
| --- | --- | --- |
| Loading | 19.28ms → 19.41ms (-0.129ms, -0.666%, z: 1.455) | 26.70ms → 26.70ms |
| Seeking | 19.28ms → 19.41ms (-0.129ms, -0.666%, z: 1.455) | 26.70ms → 26.70ms |

peaBerberian · 2025-01-21T16:28:59Z

OK, though making it also readable as a text file is more complex with a table than with a list (the intent is that the output markdown report is readable both by humans in a text editor and by scripts that may output richer text formats, such as a GitHub comment).

But I'll try.
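
As an illustration of the constraint above, a markdown table can stay readable as raw text if every cell is padded to its column width. The sketch below is only an assumption about how such a report could be produced: `formatReportTable` and the shape of its input are hypothetical and are not taken from this PR.

```javascript
// Hypothetical formatter: the function name and the input shape
// ({ name, mean: [a, b], median: [a, b] }) are assumptions for illustration.
function formatReportTable(results) {
  const header = ["Name", "Mean", "Median"];
  const rows = results.map((r) => [
    r.name,
    `${r.mean[0].toFixed(2)}ms -> ${r.mean[1].toFixed(2)}ms`,
    `${r.median[0].toFixed(2)}ms -> ${r.median[1].toFixed(2)}ms`,
  ]);
  // Pad every cell to its column width so the raw text stays aligned.
  const widths = header.map((h, i) =>
    Math.max(h.length, ...rows.map((row) => row[i].length)),
  );
  const toLine = (cells) =>
    "| " + cells.map((c, i) => c.padEnd(widths[i])).join(" | ") + " |";
  const separator =
    "| " + widths.map((w) => "-".repeat(w)).join(" | ") + " |";
  return [toLine(header), separator, ...rows.map(toLine)].join("\n");
}

console.log(
  formatReportTable([
    { name: "loading", mean: [19.28, 19.41], median: [26.7, 26.7] },
    { name: "seeking", mean: [12.04, 14.77], median: [11.25, 11.4] },
  ]),
);
```

The same string renders as a table in a GitHub comment and remains column-aligned in a text editor, at the cost of slightly more formatting logic than a plain list.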

github-actions · 2025-01-21T18:55:47Z

Automated performance checks have been performed on commit 5798686388a3dec9eb469f64e6b386b0d4358828 with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

| Name | Mean | Median |
| --- | --- | --- |
| loading | 20.26ms -> 20.60ms (-0.337ms, z: 0.96412) | 27.00ms -> 27.15ms |
| seeking | 17.56ms -> 14.42ms (3.141ms, z: 2.34821) | 11.10ms -> 11.25ms |
| audio-track-reload | 26.22ms -> 26.43ms (-0.215ms, z: 0.20465) | 37.65ms -> 37.65ms |
| cold loading multithread | 48.59ms -> 47.85ms (0.736ms, z: 8.78710) | 69.75ms -> 68.55ms |
| seeking multithread | 20.70ms -> 17.48ms (3.221ms, z: 0.57881) | 10.20ms -> 10.20ms |
| audio-track-reload multithread | 26.30ms -> 26.02ms (0.280ms, z: 0.52428) | 37.65ms -> 37.65ms |
| hot loading multithread | 15.80ms -> 15.42ms (0.377ms, z: 3.90596) | 22.05ms -> 21.75ms |

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

github-actions · 2025-01-22T13:20:02Z

Automated performance checks have been performed on commit 628dc3be94709c1c96840e8d4b8a3d486ac58fd5 with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

| Name | Mean | Median |
| --- | --- | --- |
| loading | 19.22ms -> 19.30ms (-0.079ms, z: 1.62917) | 26.55ms -> 26.55ms |
| seeking | 13.31ms -> 10.65ms (2.653ms, z: 0.09964) | 11.25ms -> 11.25ms |
| audio-track-reload | 25.84ms -> 25.88ms (-0.042ms, z: 0.13650) | 37.65ms -> 37.65ms |
| cold loading multithread | 47.96ms -> 47.10ms (0.856ms, z: 12.99353) | 70.20ms -> 68.85ms |
| seeking multithread | 16.55ms -> 18.57ms (-2.022ms, z: 0.72020) | 10.50ms -> 10.35ms |
| audio-track-reload multithread | 25.68ms -> 25.73ms (-0.050ms, z: 0.30880) | 37.65ms -> 37.80ms |
| hot loading multithread | 15.20ms -> 15.19ms (0.016ms, z: 3.84216) | 22.10ms -> 21.90ms |

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

github-actions · 2025-01-22T13:58:10Z

Automated performance checks have been performed on commit 7458f0240d24c9dfacff9836a9e0dc7adeec0f4a with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

| Name | Mean | Median |
| --- | --- | --- |
| loading | 19.71ms -> 19.41ms (0.302ms, z: 4.48128) | 27.45ms -> 26.85ms |
| seeking | 12.70ms -> 17.37ms (-4.663ms, z: 0.69236) | 11.25ms -> 11.25ms |
| audio-track-reload | 26.02ms -> 26.18ms (-0.161ms, z: 1.42568) | 37.95ms -> 38.10ms |
| cold loading multithread | 48.19ms -> 47.66ms (0.529ms, z: 9.50722) | 70.50ms -> 69.60ms |
| seeking multithread | 18.64ms -> 17.93ms (0.703ms, z: 1.10124) | 10.35ms -> 10.35ms |
| audio-track-reload multithread | 26.04ms -> 25.80ms (0.242ms, z: 3.10859) | 38.25ms -> 37.95ms |
| hot loading multithread | 15.47ms -> 15.40ms (0.065ms, z: 3.70142) | 22.50ms -> 22.20ms |

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

tests/performance/src/lib.js

Florent-Bouisset · 2025-01-22T16:55:55Z

tests/performance/src/lib.js

+  } else {
+    testNumber = Number(location.hash.substring(1));
+  }
+  if (testNumber < 100) {


I'm not sure what 100 is. Is it the maximum number of tests that can be run?

Just to give some context:

The Node.js script in tests/performance/run.mjs is the test runner executed server-side (similar to vitest's globalSetup): here it runs one of the test pages multiple times on the local Chrome binary.

This script is the test library that runs in the browser alongside the written tests (like vitest's JS lib when we do import { describe } from "vitest"). It handles the client-side test management logic (including sending HTTP requests to the Node.js script to report results) so that test files can focus on the test scenarios.

This testNumber counts the page reloads in the current browser, and 100 is the number of reloads we perform. What we actually do here is switch between the current.html and previous.html pages each time all the tests for the current page have finished.

Because we're reloading the page, we cannot rely on variables declared in that script to track that count. Here we use the "fragment" part of the URL to store the current iteration (e.g. /current.html#34).
Once we've reached #100, we tell the Node.js script that we're done. It then kills the browser and either starts another one or ends the tests.

I don't know if that's clear. It's a little complex, as we retry the same tests a LOT of times, in different processes and at different times, just to be sure that we have an actual performance issue and not just a false positive due to temporarily high CPU usage, browser optimizations when similar code is repeated, or random noise.
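
The fragment-based counting described above could be sketched as follows. This is a minimal illustration and not the actual lib.js code: `readTestNumber`, `nextStep` and `MAX_RELOADS` are hypothetical names, and the request telling the Node.js script that the run is over is left out.

```javascript
// Hypothetical helpers: names and structure are assumptions, not lib.js code.
const MAX_RELOADS = 100; // mirrors the `testNumber < 100` check in the diff

// Read the iteration stored in the URL fragment ("#34" -> 34).
// An empty or invalid fragment is treated as the first load.
function readTestNumber(hash) {
  if (hash === "" || hash === "#") {
    return 0;
  }
  const parsed = Number(hash.substring(1));
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : 0;
}

// Decide what happens once all tests of the current page have finished:
// either load the other page with an incremented fragment, or stop.
function nextStep(pathname, hash) {
  const testNumber = readTestNumber(hash);
  if (testNumber >= MAX_RELOADS) {
    return { done: true };
  }
  const nextPage = pathname.endsWith("/current.html")
    ? "/previous.html"
    : "/current.html";
  return { done: false, url: `${nextPage}#${testNumber + 1}` };
}

// In a browser, this would be driven by `location`:
//   const step = nextStep(location.pathname, location.hash);
//   if (!step.done) { location.href = step.url; }
console.log(nextStep("/current.html", "#34"));
```

Keeping the counter in the URL means it survives each reload without any storage API, and the alternation between the two pages only depends on the current path.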

How does it interact with TEST_ITERATIONS = 30 from performance/run.mjs?
Does it mean the browser is started 30 times and each run does 100 reloads? So it's actually 30x100 reloads. Isn't that too many?

Yes, that must be the right count (technically 1500 with the previous and 1500 with the new one).

Too many for who?

That's to give more confidence in the final result.
When I looked at other benchmarking projects, 1500 attempts was in the very low tier.

The real issue I can see is that tests may run for 1h, but that's not dramatic to me.
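
To make the count explicit, here is the arithmetic with the two values quoted in this thread. Only TEST_ITERATIONS is a name taken from the discussion; the other constant names are assumptions made for illustration.

```javascript
const TEST_ITERATIONS = 30; // browser launches, as quoted from run.mjs
const RELOADS_PER_BROWSER = 100; // assumed name for the `testNumber < 100` bound
const PAGES = ["current.html", "previous.html"]; // pages alternated on reload

// Every browser launch goes through all reloads.
const totalPageLoads = TEST_ITERATIONS * RELOADS_PER_BROWSER;
// Reloads alternate between the two pages, so each gets half of them.
const loadsPerPage = totalPageLoads / PAGES.length;

console.log({ totalPageLoads, loadsPerPage });
```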

OK, let's start with this value and adjust it later if we are not satisfied.

Maybe testNumber could be a parameter of declareTestGroup so it's explicit that the test will be repeated.

declareTestGroup("content loading monothread", () => { .. }, {timeout: 2000, repeat: 100})

In which case we wouldn't repeat some test groups once enough iterations have been done?

This would also mean that the context in which other test groups are running may be completely different (e.g. the browser may have optimized some RxPlayer code paths when another test group had run), which we would have to take into account.
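
The scenario discussed here can be sketched with a per-group `repeat` option. Everything below is hypothetical: this `declareTestGroup` is not the one from lib.js, `groupsForIteration` and the second group name are invented for illustration, and the `timeout` option is only stored, not enforced.

```javascript
// Hypothetical registry: NOT the lib.js implementation.
const testGroups = [];

// Register a group with the options from the suggestion above.
function declareTestGroup(name, fn, { timeout = 2000, repeat = 100 } = {}) {
  testGroups.push({ name, fn, timeout, repeat });
}

// Groups that still have to run for a given page iteration
// (the iteration being the number stored in the URL fragment).
function groupsForIteration(testNumber) {
  return testGroups.filter((group) => testNumber < group.repeat);
}

declareTestGroup("content loading monothread", () => {}, {
  timeout: 2000,
  repeat: 100,
});
// Assumed second group, with a lower repeat count.
declareTestGroup("seeking monothread", () => {}, { repeat: 50 });

// Early iterations run both groups, later ones only the first:
console.log(groupsForIteration(10).map((g) => g.name));
console.log(groupsForIteration(60).map((g) => g.name));
```

With different `repeat` values, late iterations run only a subset of the groups, which is exactly the change of execution context raised in the reply above.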

Test groups should not rely on another test group anyway: a test should succeed whether it is launched alone or after other test groups have run.

Yes for the test-passing part, but considering that the same page will run all test groups, under that scenario the browser will sometimes execute TEST GROUP 1 then TEST GROUP 2, and other times just TEST GROUP 2.

The browser might have performed optimizations when JIT-compiling that JS and filled caches (even the RxPlayer has e.g. a codec-checking cache), meaning that even if TEST GROUP 1 does not explicitly interfere with TEST GROUP 2, the former will have an effect on the performance of the latter.

In that respect, running TEST GROUP 1 then TEST GROUP 2 and running just TEST GROUP 2 might not give the same performance results for TEST GROUP 2. Though I don't know what the right approach would be (only one test group per page could be a possibility).

Though most of those optimizations could also be kept across page reloads, as the loaded JS didn't change (which the browser can easily check), so this is a complex subject.

Florent-Bouisset · 2025-01-23T09:16:49Z

tests/performance/src/lib.js

+  } else {
+    testNumber = Number(location.hash.substring(1));
+  }
+  if (testNumber < 100) {


How does it interact with TEST_ITERATIONS = 30 from performance/run.mjs?
Does it mean the browser is started 30 times and each run does 100 reloads? So it's actually 30x100 reloads. Isn't that too many?

github-actions · 2025-01-23T13:50:21Z

Automated performance checks have been performed on commit b75076e4b7f12b3ad60400d151b906f4235dc500 with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

| Name | Mean | Median |
| --- | --- | --- |
| loading | 19.98ms -> 19.90ms (0.084ms, z: 3.58443) | 27.60ms -> 27.45ms |
| seeking | 18.81ms -> 12.25ms (6.551ms, z: 1.22956) | 11.40ms -> 11.55ms |
| audio-track-reload | 26.39ms -> 26.44ms (-0.046ms, z: 0.36668) | 38.70ms -> 38.70ms |
| cold loading multithread | 48.52ms -> 48.03ms (0.490ms, z: 9.43020) | 71.25ms -> 70.05ms |
| seeking multithread | 15.36ms -> 15.38ms (-0.029ms, z: 0.31255) | 10.50ms -> 10.50ms |
| audio-track-reload multithread | 26.22ms -> 26.12ms (0.101ms, z: 2.32838) | 38.55ms -> 38.40ms |
| hot loading multithread | 15.65ms -> 15.57ms (0.078ms, z: 3.17081) | 22.65ms -> 22.35ms |

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

peaBerberian force-pushed the misc/performance-tests-all-pr branch 6 times, most recently from 38fc9eb to 4fef17b Compare January 14, 2025 14:01

peaBerberian force-pushed the misc/performance-tests-libs branch 2 times, most recently from 1f99e84 to b5f4e63 Compare January 14, 2025 14:19

peaBerberian changed the base branch from misc/performance-tests-libs to dev January 14, 2025 14:20

peaBerberian force-pushed the misc/performance-tests-all-pr branch 9 times, most recently from 963e4cc to fa4dc9e Compare January 17, 2025 09:48

peaBerberian mentioned this pull request Jan 17, 2025

Improve performance tests writing #1629

Closed

peaBerberian force-pushed the misc/performance-tests-all-pr branch 6 times, most recently from 4e13626 to fdb9406 Compare January 17, 2025 14:13

canalplus deleted a comment from github-actions bot Jan 17, 2025

peaBerberian force-pushed the misc/performance-tests-all-pr branch 4 times, most recently from 171f783 to d330f9d Compare January 17, 2025 14:53

peaBerberian added 6 commits January 21, 2025 17:14

perf: Try in gh comment

01e5d6a

Perf: Fix typo

e5b683c

perf: Update output. Also report not significative changes

5940dc7

perf: Skip if another commit or label

5e7e422

perf: Update slightly perf comment output

accdad5

perf: Minimal wording update

773e28d

peaBerberian force-pushed the misc/performance-tests-all-pr branch from d3c562b to 773e28d Compare January 21, 2025 16:15

peaBerberian added 2 commits January 21, 2025 18:08

perf: report is under a markdown table format

b447496

perf: Remove unnecessary newline

e591762

canalplus deleted a comment from github-actions bot Jan 21, 2025

perf: Add some newlines in test reports

23edde8

peaBerberian force-pushed the misc/performance-tests-all-pr branch from 66d9814 to 23edde8 Compare January 21, 2025 17:42

perf: remove not useful perf percentage

4473a43

canalplus deleted a comment from github-actions bot Jan 21, 2025

perf: fix case of median

5798686

canalplus deleted a comment from github-actions bot Jan 21, 2025

perf/CI: Fix label condition

7458f02

peaBerberian force-pushed the misc/performance-tests-all-pr branch from 628dc3b to 7458f02 Compare January 22, 2025 13:31

Florent-Bouisset reviewed Jan 22, 2025

Florent-Bouisset reviewed Jan 23, 2025

perf: Use the fragment of the URL both for port and retry attempt

b75076e

Florent-Bouisset approved these changes Jan 28, 2025

peaBerberian merged commit 9cf989f into dev Jan 28, 2025
10 checks passed
