Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run performance tests on all PR against base branch #1630

Merged
merged 17 commits into from
Jan 28, 2025

Conversation

peaBerberian
Copy link
Collaborator

@peaBerberian peaBerberian commented Jan 14, 2025

This PR does multiple updates with our performance tests (which test for performance regressions):

  1. I simplified performance tests writing by adding a lib.js file handling test management - and I added some multithread tests.

  2. I now choose to run performance tests in CI for all PR

  3. Performance tests in CI are now run against the base branch (instead of against the last RxPlayer release )

  4. A comment is left on the GitHub PR if performance issues are detected

@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch 6 times, most recently from 38fc9eb to 4fef17b Compare January 14, 2025 14:01
@peaBerberian peaBerberian force-pushed the misc/performance-tests-libs branch 2 times, most recently from 1f99e84 to b5f4e63 Compare January 14, 2025 14:19
@peaBerberian peaBerberian changed the base branch from misc/performance-tests-libs to dev January 14, 2025 14:20
@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch 9 times, most recently from 963e4cc to fa4dc9e Compare January 17, 2025 09:48
@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch 6 times, most recently from 4e13626 to fdb9406 Compare January 17, 2025 14:13
@canalplus canalplus deleted a comment from github-actions bot Jan 17, 2025
@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch 4 times, most recently from 171f783 to d330f9d Compare January 17, 2025 14:53
@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch from d3c562b to 773e28d Compare January 21, 2025 16:15
@Florent-Bouisset
Copy link
Collaborator

Automated performance checks have found a sensible difference on commit 4896468988a863f1dc290bc065d5f9f3377e2c64 with the base branch dev:

Performance tests 1st run output:

No significative change in performance for tests:

  • loading mean: 19.28ms -> 19.41ms (-0.129ms, -0.666%, z: 1.47799) / median: 26.70ms -> 26.70ms
  • seeking mean: 12.04ms -> 14.77ms (-2.738ms, -18.532%, z: 1.16664) / median: 11.25ms -> 11.40ms
  • audio-track-reload mean: 25.96ms -> 26.08ms (-0.125ms, -0.478%, z: 1.08775) / median: 38.10ms -> 38.10ms
  • cold loading multithread mean: 48.07ms -> 47.49ms (0.589ms, 1.241%, z: 9.61539) / median: 70.20ms -> 69.15ms
  • seeking multithread mean: 12.62ms -> 23.30ms (-10.675ms, -45.815%, z: 0.25505) / median: 10.50ms -> 10.35ms
  • audio-track-reload multithread mean: 25.98ms -> 25.79ms (0.190ms, 0.736%, z: 2.12765) / median: 38.25ms -> 38.10ms
  • hot loading multithread mean: 15.30ms -> 15.24ms (0.059ms, 0.387%, z: 3.20104) / median: 22.20ms -> 22.05ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

Maybe we can have it displayed as a table so it's more readable:

Action Mean Median
Loading 19.28ms → 19.41ms (-0.129ms, -0.666%n, z: 1.455) 26.70ms → 26.70ms
Seeking 19.28ms → 19.41ms (-0.129ms, -0.666%, z: 1.455) 26.70ms → 26.70ms

@peaBerberian
Copy link
Collaborator Author

peaBerberian commented Jan 21, 2025

OK though to make it also readable as a text file (the intent is that the outputed report markdown file is both intended for humans with a text editor and to scripts outputing optionally richer text formats - like as a Github comment) is more complex when doing a table vs a list.

But i'll try

@canalplus canalplus deleted a comment from github-actions bot Jan 21, 2025
@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch from 66d9814 to 23edde8 Compare January 21, 2025 17:42
@canalplus canalplus deleted a comment from github-actions bot Jan 21, 2025
Copy link

Automated performance checks have been performed on commit 5798686388a3dec9eb469f64e6b386b0d4358828 with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

Name Mean Median
loading 20.26ms -> 20.60ms (-0.337ms, z: 0.96412) 27.00ms -> 27.15ms
seeking 17.56ms -> 14.42ms (3.141ms, z: 2.34821) 11.10ms -> 11.25ms
audio-track-reload 26.22ms -> 26.43ms (-0.215ms, z: 0.20465) 37.65ms -> 37.65ms
cold loading multithread 48.59ms -> 47.85ms (0.736ms, z: 8.78710) 69.75ms -> 68.55ms
seeking multithread 20.70ms -> 17.48ms (3.221ms, z: 0.57881) 10.20ms -> 10.20ms
audio-track-reload multithread 26.30ms -> 26.02ms (0.280ms, z: 0.52428) 37.65ms -> 37.65ms
hot loading multithread 15.80ms -> 15.42ms (0.377ms, z: 3.90596) 22.05ms -> 21.75ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

@canalplus canalplus deleted a comment from github-actions bot Jan 21, 2025
Copy link

Automated performance checks have been performed on commit 628dc3be94709c1c96840e8d4b8a3d486ac58fd5 with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

Name Mean Median
loading 19.22ms -> 19.30ms (-0.079ms, z: 1.62917) 26.55ms -> 26.55ms
seeking 13.31ms -> 10.65ms (2.653ms, z: 0.09964) 11.25ms -> 11.25ms
audio-track-reload 25.84ms -> 25.88ms (-0.042ms, z: 0.13650) 37.65ms -> 37.65ms
cold loading multithread 47.96ms -> 47.10ms (0.856ms, z: 12.99353) 70.20ms -> 68.85ms
seeking multithread 16.55ms -> 18.57ms (-2.022ms, z: 0.72020) 10.50ms -> 10.35ms
audio-track-reload multithread 25.68ms -> 25.73ms (-0.050ms, z: 0.30880) 37.65ms -> 37.80ms
hot loading multithread 15.20ms -> 15.19ms (0.016ms, z: 3.84216) 22.10ms -> 21.90ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch from 628dc3b to 7458f02 Compare January 22, 2025 13:31
Copy link

Automated performance checks have been performed on commit 7458f0240d24c9dfacff9836a9e0dc7adeec0f4a with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

Name Mean Median
loading 19.71ms -> 19.41ms (0.302ms, z: 4.48128) 27.45ms -> 26.85ms
seeking 12.70ms -> 17.37ms (-4.663ms, z: 0.69236) 11.25ms -> 11.25ms
audio-track-reload 26.02ms -> 26.18ms (-0.161ms, z: 1.42568) 37.95ms -> 38.10ms
cold loading multithread 48.19ms -> 47.66ms (0.529ms, z: 9.50722) 70.50ms -> 69.60ms
seeking multithread 18.64ms -> 17.93ms (0.703ms, z: 1.10124) 10.35ms -> 10.35ms
audio-track-reload multithread 26.04ms -> 25.80ms (0.242ms, z: 3.10859) 38.25ms -> 37.95ms
hot loading multithread 15.47ms -> 15.40ms (0.065ms, z: 3.70142) 22.50ms -> 22.20ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

tests/performance/src/lib.js Show resolved Hide resolved
} else {
testNumber = Number(location.hash.substring(1));
}
if (testNumber < 100) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what 100 is, is it the maximum number of tests that can be run?

Copy link
Collaborator Author

@peaBerberian peaBerberian Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to give some context:

  • The nodejs script in tests/perfomance.run.js is the test runner executed server-side (similar to vitest's globalSetup): here it runs one of the test page multiple time on the local Chrome binary

  • This script is the test library that will run in the browser alongside written tests (like vitest's JS lib when we do import { describe } from "vitest" ), it does the client-side test management logic (including sending HTTP requests to the nodejs script to report results) so test files can just be about the test scenarios

This testNumber is the number of time we will reload the page in the current browser. What we actually do here is to switch between the current.html and previous.html pages each time the tests for the current page are all finished.

Because we're reloading the page, we cannot rely on variables declared in that script to track that count. Here we use the "fragment" part of an URL to store the current iteration (.e.g. /current.html#34).
Once we've reached #100, we tell the nodejs script that we're done. It then kills the browser and either re-run another one or finish the tests.

I don't know if that's clear, it's a little complex as we retry the same tests a LOT of time in different processes at different time just to be sure we've some performance issue and it's not just a false positive due to temporary high CPU usage, some browser optimizations when similar code is repeated, or random noise.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does it interact with TEST_ITERATIONS = 30 from performance/run.mjs ?
Does it means the browser test is started 30 times and each test does 100 reloads ? So it's actually 30x100 reloads. Isn't it too many?

Copy link
Collaborator Author

@peaBerberian peaBerberian Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that must be the right count (technically 1500 with the previous and 1500 with the new one).

Too many for who?

That's to give more confidence into the final result.
When I looked at other benchmarking projects, 1500 attempts is in the very low tier.

The real issue I can see is that tests may run for 1h but that's not dramatic to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK let's start with this value and adjust let's adjust it later if we are not satisfied.

Maybe the testNumber can be a parameter of declareTestGroup so it's explicit that the test will be repeated.

declareTestGroup("content loading monothread", () => { .. }, {timeout: 2000, repeat: 100})

Copy link
Collaborator Author

@peaBerberian peaBerberian Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case we wouldn't repeat some test groups once enough iterations have been done?

This would mean that the context in which other test groups are running may be completely different (e.g. the browser may have optimized some RxPlayer code paths when another test group had run) also, to take in account.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test groups should not rely on another test group anyway, the test should succeed if it was launched alone or if there was other test group that has run before

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes for the test passing part, but considering that the same page will run all test groups, the browser will e.g. execute TEST GROUP 1 then TEST GROUP 2, and other times just TEST GROUP 2, under that scenario.

The browser might have performed optimizations when JITing that JS and fed cache (even the RxPlayer has e.g. a codec checking cache) meaning that even if TEST GROUP 1 does not explicitly interfere with TEST GROUP 2, the former will have an effect in performance to the second.

In that aspect, running TEST GROUP 1 then TEST GROUP 2 and just running TEST GROUP 2 might not give the same performance results for TEST GROUP 2. Though I don't know what the right approach would be (only one test group per page could be a possibility).

Though most of those optimizations could also be kept as the page is reloaded as the loaded JS didn't change (which the browser can easily check for), so this is a complex subject.

} else {
testNumber = Number(location.hash.substring(1));
}
if (testNumber < 100) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does it interact with TEST_ITERATIONS = 30 from performance/run.mjs ?
Does it means the browser test is started 30 times and each test does 100 reloads ? So it's actually 30x100 reloads. Isn't it too many?

tests/performance/run.mjs Show resolved Hide resolved
Copy link

Automated performance checks have been performed on commit b75076e4b7f12b3ad60400d151b906f4235dc500 with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

Name Mean Median
loading 19.98ms -> 19.90ms (0.084ms, z: 3.58443) 27.60ms -> 27.45ms
seeking 18.81ms -> 12.25ms (6.551ms, z: 1.22956) 11.40ms -> 11.55ms
audio-track-reload 26.39ms -> 26.44ms (-0.046ms, z: 0.36668) 38.70ms -> 38.70ms
cold loading multithread 48.52ms -> 48.03ms (0.490ms, z: 9.43020) 71.25ms -> 70.05ms
seeking multithread 15.36ms -> 15.38ms (-0.029ms, z: 0.31255) 10.50ms -> 10.50ms
audio-track-reload multithread 26.22ms -> 26.12ms (0.101ms, z: 2.32838) 38.55ms -> 38.40ms
hot loading multithread 15.65ms -> 15.57ms (0.078ms, z: 3.17081) 22.65ms -> 22.35ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

@peaBerberian peaBerberian merged commit 9cf989f into dev Jan 28, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants