-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
investigate flaky test-performance-eventlooputil #35309
Comments
|
The test seems inherently flaky, likely on all or nearly all platforms. I don't even need to run it concurrently locally to get it to fail. Here's a stress test to see if we can get a handle on how often it fails. https://ci.nodejs.org/job/node-stress-single-test/185/ (EDIT: 1 failure in 856 runs so far on debian9-64. Might be interesting to see if it fails more or less often on slow machines vs. fast machines and if common.platformTimeout() helps. Also might be interesting to experiment with parallel test jobs to see if it is affected by being in parallel or not.) |
One more which has some more info in assert:
|
Fix two races in test-performance-eventlooputil resulting in a flaky test. elu1 was capture after start time t from spin look. If OS descides to reschedule the process after capturing t but before getting elu for >=50ms the spin loop is actually a nop. elu1 doesn't show this and as a result elut3 = eventLoopUtilization(elu1) results in elu3.active === 0. Moving capturing of t after capturing t, just before the spin look avoids this. Similar if OS decides to shedule a different process between getting the total elu from start and the diff elu showing the spin loop the check to verify that total active time is long then the spin loop fails. Exchanging these statements avoids this race. PR-URL: #36028 Fixes: #35309 Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Rich Trott <rtrott@gmail.com>
Fix two races in test-performance-eventlooputil resulting in a flaky test. elu1 was capture after start time t from spin look. If OS descides to reschedule the process after capturing t but before getting elu for >=50ms the spin loop is actually a nop. elu1 doesn't show this and as a result elut3 = eventLoopUtilization(elu1) results in elu3.active === 0. Moving capturing of t after capturing t, just before the spin look avoids this. Similar if OS decides to shedule a different process between getting the total elu from start and the diff elu showing the spin loop the check to verify that total active time is long then the spin loop fails. Exchanging these statements avoids this race. PR-URL: #36028 Fixes: #35309 Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Rich Trott <rtrott@gmail.com>
Fix two races in test-performance-eventlooputil resulting in a flaky test. elu1 was capture after start time t from spin look. If OS descides to reschedule the process after capturing t but before getting elu for >=50ms the spin loop is actually a nop. elu1 doesn't show this and as a result elut3 = eventLoopUtilization(elu1) results in elu3.active === 0. Moving capturing of t after capturing t, just before the spin look avoids this. Similar if OS decides to shedule a different process between getting the total elu from start and the diff elu showing the spin loop the check to verify that total active time is long then the spin loop fails. Exchanging these statements avoids this race. PR-URL: #36028 Fixes: #35309 Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Rich Trott <rtrott@gmail.com>
https://ci.nodejs.org/job/node-test-commit-linux/37149/nodes=centos7-64-gcc8/console
The text was updated successfully, but these errors were encountered: