-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: fix flaky watchFile() test on Windows #21694
Conversation
Stress test on current master: https://ci.nodejs.org/job/node-stress-single-test/1930/ Hopefully it fails (thus demonstrating the issue and suggesting that a clean run with this PR would be a sign that it addresses it). But it might not. I'm not sure how frequent this race condition is. But there are other ways to trigger it if the above doesn't work. |
Another solution would be to remove the assertions, as the original bug was that the callback didn't run at all. Simply confirming that the callback is invoked is enough to indicate that the bug has been fixed. |
Stress test didn't work and I think it's because we don't have a support configuration. @nodejs/build Can we get win10 + vs2017 added and (if it's no longer supported) remove win10 + VS2015 from node-stress-single test? https://ci.nodejs.org/job/node-stress-single-test/1930/nodes=win10-1p-vs2015/console 16:58:15 [win10-1p-vs2015] $ cmd /c call C:\Users\ADMINI~1\AppData\Local\Temp\jenkins8802136331077346752.bat
16:58:15
16:58:15 c:\workspace\node-stress-single-test\nodes\win10-1p-vs2015>for /F "delims=. tokens=1" %F in ('call python tools\getnodeversion.py') do (set "NODE_MAJOR_VERSION=%F" )
16:58:15
16:58:15 c:\workspace\node-stress-single-test\nodes\win10-1p-vs2015>(set "NODE_MAJOR_VERSION=11" )
16:58:15
16:58:15 c:\workspace\node-stress-single-test\nodes\win10-1p-vs2015>if 11 GTR 5 if "vs2015" == "vs2013" (
16:58:15 echo Not building v11 with VS2013 as it is only supported up to v5.x
16:58:15 exit /b 0
16:58:15 )
16:58:15
16:58:15 c:\workspace\node-stress-single-test\nodes\win10-1p-vs2015>if 11 LSS 8 if "vs2015" == "vs2017" (
16:58:15 echo Not building v11 with VS2017 as it is only supported starting on v8.x
16:58:15 exit /b 0
16:58:15 )
16:58:15
16:58:15 c:\workspace\node-stress-single-test\nodes\win10-1p-vs2015>call vcbuild.bat release nosign x64
16:58:15 Note: vcbuild no longer signs by default. "nosign" is redundant.
16:58:15 Looking for Python 2.x
16:58:16 Looking for NASM
16:58:16 Looking for Visual Studio 2017
16:58:16 Failed to find a suitable Visual Studio installation.
16:58:16 Try to run in a "Developer Command Prompt" or consult
16:58:16 https://github.com/nodejs/node/blob/master/BUILDING.md#windows-1
16:58:16 Build step 'Conditional steps (multiple)' marked build as failure
16:58:16 Notifying upstream projects of job completion
16:58:16 Finished: FAILURE |
Failed once in 19200 runs at https://ci.nodejs.org/job/node-stress-single-test/1948/nodes=win10-vs2017/consoleText. So we'll need a heck of a lot of runs to show reliable failure and then clean success with a different approach. |
OK, 13 failures in 51200 runs. That shows the master branch (with the test moved to parallel) fails from time to time: https://ci.nodejs.org/job/node-stress-single-test/1951/nodes=win10-vs2017/consoleText Now to run a stress test on a fix... |
Stress test against this pull request: https://ci.nodejs.org/job/node-stress-single-test/1956/nodes=win10-vs2017/ |
Stress test was successful! Remaining tasks before this can land:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rubberstamp LGTM.
The regression that test-fs-watch-file-enoent-after-deletion was written to test for involves whether or not the callback runs or not. Checking what the file watcher reports unfortunately is subject to race conditions on Windows (and possibly elsewhere) because the file watcher returns control to the event loop before it may be receiving data from the OS. So remove those assertions. The test still checks what it is supposed to check, but is no longer subject to race conditions.
Confirmed that this version of the test (rewritten to remove ES6-isms) passes in 0.8.10 and never exits (so fails) in 0.8.9. This is what we want/expect. 'use strict';
// This test is broken in 0.8.9--it never exits.
// This test passes in 0.8.10 and more recent because the bug in 0.8.9 is fixed.
// Make sure the deletion event gets reported in the following scenario:
// 1. Watch a file.
// 2. The initial stat() goes okay.
// 3. Something deletes the watched file.
// 4. The second stat() fails with ENOENT.
// The second stat() translates into the first 'change' event but a logic error
// stopped it from getting emitted.
// https://github.com/nodejs/node-v0.x-archive/issues/4027
var path = require('path');
var fs = require('fs');
var filename = path.join('/tmp', 'watched');
try {
fs.unlinkSync(filename);
} catch (e) {
// swallow error, it's probably just that the file doesn't exist because
// you haven't run this test before
}
fs.writeFileSync(filename, 'quis custodiet ipsos custodes');
fs.watchFile(filename, { interval: 50 }, function(curr, prev) {
fs.unwatchFile(filename);
});
setTimeout(fs.unlinkSync, 300, filename); |
Same results with 0.8.9 and 0.8.10 with the timer stripped out. Woo hoo! 'use strict';
// This test is broken in 0.8.9--it never exits.
// This test passes in 0.8.10 and more recent because the bug in 0.8.9 is fixed.
// Make sure the deletion event gets reported in the following scenario:
// 1. Watch a file.
// 2. The initial stat() goes okay.
// 3. Something deletes the watched file.
// 4. The second stat() fails with ENOENT.
// The second stat() translates into the first 'change' event but a logic error
// stopped it from getting emitted.
// https://github.com/nodejs/node-v0.x-archive/issues/4027
var path = require('path');
var fs = require('fs');
var filename = path.join('/tmp', 'watched');
try {
fs.unlinkSync(filename);
} catch (e) {
// swallow error, it's probably just that the file doesn't exist because
// you haven't run this test before
}
fs.writeFileSync(filename, 'quis custodiet ipsos custodes');
fs.watchFile(filename, { interval: 50 }, function(curr, prev) {
fs.unwatchFile(filename);
});
fs.unlinkSync(filename); |
The timer was there to address a race condition that has been removed from the test. Remove it.
Stress test on the new version that has no timer. https://ci.nodejs.org/job/node-stress-single-test/1959/nodes=win10-vs2017/ |
Stress test without the timer came back green! 🎉 |
ping @apapirovski |
Note to self (or whoever lands this): add |
The regression that test-fs-watch-file-enoent-after-deletion was written to test for involves whether or not the callback runs or not. Checking what the file watcher reports unfortunately is subject to race conditions on Windows (and possibly elsewhere) because the file watcher returns control to the event loop before it may be receiving data from the OS. So remove those assertions. The test still checks what it is supposed to check, but is no longer subject to race conditions. Fixes: nodejs#21692 PR-URL: nodejs#21694 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
The timer was there to address a race condition that has been removed from the test. Remove it. PR-URL: nodejs#21694 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Landed in 39977db...6d60c93 |
The regression that test-fs-watch-file-enoent-after-deletion was written to test for involves whether or not the callback runs or not. Checking what the file watcher reports unfortunately is subject to race conditions on Windows (and possibly elsewhere) because the file watcher returns control to the event loop before it may be receiving data from the OS. So remove those assertions. The test still checks what it is supposed to check, but is no longer subject to race conditions. Fixes: #21692 PR-URL: #21694 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
The timer was there to address a race condition that has been removed from the test. Remove it. PR-URL: #21694 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Detect race condition and retry test when it occurs.
Confirmed that a version of the test (modified to remove ES6-isms)
fails on 0.8.9 (which is expected as that was the last version with the
bug that the test was written to detect) and succeeds on 0.8.10.
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes