-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
process,worker: fix process.exitCode handling for fatalException #21739
Conversation
@@ -475,6 +475,7 @@ | |||
try { | |||
if (!process._exiting) { | |||
process._exiting = true; | |||
process.exitCode = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If user code happens to set process.exitCode
prior to this, this will override the user provided value. It should likely only set the value if it is not already set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do believe this is not the case, as uncaughtException has its own code 1, so it should be correct to override whatever was the code before. Though this is probably a semver-major because of this, but that shouldn't be a problem imo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm... I can definitely see the logic on it but I definitely don't like overriding user provided values unexpectedly. If we go with this, can I ask that a note be added to the documentation along with a code comment indicating why it's ok to silently override any user provided value here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine with me. But where should I put a documentation?
Also, it is already noted that unhandledException has code 1, so this will basically enforce this, so that any previously set code will not be used, only the ones set in 'unhandledException' callback, 'exit' callback etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But where should I put a documentation?
In the description for the uncaughtException
event in docs/api/process.md
. The documentation currently does not say anything about the process exit code. A note there that, should the event not be handled, the process will exit with exitCode = 1 even if the user had previously set a process.exitCode
value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(btw, I see this as a limitation in the current documentation and not something that is introduced by this PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
even if the user had previously set a process.exitCode value.
Oh, I thought that's it's a given that if some new error happen exitCode will be replaced with the value of the error and just thought that it was undocumented. The fact that you wasn't able to change the code is indeed a strange thing. Anyway, I'll add the doc soon.
src/node.cc
Outdated
// read it again, otherwise use default for uncaughtException 1 | ||
Local<String> exit_code = env->exit_code_string(); | ||
int code = process_object->Get(env->context(), exit_code) | ||
.ToLocalChecked()->Int32Value(env->context()).ToChecked(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: 4 space indent in C/C++ code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, thx. I thought linter would catch this 🤔.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lundibundi Sadly, our C++ linter is very outdated and not very actively maintained … we do have https://github.com/nodejs/node/blob/master/CPP_STYLE_GUIDE.md, if it helps?
src/node.cc
Outdated
Local<String> exit_code = env->exit_code_string(); | ||
int code = process_object->Get(env->context(), exit_code) | ||
.ToLocalChecked()->Int32Value(env->context()).ToChecked(); | ||
if (code == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... this should likely only emit exit(1)
if code
is undefined
. It should be legitimate for an uncaught exception handler to set process.exitCode = 0
and have that be the actual exit code.
Also, I kind of don't like code duplication in both tests (for process and worker), so will it be okay to extract test cases (child1,2,3 etc) into test/fixtures? |
9109a57
to
1acb425
Compare
src/node.cc
Outdated
exit(1); | ||
} else { | ||
exit(code.ToLocalChecked()->Int32Value(env->context()).ToChecked()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This crashes for weird edge cases like this:
process.on("exit", () => {
process.exitCode = { [Symbol.toPrimitive]() { throw new Error(); } };
})
Maybe exit with 1 if code.IsEmpty() || !code->IsInt32()
and use a direction conversion to v8::Int32
here, like this:
Line 2134 in 1f16758
*result = val.As<v8::Int32>()->Value(); |
Does that sound okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that's pretty neat. Though, one question, is it safe to just Local<Value> code = process_object->Get(env->context(), exit_code).ToLocalChecked();
? Or should I do it step by step (empty check, then convert and int32 check).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lundibundi That operation can still fail if userland code were to install a getter that throws … obviously not something it should do, but yes, I’d prefer to do it step-by-step.
(Generally, we’ll need to shift a lot of our error handling code away from using ToLocalChecked()
/ToChecked()
, because worker.terminate()
can make just about anything throw that calls JS code…)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks, I'll update it soon.
src/node.cc
Outdated
|
||
Local<String> exit_code = env->exit_code_string(); | ||
int code = process_object->Get(env->context(), exit_code).ToLocalChecked() | ||
->Int32Value(env->context()).ToChecked(); | ||
|
||
if (exiting->IsTrue()) { | ||
return code; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you when you say the test was failing before, but do you know why this was happening? Does it point to a larger issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand the issue here is that there is no synchronization between worker-exit and worker-fatalException-exit (links at the end of the first post) and as a result we have
- FatalException -> fatal callback called in js land -> emit exit from there
- worker undestands that it is finishing, exits its loop and calls
EmitExit
Therefore we get 2 calls to 'exit' listeners. I'm not sure if 'just muting' second exit event is a good solution, hence my note in the end.
I believe you when you say the test was failing before
I'm not sure what you mean? Also I forgot to add a description of a test case for the bug (it is here in the tests and I'll update the first post soon)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be honest ,you are really so konwledgable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lundibundi I see, that makes sense. I think I’d have a slight preference to handle this using a flag on the Environment
object, since process._exiting
can just be modified by userland code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's surely better. I'll try to implement that, though there may be some problems as _exiting
is actually set from _fatalException
handler in js, so with _exiting
it is set if no handler and **before** 'exit' callback
, but with env it will be set if no handler and **after** all 'exit' callbacks
due to the way fatalException is handled in js.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax Well, as I was afraid of, worker finish coincides with FatalException (after FatalException calls js handler, worker understands that it has finished and starts its own sequence). So either we
- change the way they work with each other (no ideas here, add exitCode lock in env and block worker until FatalExeption finish?)
- change how _fatalException work (maybe split in pre that will check for uncaughtException handler and actually success and failure routes, though this looks kind of over the top and introduce additional cpp-js calls)
- maybe check for 'uncaughtException' listeners in the cpp-land beforehand if that's possible, but this one looks like a hack and not a solution
- or use
_exiting
for now
Obviously the latter looks okay, but this surely indicates the problem and may need additional research.
P.s: In order to investigate this I added a few logs in code (as Debug were not enough) and run a version on test-worker-uncaught-exception.js
without asserts, here is the output
Fatal exception calling js handler
Worker exited main loop
Worker emit exit
EmitExit start
EmitExit emit 'exit'
on error received: foo
Exit callback called twice in worker
exited with 1
EmitExit start
EmitExit emit 'exit'
Aclually I added env->exiting and appropriate checks/sets in EmitExit, FatalException here but the order of execution prevents it from running correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lundibundi I think there’s a bigger bug here – in test-worker-uncaught-exception.js
, the worker does not exit because of the uncaught exception, but because there’s no work to do after it. :/
I think we want a process.exit()
call at the end of the if (!caught) {
block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax oh, that is indeed true, great catch thanks.
I think adding a Timeout? with assert.fail should be good enough to ensure that the worker exits. Though I'm not sure if timeout is a good idea (flakiness and such), is there any better way?
Yeah process.exit()
seems to work just fine there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lundibundi We could try to start some async operation before the uncaught throw
and see whether it’s executed? That latter part might be tricky, but that would be the general idea, I think
@lundibundi You can do that if it makes the most sense to you, yes :) |
1acb425
to
6ce385b
Compare
src/node.cc
Outdated
Local<String> exit_code = env->exit_code_string(); | ||
MaybeLocal<Value> maybe_code = | ||
process_object->Get(env->context(), exit_code); | ||
if (maybe_code.IsEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not really important, but I’m kinda hoping you’ll stick around after your currently open PRs, so as a tip: I personally find it slightly annoying to first have a MaybeLocal<Something>
and then conditionally converting it to a Local<Value>
.
What I tend to do then is something along these lines:
Local<Value> value;
if (!object->Get(context, key).ToLocal(&value) || !value->IsInt32())
exit(1);
exit(value.As<Int32>()->Value());
I know it’s not an exact match to this situation but I hope it’s clear enough what I mean. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, MaybeLocal looks kind of limited. Though I like this neat way of using sce.
src/node.cc
Outdated
|
||
Local<String> exit_code = env->exit_code_string(); | ||
int code = process_object->Get(env->context(), exit_code).ToLocalChecked() | ||
->Int32Value(env->context()).ToChecked(); | ||
|
||
if (exiting->IsTrue()) { | ||
return code; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lundibundi I see, that makes sense. I think I’d have a slight preference to handle this using a flag on the Environment
object, since process._exiting
can just be modified by userland code.
6ce385b
to
773c9d8
Compare
773c9d8
to
5eb1a23
Compare
@addaleax I've cleaned up the commits. I've left |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really digging what you put together here. Thank you a lot!
Would you be interested in joining the @nodejs/workers team? It doesn’t come with any responsibilities, it’s just people who get notified for issues/PRs related to Worker code. :)
* set process.exitCode before calling 'exit' handlers so that there will not be a situation where process.exitCode !== code in 'exit' callback during uncaughtException handling * don't ignore process.exitCode set in 'exit' callback when failed with uncaughtException and there is no uncaughtException listener
Now we set it before the exit event, this allows to change the code inside the exit event (event with uncaughtException), therefore setting exitCode in worker is no longer needed.
Previously even after uncaught exception the worker would continue to execute until there is no more work to do.
5eb1a23
to
6a993f1
Compare
@addaleax Thanks for your support 😃. |
Landed in 998f9ff...7c2925e, thanks for the work! 🎉 |
* set process.exitCode before calling 'exit' handlers so that there will not be a situation where process.exitCode !== code in 'exit' callback during uncaughtException handling * don't ignore process.exitCode set in 'exit' callback when failed with uncaughtException and there is no uncaughtException listener PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Previously even after uncaught exception the worker would continue to execute until there is no more work to do. PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
* set process.exitCode before calling 'exit' handlers so that there will not be a situation where process.exitCode !== code in 'exit' callback during uncaughtException handling * don't ignore process.exitCode set in 'exit' callback when failed with uncaughtException and there is no uncaughtException listener PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Previously even after uncaught exception the worker would continue to execute until there is no more work to do. PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
set process.exitCode before calling 'exit' handlers so that there will
not be a situation where process.exitCode !== code in 'exit' callback
during uncaughtException handling
don't ignore process.exitCode set in 'exit' callback when failed with
uncaughtException and there is no uncaughtException listener
fix duplicate call of 'exit' callbacks in case of uncaught exception in worker
Checklist
make -j4 test
(UNIX) passesThis PR depends on #21713 for workers test.I've found a bug in workers where in case on uncaughtException 'exit' event is actually called twice. I think this is due to both, having
_fatalException
called presumably viaFatalException()
and usual exit from worker as I understand here which results in 2 calls to 'exit' callbacks.I have fixed it with second commit, but I'm not sure if it's a correct fix, so awaiting feedback.
Edit: The case for the bug is when worker exits with unhandled exception and there is an 'exit' event listener and no 'unhandledException' listeners. This way worker's local 'exit' callbacks will be called twice. (see test-worker-uncaught-exception.js). Current node 10.6.0 always fails on this.
/cc @addaleax