Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

child process : unexpected exit code when process is out of memory #12271

Closed
abenhamdine opened this issue Apr 7, 2017 · 20 comments
Closed

child process : unexpected exit code when process is out of memory #12271

abenhamdine opened this issue Apr 7, 2017 · 20 comments
Labels
memory Issues and PRs related to the memory management or memory footprint. process Issues and PRs related to the process subsystem. wip Issues and PRs that are still a work in progress.

Comments

@abenhamdine
Copy link

abenhamdine commented Apr 7, 2017

[edit by @refack]
suggested fix #12271 (comment)
[end edit]

Version: 7.8.0
Platform: Windows 10 64
Subsystem: Child processes

I fork some child processes to do some intensive calculations.
Because I likely have some memory leaks, I pass the following arguments to the forked processes :

 execArgv: process.execArgv.concat(['--expose-gc', '--max-executable-size=192', '--max-old-space-size=256', '--max-semi-space-size=2'])

and as expected, the processes run out of memory after a given time, with the following error :

<--- Last few GCs --->

[13196:000001C7DD2B3520]   120623 ms: Mark-sweep 249.8 (275.1) -> 249.8 (275.1) MB, 367.6 / 0.2 ms  deserialize GC in old space requested
[13196:000001C7DD2B3520]   121001 ms: Mark-sweep 249.8 (275.1) -> 249.8 (275.1) MB, 376.2 / 0.4 ms  deserialize GC in old space requested
[13196:000001C7DD2B3520]   121428 ms: Mark-sweep 249.8 (275.1) -> 249.8 (275.1) MB, 426.4 / 0.4 ms  deserialize GC in old space requested


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 000001994CFA66A1 <JS Object>
    1: createContext [vm.js:~41] [pc=0000029DC91AFC50](this=00000344694A26E9 <an Object with map 0000015E4FEAACB9>,sandbox=00000164890109D1 <an Object with map 000002F68947DEB1>)
    2: createSandbox(aka createSandbox) [C:\node-projects\payroll-app\server\payrollEngine\PayrollEngine.js:~750] [pc=0000029DC91C93C4](this=000001C562102201 <null>,bul=000000F3E1CB4639 <an Object with map 000000BA66...

FATAL ERROR: deserialize context Allocation failed - process out of memory

DEBUG 17:02:28 : manageCalcProcess.onChildExit() : child process 13196 exited with code 3

What has surprised me is that the child process exits with a code 3.

Per docs, code 3 seems pretty vague, and not related to memory shortage.

I wonder if it's intentional, and if it would be more useful to have a specific exit code in that case.

Edit : Perhaps it's because of the vm.createContext() which mess the code ?

@abenhamdine abenhamdine changed the title child process : exit code when process is out of memory child process : unexpected exit code when process is out of memory Apr 7, 2017
@Fishrock123 Fishrock123 added process Issues and PRs related to the process subsystem. question Issues that look for answers. labels Apr 7, 2017
@vsemozhetbyt vsemozhetbyt added the memory Issues and PRs related to the memory management or memory footprint. label Apr 7, 2017
@Trott
Copy link
Member

Trott commented May 27, 2017

@bnoordhuis @cjihrig @indutny

@richardlau
Copy link
Member

Suspect this is #12823 (comment)

@bnoordhuis
Copy link
Member

Yes, that sounds about right. Node kills itself with a SIGABRT signal on out-of-memory conditions. On UNIX systems, you get an exit code of 128 + signal 6 = 134 but on Windows it's exit code 3, presumably due to how the CRT emulates raise(SIGABRT).

@gireeshpunathil
Copy link
Member

Agree that Windows aborts map with exit code 3 - assigned by the CRT, not node.

At the same time, as @abenhamdine pointed out, the exit code listing does not seem to reflect this properly. Currently number 3 is defined for parse error. Either this needs to cover parse error | windows aborts, or signalled terminations in windows need to coin a different exit code to reflect the fact.

@gireeshpunathil
Copy link
Member

/cc @refack

@refack
Copy link
Contributor

refack commented May 28, 2017

@bnoordhuis
Copy link
Member

That doesn't seem to let you override the exit code though. Perhaps the simplest fix is this:

diff --git a/src/util.h b/src/util.h
index 175db0c..7791304 100644
--- a/src/util.h
+++ b/src/util.h
@@ -95,7 +95,7 @@ template <typename T> using remove_reference = std::remove_reference<T>;
 
 // Windows 8+ does not like abort() in Release mode
 #ifdef _WIN32
-#define ABORT_NO_BACKTRACE() raise(SIGABRT)
+#define ABORT_NO_BACKTRACE() _exit(134)
 #else
 #define ABORT_NO_BACKTRACE() abort()
 #endif

@gireeshpunathil
Copy link
Member

@bnoordhuis - does this have the risk of node silently exiting on assertion failures? no coredumps and no messages?

@bnoordhuis
Copy link
Member

I don't think so. The error message is printed before ABORT_NO_BACKTRACE() is invoked and core dumps aren't a thing on Windows, right?

@refack
Copy link
Contributor

refack commented May 29, 2017

I did a similar thing for @indutny for his mini-test

@gireeshpunathil
Copy link
Member

@bnoordhuis - thanks. I did not know that when node aborts in Windows, it does without a dump! Probably due to the JVM background I took it for granted that it does, just like any other OS.

Looking at what happens in Java, I see that they trap SIGABRT within the process and invoke MiniDumpWriteDump to capture one, and then unset the trap, re-send the signal to let it die.

That raises another question: does that mean that --abort-on-uncaught-exception is redundant in Windows? uncaught exception anyways captures the callstack, and with this flag, what is being provided additionally?

@refack
Copy link
Contributor

refack commented May 29, 2017

I did not know that when node aborts in Windows, it does without a dump!

I'm not sure that's totally correct. From what I saw abort() calls drwatson.exe who's behaviur is configurable. I assume what Java is doing is to override that, and make sure there's a minidump.

@refack refack added the good first issue Issues that are suitable for first-time contributors. label Jun 4, 2017
@refack
Copy link
Contributor

refack commented Jun 4, 2017

Since #12271 (comment) is a straightforward solution, I'm marking this a a good first contribution. Also added link to said comment in OP.

@jkantr
Copy link
Contributor

jkantr commented Jun 21, 2017

I see this has been here unreferenced for awhile... and technically would be a first contribution to core for me :) Can I take it up?

@jkantr
Copy link
Contributor

jkantr commented Jun 21, 2017

@refack it would appear actually this change would mean the common test API would have to be altered, specifically nodeProcessAborted():

https://github.com/nodejs/node/blob/76340e3f1007998c7cb9d69fa1a42d42663ca6c2/test/common/index.js#L609,L615

As a bonus, altering nodeProcessAborted will also allow to write a proper test for this as well I believe heh. But maybe it requires some more discussion?

I'm up for it, just making sure I'm barking down the right path :)

@refack
Copy link
Contributor

refack commented Jun 21, 2017

actually this change would mean the common test API would have to be altered, specifically nodeProcessAborted():

Makes sense. Good catch.

@refack refack added wip Issues and PRs that are still a work in progress. and removed good first issue Issues that are suitable for first-time contributors. labels Jun 21, 2017
@refack
Copy link
Contributor

refack commented Jun 21, 2017

@jkantr, I've flagged this as in progress indicating you "claimed it".

@jkantr
Copy link
Contributor

jkantr commented Jun 21, 2017

Sounds good, should be able to cobble something together in a few

@Trott Trott removed the question Issues that look for answers. label Aug 7, 2017
refack pushed a commit to jkantr/node that referenced this issue Aug 8, 2017
Raising SIGABRT is handled in the CRT in windows, calling _exit()
with ambiguous code "3" by default.

This adjustment to the abort behavior gives a more sane exit code
on abort, by calling _exit directly with code 134.

PR-URL: nodejs#13947
Fixes: nodejs#12271
Refs: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/abort
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
@nicojs
Copy link

nicojs commented Jul 31, 2018

Question: how could I detect that a child process ran out-of-memory. Should I check on the exit code (134) or should I scan the stderr output for "JavaScript heap out of memory"? Or is there a better way?

@jlenon7
Copy link
Member

jlenon7 commented Sep 22, 2021

@nicojs same doubt here. Nico, can you please explain how did you get this exit code 134 ? When I listen to the event child.on('exit', (code) => console.log(code)) I only got the exit code 0 and signal SIGABRT.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
memory Issues and PRs related to the memory management or memory footprint. process Issues and PRs related to the process subsystem. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

No branches or pull requests