assertion failure in src/node_worker.cc with large number of workers #31614
Comments
In the assertion failure list, I see this too.
/cc @nodejs/libuv
Does the process run out of file descriptors? The first assertion is node's problem; it assumes that
How did you trigger the second one? Can you get a backtrace?
(gdb) where
#0 0x00007ffff6e04377 in raise () from /lib64/libc.so.6
#1 0x00007ffff6e05a68 in abort () from /lib64/libc.so.6
#2 0x0000000000a9dea1 in node::Abort() ()
#3 0x0000000000a9df17 in node::Assert(node::AssertionInfo const&) ()
#4 0x0000000000b3ad12 in node::worker::Worker::Run() ()
#5 0x0000000000b3b880 in node::worker::Worker::StartThread(v8::FunctionCallbackInfo<v8::Value> const&)::{lambda(void*)#1}::_FUN(void*) ()
#6 0x00007ffff71a3ea5 in start_thread () from /lib64/libpthread.so.0
#7 0x00007ffff6ecc8cd in clone () from /lib64/libc.so.6
@bnoordhuis - thanks! The first assertion has this backtrace; I was not able to obtain one for the second. It looks like the second one is always triggered by a worker thread while the main thread is already processing the first assertion failure, because it occurs only when the first one is present (this is my guess, no proof). Running out of file descriptors looks like a possibility; I will debug from that angle.
If you turn on coredumps, you should be able to get a backtrace. You may need to select the right thread in gdb or just run
(gdb) where
#0 0x00007f92514c7377 in raise () from /lib64/libc.so.6
#1 0x00007f92514c8a68 in abort () from /lib64/libc.so.6
#2 0x00007f92514c0196 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007f92514c0242 in __assert_fail () from /lib64/libc.so.6
#4 0x0000000000a03319 in uv__close_nocheckstdio (fd=-24) at ../deps/uv/src/unix/core.c:556
#5 0x00000000013cd271 in uv__close_nocheckstdio (fd=fd@entry=-24) at ../deps/uv/src/unix/core.c:563
#6 0x00000000013de2f7 in uv__read_proc_meminfo (what=what@entry=0x20be946 "MemTotal:")
at ../deps/uv/src/unix/linux-core.c:1016
#7 0x00000000013df7f3 in uv_get_total_memory () at ../deps/uv/src/unix/linux-core.c:1043
#8 0x0000000000a0dc05 in node::SetIsolateCreateParamsForNode(v8::Isolate::CreateParams*) ()
#9 0x0000000000b3a3e7 in node::worker::Worker::Run() ()
#10 0x0000000000b3b930 in node::worker::Worker::StartThread(v8::FunctionCallbackInfo<v8::Value> const&)::{lambda(void*)#1}::_FUN(void*) ()
#11 0x00007f9251866ea5 in start_thread () from /lib64/libpthread.so.0
#12 0x00007f925158f8cd in clone () from /lib64/libc.so.6
This is the stack trace for the second assertion. It looks like the loop creation failed, yet the worker creation sequence still progressed this far, wrongly? When we run out of descriptors, the error is
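A side note on `fd=-24` in frames #4 and #5: libuv generally reports failures as negated errno values, and on Linux errno 24 is `EMFILE` (too many open files), which would fit the file-descriptor theory above. The sketch below is only an illustration of that reading; `errnoNamesFor` is a hypothetical helper, not a Node.js or libuv API, and it assumes the negative "fd" really is a negated errno.

```javascript
const os = require('os');

// Hypothetical helper (not part of Node.js or libuv): given a negative value
// that showed up where a file descriptor was expected, list the errno names
// whose numeric value matches its magnitude on the current platform.
function errnoNamesFor(value) {
  if (!Number.isInteger(value) || value >= 0) return [];
  const code = -value;
  return Object.entries(os.constants.errno)
    .filter(([, num]) => num === code)
    .map(([name]) => name);
}

// The value seen in the backtrace above.
console.log(errnoNamesFor(-24));
```

If the list contains `EMFILE` on the machine that produced the core dump, the open of `/proc/meminfo` most likely failed because the process had hit its descriptor limit, and the failed result was then passed on to the close call.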
On the other hand, will #31621 in its current form address this one too?
@gireeshpunathil No, #31621 is unrelated to that second failure, but libuv/libuv#2645 should have fixed that assertion (
@addaleax - that is really promising! My copy does not have it; I will check and confirm. [Since the failure reproduces consistently, the validation will be easy.]
Instead of hard asserting, throw a runtime error, which is more consumable.
Fixes: nodejs#31614
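To make "more consumable" concrete, here is a rough sketch of what a caller could do once initialization failures arrive as errors rather than aborts. It assumes the runtime error is delivered through the worker's `error` event (the thread does not say how it surfaces), and `spawnWorkers` is a hypothetical helper written for this illustration only.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical helper: start `count` workers from an inline script and report,
// per worker, whether it finished cleanly or failed.
function spawnWorkers(count, source) {
  const pending = [];
  for (let i = 0; i < count; i++) {
    pending.push(new Promise((resolve) => {
      let worker;
      try {
        worker = new Worker(source, { eval: true });
      } catch (err) {
        // Synchronous failures from the constructor.
        resolve({ ok: false, error: err });
        return;
      }
      // Asynchronous failures (assumed here to include failed initialization).
      worker.once('error', (err) => resolve({ ok: false, error: err }));
      // A clean exit reports code 0; if 'error' fired first, this is a no-op.
      worker.once('exit', (code) => resolve({ ok: code === 0, code }));
    }));
  }
  return Promise.all(pending);
}

spawnWorkers(4, '').then((results) => {
  const failed = results.filter((r) => !r.ok);
  console.log(`${results.length - failed.length} ok, ${failed.length} failed`);
  for (const f of failed) {
    console.error(f.error ? f.error.code || f.error.message : `exit code ${f.code}`);
  }
});
```

With a hard assertion the whole process aborts and none of this code gets a chance to run; with an error, the application can log the failure, back off, or cap the number of workers.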
I was debugging #23277 and came across this:
$ cat bar.js
$ node --max-old-space-size=100000 bar
I guess this has to do with libuv failure due to lack of memory, but can this be better handled?
/cc @nodejs/workers