
SIGSEGV SEGV_ACCERR on node-chakracore #1633

Closed
therealkenc opened this issue Jan 25, 2017 · 13 comments

@therealkenc (Collaborator) commented Jan 25, 2017

I caught the first-anniversary ChakraCore blog post (time flies) and took another look at how it fares on WSL since last time. I think this might be related to #286; if so, dup away. But there's no PROT_GROWSDOWN here, so, given there's no EINVAL, I'm wondering whether their use case is expected (assumed?) to work.

Repro steps here are easy. Confirmed it's okay on native. Failing sequence on the offending thread is:

mmap(NULL, 131072, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1ea7ec0000
mmap(0x7f1ea7ec0000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f1ea7ec0000
mprotect(0x7f1ea7ec0000, 4096, PROT_EXEC) = 0
mprotect(0x7f1ea7ec0000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC) = 0
mprotect(0x7f1ea7ec0000, 4096, PROT_EXEC) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7f1ea7ec0fc0} ---
+++ killed by SIGSEGV (core dumped) +++
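
For reference, here is a minimal standalone C sketch of that syscall sequence (my own reconstruction for illustration, not ChakraCore's actual allocator code). The final read of the execute-only page is the kind of access that, per the diagnosis further down the thread, faulted with SEGV_ACCERR on WSL while succeeding on native x86-64 Linux, where PROT_EXEC has historically implied readability:

/* Sketch of the failing sequence above (a reconstruction, not ChakraCore code).
 * Build with: cc -o repro repro.c */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Reserve 128 KiB of address space with no access rights. */
    void *base = mmap(NULL, 131072, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) { perror("mmap reserve"); return 1; }

    /* Commit the first page read/write at a fixed address inside the reservation. */
    if (mmap(base, 4096, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0) == MAP_FAILED) {
        perror("mmap commit");
        return 1;
    }

    /* Mirror the protection changes from the strace: X, then RWX, then X again. */
    if (mprotect(base, 4096, PROT_EXEC) != 0 ||
        mprotect(base, 4096, PROT_READ | PROT_WRITE | PROT_EXEC) != 0 ||
        mprotect(base, 4096, PROT_EXEC) != 0) {
        perror("mprotect");
        return 1;
    }

    /* Touch the now execute-only page near the offset of the faulting address
     * (...fc0). On native x86-64 this read typically succeeds because PROT_EXEC
     * has implied read access on that hardware; the report above is consistent
     * with WSL rejecting it with SIGSEGV/SEGV_ACCERR at the time. */
    unsigned char c = ((volatile unsigned char *)base)[0xfc0];
    printf("read 0x%02x from the execute-only page without faulting\n", c);
    return 0;
}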
@stehufntdev (Collaborator):

Thanks for the report. Enabling ChakraCore is on our radar, but we haven't had a chance to investigate all of the issues yet. Can you please confirm which part of the repro steps is failing? I saw that one of the steps was running the unit tests, and we know there are failures there.

@therealkenc (Collaborator, Author) commented Jan 27, 2017

Sorry, I should have been clearer on the repro instead of just linking. I didn't even run the unit tests; I had no illusions about those failing, at least in part. The repro is just:

node -e "console.log('Hello from Node.js ' + process.jsEngine)"

You don't even need to build it; just download yesterday's nightly. Mostly I was hoping the strace above would point to a known gap in support. Here's the backtrace FWIW:

#0  0x00007fffff1f1d80 in __register_frame ()
   from /lib/x86_64-linux-gnu/libgcc_s.so.1
#1  0x00000000011bf892 in PDataManager::RegisterPdata(_RUNTIME_FUNCTION*, unsigned long, unsigned long, void**, unsigned int, unsigned int) ()
#2  0x00000000011a1dd1 in InterpreterThunkEmitter::NewThunkBlock() ()
#3  0x00000000011a1bba in InterpreterThunkEmitter::GetNextThunk(void**) ()
#4  0x0000000000d55e9f in Js::FunctionBody::GenerateDynamicInterpreterThunk()
    ()
#5  0x0000000000d55f70 in Js::FunctionBody::EnsureDynamicInterpreterThunk(Js::FunctionEntryPointInfo*) ()
#6  0x0000000000e44275 in Js::InterpreterStackFrame::EnsureDynamicInterpreterThunk(Js::ScriptFunction*) ()
#7  0x0000000000ef947c in Js::InterpreterStackFrame::DelayDynamicInterpreterThunk(Js::RecyclableObject*, Js::CallInfo, ...) ()
#8  0x00000000010a5d5e in amd64_CallFunction ()
#9  0x0000000000cfc5a8 in JsCallFunction ()
#10 0x0000000000bc91ce in jsrt::CallFunction<_JsErrorCode (void*, void**, unsigned short, void**)>(_JsErrorCode ( const&)(void*, void**, unsigned short, void**), void*, void**) (api=
    @0xcfc220: {_JsErrorCode (void *, void **, unsigned short, void **)} 0xcfc220 <JsCallFunction>, func=0x7ff7fb218240, result=0x7ffffffdc6c0)
    at ../deps/chakrashim/src/jsrtutils.h:407
#11 0x0000000000bc9157 in jsrt::CallFunction (func=0x7ff7fb218240,
    fffdc6c0) at ../deps/chakrashim/src/jsrtutils.h:428

@stehufntdev (Collaborator):

Thanks, I was able to get a local repro and have some leads on the issue. I'll report back when I've narrowed down the cause.

@stehufntdev (Collaborator) commented Jan 31, 2017

The SIGSEGV above was caused by WSL preventing explicit reads of an execute-only memory region during demand commit. After fixing that up I was able to run the example, and the fix should be out to Insider builds soon. If you want to work around this until the fix makes it out, you can adjust the overcommit setting with echo 2 > /proc/sys/vm/overcommit_memory.

I also hit another issue here where very large PROT_NONE memory regions are handled differently; I'll look at addressing that separately.

@therealkenc (Collaborator, Author):

Well that was quick. 🏆

@therealkenc (Collaborator, Author):

mmap(NULL, 34359738368, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f340f460000

Right; that's still in there. Said memory region is "very large" if you have a 2GB Atom notebook. It's somewhat large if you have a 64GB consumer desktop, like me. And it's not all that large if you are the #1115 guy. Either way, it's a bug. Maybe mention that to the Chakra guys if you bump into them in the hall.

@stehufntdev (Collaborator):

Thanks for the follow-up, Ken. We've been in contact with the Chakra folks and are working to get that scenario enabled.

Initial investigation shows that PROT_NONE has special handling on native Linux, though it is not well documented (e.g. see the OpenJDK comment here: https://lwn.net/Articles/627557/). Basically, PROT_NONE just reserves address space, which typically has low resource consumption compared to providing backing for the memory. On the NT side this is similar to using MEM_RESERVE followed by MEM_COMMIT, and we need to translate this correctly inside of WSL.
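
For illustration, here is a short C sketch of that reserve/commit pattern (my own example, not WSL or ChakraCore code): the large PROT_NONE mapping only claims address space, and individual ranges are made usable later by changing their protection, roughly analogous to MEM_RESERVE followed by MEM_COMMIT on NT. On native Linux the PROT_NONE reservation itself is typically not charged against overcommit accounting; only the ranges later made writable are.

/* Sketch of the reserve-then-commit pattern described above (an illustrative
 * example, not WSL or ChakraCore code). Assumes a 64-bit build. */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t reservation = 34359738368ULL;   /* 32 GiB, as in the strace above */
    size_t window      = 16 * 4096;        /* the small part we actually use */

    /* "Reserve": a PROT_NONE private mapping claims address space but, on
     * native Linux, is typically not charged against overcommit accounting. */
    void *base = mmap(NULL, reservation, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) { perror("reserve"); return 1; }

    /* "Commit": make a small window readable/writable; only this range is
     * charged and backed on demand. */
    if (mprotect(base, window, PROT_READ | PROT_WRITE) != 0) {
        perror("commit");
        return 1;
    }

    ((char *)base)[0] = 1;  /* touch the committed window */
    printf("reserved %zu bytes at %p, committed %zu\n", reservation, base, window);
    return 0;
}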

@therealkenc (Collaborator, Author) commented Feb 1, 2017

[edit] Not weird (nevermind, duh)

[edit] Still curious: any idea why I'm hitting the problem to begin with, given that I have 64GB of physical RAM and am not over-committing in the first place?

No longer curious. Heuristic (i.e. '0') means whatever you want a "seriously wild allocation" to mean. Got it. If there is any consolation in all this, I can now use "Rusty Russell score of -4" in casual conversation.

@stehufntdev (Collaborator) commented Feb 1, 2017

Are you running as root? If not, you need to do sudo sh -c "echo '2' >> /proc/sys/vm/overcommit_memory" (e.g. http://stackoverflow.com/questions/84882/sudo-echo-something-etc-privilegedfile-doesnt-work-is-there-an-alterna).

@therealkenc (Collaborator, Author) commented Feb 1, 2017

Yeah; I was just being an idiot re: root. Finger muscle memory failed me for some reason.

I also have to apologize for dissing the Chakra guys. I only just figured out that allocating a flat memory space up front is kind of standard operating procedure. V8 just happens to be more dynamic about it (or at least less aggressive about it).

@benhillis (Member):

@therealkenc - it seems to be a common pattern on Linux to make absurdly large sparse memory allocations. We don't handle these very well currently, but better support is on our backlog.

@therealkenc (Collaborator, Author):

@stehufntdev - Might as well ping all of these. Also confirmed this is working in 15046, though it's not noted in the 15042 release notes. I really appreciate all of these being addressed!

@stehufntdev (Collaborator):

Thanks for the confirmation, will close this out!
