Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io.js 3.0.0 memory leak #2308

Closed
novacrazy opened this issue Aug 5, 2015 · 69 comments
Closed

io.js 3.0.0 memory leak #2308

novacrazy opened this issue Aug 5, 2015 · 69 comments
Labels
buffer Issues and PRs related to the buffer subsystem. confirmed-bug Issues with confirmed bugs. memory Issues and PRs related to the memory management or memory footprint. stream Issues and PRs related to the stream subsystem.

Comments

@novacrazy
Copy link

See Unitech/pm2#1500 for the bug report filed with PM2 before I realized it was being caused by io.js.

With io.js 2.4.0 and 2.5.0 there were no memory leaks for that module. Perhaps it could have something to do with the new Buffer implementation?

@trygve-lie
Copy link

I've also got an app I'm test running at 3.0.0 and after 4 hours it uses the double amount of memory as it used on 2.5.0. And it seems to be climbing on 3.0.0.

@jbergstroem
Copy link
Member

Perhaps @ChALkeR, @shigeki or @jasisk could do some testing?

@shigeki
Copy link
Contributor

shigeki commented Aug 5, 2015

@jbergstroem Sorry, I'm just getting on board to SFO now.

@ofrobots
Copy link
Contributor

ofrobots commented Aug 5, 2015

Is there a test-case that be used to reproduce the memory leak?

@novacrazy
Copy link
Author

Not that I can think of, unless you already have a system running PM2. Up until the server machine crashes there are no error messages or logs to track it down with. It is most certainly an issue only on io.js 3.0.0, though. I reverted back to 2.5.0 and memory usage has been stable for almost 10 hours now.

@rvagg
Copy link
Member

rvagg commented Aug 5, 2015

fwiw you can't assume that high memory usage and growing early memory usage are a memory leak, the way V8 handles memory and GC changes over time so you may possibly be dealing with a "normal" memory profile. It'd be good to have data for a longer run.

@ofrobots
Copy link
Contributor

ofrobots commented Aug 5, 2015

Based on the original report in Unitech/pm2#1500, it seems that the growth rate is 30MB/min and the app runs out of memory in a couple of hours. This does smell like a memory leak, but It would be very hard to debug this, or to know if the memory leak is actually in io.js, without a test case that shows the problem. At the least, can you grab and compare heap-snapshots and see what kinds of objects are growing?

@novacrazy
Copy link
Author

Well, I changed nothing other than upgrading io.js, and tried out both old, current and pre-release versions of PM2. I even tried out bumping down my dependencies from the last time I upgraded those.

When viewed with htop, I saw 500+ MB (kept growing until the server crashed) for each process with 3.0.0, versus about 40-60MB stable per process with 2.5.0. I wouldn't call a 1000% memory increase with up to 2MB per second continuous growth at idle a "normal" curve.

Perhaps it is completely normal if the bug is in PM2 instead, and previous versions' GC still managed to collect it somehow while 3.0.0 doesn't. My actual application code and processes seems unaffected; only the PM2 processes have such high memory usage. The PM2 processes also handle all the cluster and load balancing stuff, which could be more affected by the new Buffer implementation than my code is. I honestly don't know, but I'm willing to give you any data I can possibly provide.

@jbergstroem
Copy link
Member

To people experiencing the same issue: The best way to drive this issue forward is more metrics over time and a reduced proof of concept we all can test.

@novacrazy
Copy link
Author

I'm sorry, but I honestly have no idea how to take a heap snapshot of the individual PM2 daemons.

@jbergstroem
Copy link
Member

@novacrazy try importing heapdump and send it a unix signal.

@novacrazy
Copy link
Author

Alright, I injected require('heapdump'); into Common.js so all the daemons should have it. Now I just need to wait a bit for things to accumulate and for my rural internet connection to download the files. I'll post the (probably massive) results in a little while.

@ofrobots
Copy link
Contributor

ofrobots commented Aug 5, 2015

Before you upload the heapdump, be aware that it contains the full JavaScript heap and may include private information (such as user data, keys, etc.) and the source code of your JavaScript functions.

@novacrazy
Copy link
Author

@ofrobots Very good advice, thank you.

Also, @bnoordhuis, heapdump fails to compile with io.js 3.0.0: http://pastebin.com/RVnY5Mrs

Not sure if that is my fault or part of the last V8 upgrade.

@bnoordhuis
Copy link
Member

I just closed out a similar issue 30 seconds ago. :-)

I don't think that's an issue with node-heapdump but with V8. It's a C++11 project and you'll need at least gcc 4.8 or clang 3.4. You could get away with older versions until now but no more.

@novacrazy
Copy link
Author

Ah, my gcc was at 5.1 but g++ was still at 4.6 for some reason. Fixed now, thanks. Continuing on...

@novacrazy
Copy link
Author

It's going much slower now that I'm actually looking at it. Go figure.

However, I'm not really seeing any net gain in JavaScript memory usage. Although the io.js 3.0.0 process is already about four times as large as the 2.5.0 version, the observable V8 heap sizes are the same.

@novacrazy
Copy link
Author

I let the process grow to about 450MB before I killed it, and took another heap snapshot before I did. The heap was actually smaller than it was when the process was around 150MB.

In all cases (I took about ten snapshots), the observable V8 heaps of the PM2 daemons never exceeded 11MB, so the memory must be allocated somewhere else.

@JerrySievert
Copy link

@rvagg, @bnoordhuis do you have some sort of writeup/discussion available as to what you did to move away from ExternalArrayType as part of this transition? it would be extremely helpful for other projects trying to move past v8 4.3.x, such as PLV8.

thanks!

@bnoordhuis
Copy link
Member

@JerrySievert Use typed arrays basically. If you look at the changes to src/node.cc in 70d1f32 (warning: big commit), you'll see that we swapped out v8::Object::SetIndexedPropertiesToExternalArrayData() with v8::ArrayBuffer + v8::Uint32Array.

@JerrySievert
Copy link

thanks @bnoordhuis, that helps a lot.

@silverwind
Copy link
Contributor

Looks I've just encounted this. Test case and details are in the issue over at mscdex/busboy#92.

I'm thinking it must be stream/gc related and I can confirm that in io.js 2.5.0, the process stays at around 60 MB as opposed to 3.0.0 which bloats up to the uploaded file size, seemingly making v8 unable to garbage collect any of it.

@silverwind silverwind added confirmed-bug Issues with confirmed bugs. stream Issues and PRs related to the stream subsystem. v8 engine Issues and PRs related to the V8 dependency. labels Aug 8, 2015
@silverwind silverwind changed the title Massive memory leaks for io.js 3.0.0 and PM2 module io.js 3.0.0 memory leak Aug 8, 2015
@silverwind
Copy link
Contributor

Heap snapshots before and after the leak. Not sure what I'm looking for in these, except it seems to have grown quite a bit on arrays.

@bnoordhuis
Copy link
Member

Looks alright to me, no leaks. If there is a memory leak of some kind and it's not some GC quirk, it's probably outside the JS heap.

@silverwind Quick check: what happens when you pass --max_old_space_size=32 on the command line? Does it run out of memory? What does valgrind --leak-check=yes --show-reachable=yes iojs app.js print?

If there is a way to easily reproduce it, that would be appreciated.

@silverwind
Copy link
Contributor

@bnoordhuis should be pretty easy to reproduce with this:

wget https://gist.githubusercontent.com/silverwind/10f076397348a64a72a5/raw/4d9b3d0a0f43a8ca85620822ac755fd40f25fae6/busboy-memory-test.js
npm i busboy
iojs busboy-memory-test

Upload a 1GB file in a second terminal:

dd if=/dev/zero of=bigfile bs=1 count=0 seek=1073741824
curl -F filedata=@bigfile http://localhost:6666/upload

The script will write ./upload. The file's size should stay resident in iojs' memory after the request is done. In fact, every upload increases the memory of the process by the amount uploaded.

bnoordhuis added a commit to bnoordhuis/io.js that referenced this issue Aug 13, 2015
Fixes: nodejs#2308
PR-URL: nodejs#2352
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
factoidforrest added a commit to factoidforrest/PM2 that referenced this issue Aug 13, 2015
bnoordhuis added a commit that referenced this issue Aug 17, 2015
In a few places dynamic memory was passed to the Buffer::New() overload
that makes a copy of the input, not the one that takes ownership.

This commit is a band-aid to fix the memory leaks.  Longer term, we
should look into using C++11 move semantics more effectively.

Fixes: #2308
PR-URL: #2352
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
bnoordhuis added a commit that referenced this issue Aug 17, 2015
The circular dependency problem that put them there in the first place
is no longer an issue.  Move them out of the public node_buffer.h header
and into the private node_internals.h header.

Fixes: #2308
PR-URL: #2352
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
bnoordhuis added a commit that referenced this issue Aug 17, 2015
Rename the three argument overload of Buffer::New() to Buffer::Copy()
and update the code base accordingly.  The reason for renaming is to
make it impossible to miss a call site.

This coincidentally plugs a small memory leak in crypto.getAuthTag().

Fixes: #2308
PR-URL: #2352
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
bnoordhuis added a commit that referenced this issue Aug 17, 2015
Fixes: #2308
PR-URL: #2352
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
@ChALkeR ChALkeR added the memory Issues and PRs related to the memory management or memory footprint. label Aug 17, 2015
@silverwind
Copy link
Contributor

While the major leak will be fixed in the upcoming 3.1.0, I think there's still a few minor leaks around. Please post test cases if you have any!

@rvagg
Copy link
Member

rvagg commented Aug 18, 2015

screen shot 2015-08-18 at 5 33 19 pm

Current master (actually https://iojs.org/download/nightly/v3.0.1-nightly201508173645dc62ed/) looks pretty good, the https test is showing a possible very slow leak, it's creeping up really slowly but I'm not confident calling it a leak. Will leave it going and see what happens but the main leak(s) appear to be resolved.

@ChALkeR
Copy link
Member

ChALkeR commented Aug 18, 2015

@rvagg

the https test is showing a possible very slow leak

Just in case: could you confirm that there is no mistype? The 3.0.1 memory usage on your graph looks constant with https-ping (the bottom one) and very slightly growing over time with http-ping (the upper one) to me.

@rvagg
Copy link
Member

rvagg commented Aug 18, 2015

@ChALkeR sorry, you're absolutely correct, it's the http-ping that appears to be leaking while https-ping is stable, which is very strange!

http-ping stabilises at around 75k and then gradually, but steadily, adds 10k more by the end of the graph.

https-ping fluctuates between 71k and 72k for the entire length.

@rvagg
Copy link
Member

rvagg commented Aug 18, 2015

@kzc
Copy link

kzc commented Aug 18, 2015

it's the http-ping that appears to be leaking while https-ping is stable

http-ping has reached steady state in the last 6 hours.

It's probably just memory arena fragmentation or the new v8 engine performing GC less aggressively in this low memory test. It appears to have levelled off.

@rvagg - can you provide a link to your RSS ping test source code?

@silverwind
Copy link
Contributor

@kzc see #2423

On a sidenote, my test case with FormData was just invalidated, the issue seems to have been a unconsumed stream (https://gist.github.com/silverwind/54b3829142f93d1127bc#gistcomment-1552967)

@rvagg
Copy link
Member

rvagg commented Aug 18, 2015

ditto, and see https://github.com/rvagg/node-memtest/tree/master/test directly for source of the tests, rss is collected using ps so it works on osx and linux: https://github.com/rvagg/node-memtest/blob/master/memtest.sh#L50

@rvagg
Copy link
Member

rvagg commented Aug 19, 2015

screen shot 2015-08-20 at 9 05 17 am

Memory usage settled right down to stable in 3.0.1 (nightly), so call off the dogs on that one.

@silverwind
Copy link
Contributor

I'll close this one per the graph above. We don't have any more indicators for more major leaks like the one in 3.0.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. confirmed-bug Issues with confirmed bugs. memory Issues and PRs related to the memory management or memory footprint. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

No branches or pull requests