Allow configure option to link to tcmalloc/jemalloc #17007
The reason we've never done this so far is that it introduces incompatibilities between node and add-ons. A pointer allocated in A can't be freed from B unless A and B use the same allocator. If node is linked against jemalloc and an add-on against libc, things go boom. Likewise, even when an add-on picks up the allocator from the node binary but is linked against a library that doesn't: boom! That means this PR is only useful in custom builds and even then only with restrictions. The help options should mention that.
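To make the failure mode above concrete, here is a minimal sketch under stated assumptions. Real allocators do not check who owns a pointer, so freeing a foreign pointer is undefined behavior rather than a clean error. The toy `TaggedAllocator` below is hypothetical and is not how jemalloc, tcmalloc or libc work internally; it only tags each block with its owner so that the mismatch becomes visible instead of crashing. `NodeAllocator` and `AddonAllocator` are made-up names standing in for "node linked against jemalloc" and "an add-on linked against libc".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical header placed in front of every block. alignas keeps the
// user pointer (header + 1) suitably aligned for any object type.
struct alignas(std::max_align_t) BlockHeader {
  std::uint32_t owner_tag;
  std::size_t size;
};

// Toy allocator that remembers which allocator produced a block.
template <std::uint32_t Tag>
struct TaggedAllocator {
  static void* allocate(std::size_t n) {
    auto* h = static_cast<BlockHeader*>(std::malloc(sizeof(BlockHeader) + n));
    if (h == nullptr) return nullptr;
    h->owner_tag = Tag;
    h->size = n;
    return h + 1;  // user memory starts right after the header
  }

  // Returns true if the block was ours and has been released,
  // false if it came from a different allocator (nothing is freed then).
  static bool release(void* p) {
    if (p == nullptr) return true;
    auto* h = static_cast<BlockHeader*>(p) - 1;
    if (h->owner_tag != Tag) return false;
    h->owner_tag = 0;
    std::free(h);
    return true;
  }
};

// Assumed stand-ins for the two sides of the boundary.
using NodeAllocator = TaggedAllocator<0x4E4F4445u>;
using AddonAllocator = TaggedAllocator<0x4144444Eu>;

// Usage sketch: a pointer that crosses the boundary must go back to its owner.
inline void demo_allocator_mismatch() {
  void* p = NodeAllocator::allocate(100);
  assert(p != nullptr);
  assert(!AddonAllocator::release(p));  // wrong allocator: rejected
  assert(NodeAllocator::release(p));    // owning allocator: freed
}
```

With real allocators the rejected call would not return `false`; it would corrupt the heap or crash at some later, unrelated point.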
@bnoordhuis I will close this. @uttampawar said he wants to add performance results to it. |
Here are results from using jemalloc vs. the default libc malloc on node master (git 6aac05b). The results are from a run on a Skylake server core with Ubuntu 16.10 and kernel 4.9.0-rc8.
@bnoordhuis That's how I executed it. Maybe something changed in the compare.R output. I can dump the raw CSV into another gist if you would like to test it out.
ab5a3d3 to f79310b
The jemalloc and tcmalloc memory allocators provide better performance on small allocations. Note that the comments in the help section clearly warn the user that, when linking against jemalloc/tcmalloc, every allocation and its matching free() must route to the same allocator to prevent a crash. GH issue nodejs#17007 has the performance report attached for jemalloc.
f79310b to e7473ed
@sathvikl Looking at your results, some benchmarks seem to benefit from the change and others regress because of it. Is there really a strong benefit? Especially given the mentioned downside, which is a significant blocker as far as I can see.
@BridgeAR The benchmark runs are quite unreliable. However, the +/- 10% deltas in some cases were consistent across runs. I would suggest keeping it as a configure option so that users can easily experiment with their own workloads if they choose to. The effect on back-end API service workloads is what I was targeting, but we don't have a good proxy workload for that scenario.
@sathvikl I personally lean towards a -0 on this. So far no other collaborator has really gone for it, so I guess this is not going to land. I am therefore going to close the PR. Thanks a lot for your work nevertheless! It is much appreciated! If you feel strongly about this PR and would still like to get it landed, please leave a comment. In that case I will reopen it and escalate it to the TSC to get a decision.
@BridgeAR We can always at least ping @nodejs/collaborators in these cases and see whether anybody has an opinion. This looks good to me code-wise, but I share Ben’s concern about the changed addons API here. |
I have been searching for patterns where pointers allocated in node's C/C++ layer escape into native add-ons and vice versa, but could not find any. Moreover, aren't the param values shuttled between JS and add-ons almost always either JS objects or scalar types? So to me, the core layer and the add-on layer each operate in their own space in terms of self-managed memory, and this change therefore seems safe. @bnoordhuis, @addaleax - can you please clarify if this is not the case?
Reopening due to the comment from @gireeshpunathil |
Note that, at least on Linux, it is already possible to use alternative memory allocators using the I also share Ben's concerns. There is at least one V8 API that transfers ownership of a returned pointer allocated by V8 to the caller. An addon using a different allocator would not work well with such an API. |
thanks @ofrobots .
This may also be useful to work around issues like #8871 |
@gireeshpunathil Like Ben said, an
char* foo = malloc(100);
Buffer::New(isolate, foo, 100);
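The snippet above hands a `malloc`'ed pointer to an API that then owns it, so the eventual release happens on the other side of the boundary. One way to sidestep a mismatch, sketched here under assumptions, is to pass a release function along with the pointer so that the memory always goes back to the allocator that produced it. As far as I know, node's buffer API has an overload that accepts a free callback of a similar shape, but the sketch does not use node's or V8's headers: `OwnedBuffer`, `AddonFree` and `DemoOwnedBuffer` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// The creator of the memory supplies the function that knows how to release it.
using FreeCallback = void (*)(char* data, void* hint);

// Hypothetical stand-in for an API that takes ownership of a raw pointer.
// It never calls free() itself; it only invokes the callback it was given.
class OwnedBuffer {
 public:
  OwnedBuffer(char* data, std::size_t length, FreeCallback cb, void* hint)
      : data_(data), length_(length), cb_(cb), hint_(hint) {}
  OwnedBuffer(const OwnedBuffer&) = delete;
  OwnedBuffer& operator=(const OwnedBuffer&) = delete;
  ~OwnedBuffer() {
    if (data_ != nullptr && cb_ != nullptr) cb_(data_, hint_);
  }
  std::size_t length() const { return length_; }

 private:
  char* data_;
  std::size_t length_;
  FreeCallback cb_;
  void* hint_;
};

// Lives on the add-on side, so it uses the same free() as the add-on's malloc().
// The hint is assumed to point at an int counter, purely so the demo can observe the call.
inline void AddonFree(char* data, void* hint) {
  std::free(data);
  if (hint != nullptr) ++*static_cast<int*>(hint);
}

// Returns how many times the add-on's release function ran (-1 if malloc failed).
inline int DemoOwnedBuffer() {
  int frees = 0;
  {
    char* foo = static_cast<char*>(std::malloc(100));
    if (foo == nullptr) return -1;
    OwnedBuffer buf(foo, 100, AddonFree, &frees);
    assert(buf.length() == 100);
  }  // buf goes out of scope here and calls AddonFree exactly once
  return frees;
}
```

This only covers pointers the add-on itself allocates. It does not help with pointers allocated by node or V8 and handed to the add-on, which is the other direction discussed in this thread.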
thanks @addaleax for the explanation. So essentially it means interoperability is guaranteed only if one of:
is true. @sathvikl - can you validate the second point? |
That's never the case. |
thanks @bnoordhuis . I was under the impression that every malloc implementation stores the number of bytes it allocated that leaves us with only one option (if we want to go ahead with this PR) of documenting this limitation around the new build configurations with Given that:
are we good with the documentation approach? |
There is another issue that should be addressed: Lines 2684 to 2689 in 95a35bc
That's just one example, there are probably more, and openssl may not be the only library either. We can either disable those apis when linking against jemalloc/tcmalloc, or make configure print a warning or abort. A warning might be too easy to miss. This issue is a lot more subtle: it builds just fine but blows up at runtime, but not in a predictable manner. |
I think this is the right problem to go solve. IMO |
Thanks all for the input - with so many ifs and buts, it looks like this PR cannot be realized without breaking runtime consistency.
@gireeshpunathil can we close the PR now? |
I guess so - yes. |
I want to thank @sathvikl for their creative thought and the effort behind this PR - unfortunately the interoperability is a challenge! |
Thank you @gireeshpunathil and everyone for your thoughts on this. I believe the hardest cases to handle would be those where npm/yarn install builds a module according to that module's own settings.
This feels strange to me: this is the feature I have been looking for in OpenJDK for years.
Hi, Ben. What does this mean? If I load |
That comment wasn't about LD_PRELOAD but about building node with jemalloc. |
Both memory allocators provide better performance on small
allocations.
Provide user options to link against jemalloc or tcmalloc.
Both are known to provide better performance for small memory
allocations. The impact on long-running apps must be evaluated before enabling either by default in production builds of the node binary.
Earlier versions of tcmalloc caused fragmentation, which increased the physical memory actually consumed for the same number of bytes allocated.
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes
Affected core subsystem(s)
Build