libgearman serious memory leak #163
Comments
@sinkingsun2 Could you upload your valgrind logs and/or give instructions on how to reproduce? Thanks! Also, the latest release is 1.1.18. Have you tried that? |
You added that free before the Line 325 in e2d76cf
|
Also, what compiler (including the version number) are you using? |
No, I didn’t try 1.1.18. Judging from the release notes, it doesn’t contain
many changes, so I don’t believe it fixes the memory leak.
Some logs I shared with my manager:
==24006== 8,644,802 (16 direct, 8,644,786 indirect) bytes in 1 blocks are definitely lost in loss record 27 of 28
==24006== at 0x4C29C23: malloc (vg_replace_malloc.c:299)
==24006== by 0x4E6AA18: nm_malloc (nm_alloc.c:27)
==24006== by 0x4E6F5B9: add_object_to_objectlist (objectlist.c:26)
==24006== by 0x17962387: ???
==24006== by 0x1795FF18: ???
==24006== by 0x17BBEAF6: ???
==24006== by 0x17BC7313: ???
==24006== by 0x1796011F: ???
==24006== by 0x5F2CE24: start_thread (in /usr/lib64/libpthread-2.17.so)
==24006== by 0x59F4BAC: clone (in /usr/lib64/libc-2.17.so)
==24006==
==24006== 25,567,144 bytes in 40,333 blocks are definitely lost in loss record 28 of 28
==24006== at 0x4C29C23: malloc (vg_replace_malloc.c:299)
==24006== by 0x17BC1679: ???
==24006== by 0x17BC1A97: ???
==24006== by 0x17BB7863: ???
==24006== by 0x17BB7A13: ???
==24006== by 0x17BB8F4A: ???
==24006== by 0x17959908: ???
==24006== by 0x17961108: ???
==24006== by 0x4E69F09: neb_invoke_callback (nebmods.c:606)
==24006== by 0x4E69F09: neb_make_callbacks_full (nebmods.c:636)
==24006== by 0x4E69FB3: neb_make_callbacks (nebmods.c:671)
==24006== by 0x4E4A93E: broker_service_check (broker.c:282)
==24006== by 0x4E4F276: run_scheduled_service_check (checks_service.c:215)
==24006==
==24006== LEAK SUMMARY:
==24006== definitely lost: 25,567,393 bytes in 40,366 blocks
==24006== indirectly lost: 8,644,786 bytes in 72,639 blocks
==24006== possibly lost: 1,845 bytes in 3 blocks
==24006== still reachable: 212,113 bytes in 20 blocks
==24006== suppressed: 0 bytes in 0 blocks
==24006== Reachable blocks (those to which a pointer was found) are not shown.
94.14% (244,741,344B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->59.43% (154,513,632B) 0x82C2678: ???
| ->59.43% (154,513,632B) 0x82C2A96: ???
| | ->59.43% (154,513,632B) 0x82B8862: ???
| | ->59.43% (154,513,632B) 0x82B8A12: ???
| | ->59.43% (154,513,632B) 0x82B9F49: ???
| | ->59.43% (154,513,632B) 0x805A907: ???
| | ->59.43% (154,513,632B) 0x8062107: ???
| | ->59.43% (154,513,632B) 0x4E60F08: neb_make_callbacks_full (nebmods.c:606)
| | ->59.43% (154,513,632B) 0x4E60FB2: neb_make_callbacks (nebmods.c:671)
| | ->59.43% (154,513,632B) 0x4E4193D: broker_service_check (broker.c:282)
Ed Sabol <notifications@github.com> wrote on Friday, August 10, 2018 at 11:59 AM:
… @sinkingsun2 <https://github.com/sinkingsun2> Could you upload your
valgrind logs and/or give instructions on how to reproduce? Thanks!
Also, the latest release is 1.1.18. Have you tried that?
|
I don’t think the compiler version matters: the memory is clearly allocated,
and there is no code that frees it.
The placement is correct: before the if statement.
|
This might be an issue in Mod-Gearman. We usually run Mod-Gearman with libgearman 0.35, and I haven't had time to test any newer releases successfully. So if something in the client's memory handling has changed, which isn't unlikely, then Mod-Gearman needs to be patched in order to run smoothly with recent libgearman versions. |
Yeah, mod-gearman, naemon, and gearmand all invoke each other, so it's hard to tell which one causes the problem and where the best place to fix it is. But if you just download gearmand-1.1.17, mod_gearman-3.0.6, and naemon-1.0.8, you should be able to reproduce the memory leak. In my case, after I added that free call in add.cc, the leak went away. |
If anyone could reproduce this with just gearmand and some minimal clients and workers, we’d know for sure that the fix should be here in gearmand. As it is, it’s not clear, I think? |
Hi! Please submit a PR with your fix. I know it's just a one-liner, but I'd like to have your name on the commits and your analysis in the commit message. Thanks so much for reporting the issue so we can get it fixed. |
gearman#163: Looking at the code, task->send.data appears to come from libgearman/packet.cc line 90: packet->data= gearman_malloc(*packet->universal, arg_size); (I am not 100% sure, because it has been 20 days and I didn't log my analysis; I did my best to search the code for what looked familiar.) If you look for the code that frees packet->data, there is none. That's why I added a free here.
#175
thank you.
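For readers following the analysis above, here is a minimal sketch of the ownership pattern being described: a buffer is allocated for a task's send packet, and the task's cleanup path never releases it. All names here (`demo_task`, `demo_packet`, `counted_malloc`, and so on) are hypothetical stand-ins, not libgearman types or functions, and the sketch assumes the task owns its send buffer, which may or may not hold for every code path in the real library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; NOT libgearman's API. */
static long live_allocations = 0;

/* malloc wrapper that counts blocks currently outstanding. */
static void *counted_malloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL) {
        live_allocations++;
    }
    return p;
}

/* free wrapper that keeps the counter in sync. */
static void counted_free(void *p) {
    if (p != NULL) {
        live_allocations--;
        free(p);
    }
}

static long demo_live_allocations(void) {
    return live_allocations;
}

typedef struct {
    void *data;   /* heap copy of the payload (assumed owned by the task) */
    size_t size;
} demo_packet;

typedef struct {
    demo_packet send;
} demo_task;

/* Allocates the task and a private copy of the payload. */
static demo_task *demo_task_create(const void *payload, size_t size) {
    assert(payload != NULL && size > 0);
    demo_task *task = counted_malloc(sizeof *task);
    if (task == NULL) {
        return NULL;
    }
    task->send.data = counted_malloc(size);
    if (task->send.data == NULL) {
        counted_free(task);
        return NULL;
    }
    memcpy(task->send.data, payload, size);
    task->send.size = size;
    return task;
}

/* Leaky cleanup: releases the task but forgets send.data. */
static void demo_task_free_leaky(demo_task *task) {
    counted_free(task);
}

/* Fixed cleanup: releases the owned buffer before the task itself. */
static void demo_task_free_fixed(demo_task *task) {
    if (task == NULL) {
        return;
    }
    counted_free(task->send.data);
    task->send.data = NULL;
    counted_free(task);
}
```

With the leaky cleanup, one block per task stays outstanding, which is the kind of per-call growth a "definitely lost" record with tens of thousands of blocks would suggest. Whether a bare free at the call site is the right fix depends on who owns the buffer in the real code, so treat this only as an illustration of the reasoning.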
Clint Byrum <notifications@github.com> wrote on Monday, September 17, 2018 at 9:56 PM:
… Hi! Please submit a PR with your fix. I know it's just a one-liner, but
I'd like to have your name on the commits and your analysis in the commit
message. Thanks so much for reporting the issue so we can get it fixed.
|
I used naemon with mod_gearman/gearmand to check about 40k services. The memory usage of the naemon process ramped up from 400M to 80G in 3 days. I used the latest version of everything: gearmand-1.1.17, mod_gearman-3.0.6, and naemon-1.0.8.
I used valgrind to pinpoint the leak to libgearman, whose code is part of gearmand. The naemon process dynamically loads mod_gearman, which is what triggers this serious leak.
I fixed the bug in gearmand-1.1.17/libgearman/add.cc at line 325:
free((void *)(task->send.data));
My C knowledge dates from 15 years ago, so I may not have fixed it in the best way. Please fix it in a later release in whatever way you think is best.