Grafana reports Fluentd memory leak #85
Hi,

Environment:
We decided to set a memory limit for fluentd, but in practice it just makes fluentd get killed by OOM more frequently. Unfortunately, after that we ran into another issue: after each OOM kill of the Fluentd container there was a very high CPU/IO spike (~50k IOPS) on the node. We didn't spot any suspicious activity near the Fluentd containers or the log pipeline, but we suspect that usage grows much faster when Fluentd is logging non-ASCII characters. Our only solution so far is a graceful restart of the fluentd Docker containers each night.
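For reference, the kind of memory cap and nightly restart described above can be expressed with plain kubectl. This is only a sketch: the namespace, DaemonSet name, label selector, and memory values below are placeholders, not the actual Deis Workflow object names.

```sh
# Sketch only: cap fluentd's memory so the OOM killer restarts just that container
# instead of starving the node. All names and values are placeholders.
kubectl --namespace <logging-namespace> set resources daemonset <fluentd-daemonset> \
  --requests=memory=128Mi --limits=memory=256Mi

# "Graceful restart each night": delete the pods and let the DaemonSet recreate them,
# e.g. from a nightly cron entry on an admin machine.
kubectl --namespace <logging-namespace> delete pods -l <fluentd-pod-label>
```

Deleting only the pods, rather than the DaemonSet itself, keeps the log collection configuration in place while forcing fresh Fluentd processes.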
I've been trying to google around to see if there are any open issues against fluentd or the plugins we have installed that might be causing this problem, but I'm not having any luck.
@jayjun is the fluentd pod that is having the issues on the same node that is also hosting nsq?
@jchauncey Nope, nsqd is on my other node. Here are all the containers on the same node as the errant fluentd:
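As a general aside, a quick way to see everything scheduled on a particular node is shown below; the node name is a placeholder.

```sh
# All pods on a given node, across namespaces; <node-name> is a placeholder.
kubectl get pods --all-namespaces -o wide | grep <node-name>

# Newer kubectl versions can filter server-side instead:
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name>
```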
Some updates. FYI, no one touched this cluster in the last 7 days.

**Fluentd**
Memory leaks grow linearly for all Fluentd pods, not just one. My app was deployed on Feb 13; that's when the slopes changed. So it is somehow related to deployments.

**Others**
I can't spot any correlation with other pods, except for Workflow Manager. Not sure if they're related. Memory usage has also [oddly] flatlined since 2 days ago. I've gracefully restarted
@mboersma There's a memory fix in Fluentd v0.14.13.
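As a sanity check after upgrading, one way to confirm which Fluentd version a pod actually runs is sketched below; the pod name and namespace are placeholders.

```sh
# Placeholders for namespace and pod name; prints e.g. "fluentd 0.14.13".
kubectl --namespace <logging-namespace> exec <fluentd-pod> -- fluentd --version
```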
Excellent, certainly worth a try. I'll go revise #87.
That's awesome, @mattk42
I think this can be closed now. |
Just a week in production with one Deis app. Memory usage grew exponentially from 100 MB to around 380 MB. Interestingly, only one pod leaks.
Very low-volume site, and certainly no runaway logs from my app. However, nsqd does log like a madman.
Created an issue so others may post similar findings.