[201803][monit] Restart rsyslog service if rsyslogd consumes > 800 MB memory #2963
@@ -268,6 +268,10 @@ check system $HOST
if memory usage > 50% for 5 times within 10 cycles then alert
if cpu usage (user) > 90% for 5 times within 10 cycles then alert
if cpu usage (system) > 90% for 5 times within 10 cycles then alert
check process rsyslog with pidfile /var/run/rsyslogd.pid
/var/run/rsyslogd.pid
How about the rsyslog processes inside docker? Do they matter?
We have not seen the rsyslogd memory leak occur in an rsyslogd process inside any Docker container. The assumption is that those rsyslogd processes carry a very light load, whereas the rsyslogd process in the host image also acts as the rsyslog server for all of those processes, so it handles a much higher volume of messages.
rsyslog running inside a container is better managed from within the container itself, for example by using superlance with supervisord, or by using a container option to limit the memory consumption of the whole container.
check process rsyslog with pidfile /var/run/rsyslogd.pid
start program = "/bin/systemctl start rsyslog.service"
stop program = "/bin/systemctl stop rsyslog.service"
if totalmem > 800 MB for 5 times within 10 cycles then restart
restart
Do we need to keep a restart counter somewhere?
Good question. What is the log message for such a restart? We can search the syslog for such cases.
In each cycle in which monit detects that the memory has exceeded the threshold, it will log the following:
ERR monit[607]: 'rsyslog' total mem amount of 1.6 GB matches resource limit [total mem amount>800.0 MB]
And if it meets the criteria (5 of these within 10 cycles), it will log the following when it attempts to restart the service:
INFO monit[607]: 'rsyslog' trying to restart
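As a rough illustration of using the syslog as the restart counter, the two messages quoted above could be tallied with a small script. This is only a sketch: the regular expressions are derived solely from the two sample lines in this thread, and the function name `count_monit_events` is hypothetical. Other monit versions or log formats may need different patterns.

```python
import re
from collections import Counter

# Patterns inferred from the two sample messages quoted in this thread (assumption).
LIMIT_RE = re.compile(
    r"monit\[\d+\]: '(?P<svc>[^']+)' total mem amount of .* matches resource limit"
)
RESTART_RE = re.compile(r"monit\[\d+\]: '(?P<svc>[^']+)' trying to restart")


def count_monit_events(lines):
    """Count threshold matches and restart attempts per service.

    Returns a Counter keyed by (service, "limit") and (service, "restart").
    """
    counts = Counter()
    for line in lines:
        match = RESTART_RE.search(line)
        if match:
            counts[(match.group("svc"), "restart")] += 1
            continue
        match = LIMIT_RE.search(line)
        if match:
            counts[(match.group("svc"), "limit")] += 1
    return counts
```

The function takes any iterable of lines, so an open syslog file can be passed to it directly; the path to that file depends on the system.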
Get this change into the 201811 branch until we have a better memory resource monitor/mitigation in place.
Configure monit to monitor the resident memory consumption of rsyslogd. If memory usage is > 800 MB for 5 out of 10 checks (2-minute cycle interval, so 10 out of 20 minutes), restart the rsyslog service, because rsyslogd is most likely leaking memory.
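To make the "5 times within 10 cycles" condition concrete, here is a simplified model of it as a sliding window over per-cycle memory samples. This is a sketch of the rule as described above, not monit's implementation: the function name `first_restart_cycle` and its parameters are hypothetical, and what monit does with its counters after a restart is not modeled.

```python
from collections import deque


def first_restart_cycle(samples_mb, threshold_mb=800, times=5, cycles=10):
    """Return the 0-based cycle index at which a restart would first trigger.

    A restart triggers once the threshold has been exceeded in at least
    `times` of the most recent `cycles` samples. Returns None if it never does.
    """
    window = deque(maxlen=cycles)  # keeps only the last `cycles` results
    for index, mem_mb in enumerate(samples_mb):
        window.append(mem_mb > threshold_mb)
        if sum(window) >= times:
            return index
    return None
```

With a 2-minute cycle, five consecutive samples above 800 MB would trigger on the fifth cycle (about 10 minutes in), while intermittent spikes need five matches inside a 20-minute window.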