Sysconfig limits and settings are not respected #6215

Closed
akqopensystems opened this issue Apr 9, 2018 · 18 comments · Fixed by #6241
Labels: area/configuration (DSL, parser, compiler, error handling), area/setup (Installation, systemd, sample files), bug (Something isn't working)

@akqopensystems

akqopensystems commented Apr 9, 2018

Expected Behavior

On RHEL/CentOS7, process limits as seen by systemctl show and cat /proc/PID/limits should provide consistent information.

Current Behavior

On RHEL7:

[root@osshplpmo06 ~]# systemctl cat icinga2 
# /usr/lib/systemd/system/icinga2.service
[...]
# /etc/systemd/system/icinga2.service.d/limits.conf
[Service]
LimitNOFILE=50000
[...]
[root@osshplpmo06 ~]# systemctl show -p LimitNOFILE icinga2
LimitNOFILE=50000
[root@osshplpmo06 ~]# cat /proc/55402/limits |grep "Max open"
Max open files            16384                16384                files

This is confusing, as the system administrator can't easily determine the limits in force for Icinga2. We've already discussed this with Red Hat support, and they think it may be related to the "--no-stack-rlimit" option passed on the command line.

Possible Solution

If Icinga2 sets its own limits (to 16384), this should be explicitly documented. Better still, make this setting configurable by the user.

Steps to Reproduce (for bugs)

  1. Create a limits file for the service in /etc/systemd/system/icinga2.service.d/limits.conf:
[Service]
LimitNOFILE=50000
  2. systemctl daemon-reload && systemctl restart icinga2.service
  3. systemctl show -p LimitNOFILE icinga2.service
  4. cat /proc/Icinga2-PID/limits
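
The comparison in the steps above can be scripted. The following is a minimal sketch, assuming the Linux /proc limits table format shown earlier in this report; the function name `max_open_files` is hypothetical and not part of Icinga 2 or systemd.

```shell
#!/bin/sh
# Print the soft and hard "Max open files" limits found in a limits table
# (the format of /proc/PID/limits) read from standard input.
max_open_files() {
    awk '/^Max open files/ { print $4, $5 }'
}

# Example usage (assumes systemd and a running icinga2 process):
#   systemctl show -p LimitNOFILE icinga2.service
#   for p in $(pidof icinga2); do max_open_files < "/proc/$p/limits"; done
```

If the value reported by systemctl differs from the pair printed for each PID, the process is running with limits other than the ones configured in the unit.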

Context

At the moment we are looking into an issue with checks that sometimes forward no performance metrics for Graphite. In order to rule out resource exhaustion, we are checking the configured limits of the Icinga2 processes.

Your Environment

  • Version used (icinga2 --version):
    icinga2 - The Icinga 2 network monitoring daemon (version: r2.8.2-1)

Copyright (c) 2012-2017 Icinga Development Team (https://www.icinga.com/)
License GPLv2+: GNU GPL version 2 or later http://gnu.org/licenses/gpl2.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Application information:
Installation root: /usr
Sysconf directory: /etc
Run directory: /run
Local state directory: /var
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid

System information:
Platform: Red Hat Enterprise Linux Server
Platform version: 7.4 (Maipo)
Kernel: Linux
Kernel version: 3.10.0-693.17.1.el7.x86_64
Architecture: x86_64

Build information:
Compiler: GNU 4.8.5
Build host: unknown

  • Enabled features (icinga2 feature list):
    Disabled features: command compatlog debuglog elasticsearch gelf graphite influxdb livestatus notification opentsdb perfdata statusdata syslog
    Enabled features: api checker mainlog

  • Config validation (icinga2 daemon -C):
    information/cli: Icinga application loader (version: r2.8.2-1)
    information/cli: Loading configuration file(s).
    information/ConfigItem: Committing config item(s).
    information/ApiListener: My API identity: osshplpmo06.xxxx.de
    warning/ApplyRule: Apply rule 'uptime' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 1:0-1:21) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'bacula_file_daemon' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 139:1-139:34) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'agent_icinga-core' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 148:1-148:33) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'available_volume_space_all' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 156:1-156:42) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'load' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 164:1-164:20) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'cluster-zone' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 172:1-172:28) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'icinga2_ido' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 180:1-180:27) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'cpu' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 253:1-253:19) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'ram' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 262:1-262:19) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'hardware-health' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 329:1-329:31) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'uptime' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 338:1-338:22) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'cpu' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 349:1-349:19) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'ram' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 360:1-360:19) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'interface_ethernet0/0_usage' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 371:1-371:43) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'interface_ethernet0/0_errors' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 384:1-384:44) for type 'Service' does not match anywhere!
    warning/ApplyRule: Apply rule 'interface_ethernet0/0_status' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 397:1-397:44) for type 'Service' does not match anywhere!
    information/ConfigItem: Instantiated 1 ApiListener.
    information/ConfigItem: Instantiated 4 Zones.
    information/ConfigItem: Instantiated 3 Endpoints.
    information/ConfigItem: Instantiated 1 FileLogger.
    information/ConfigItem: Instantiated 223 CheckCommands.
    information/ConfigItem: Instantiated 1 IcingaApplication.
    information/ConfigItem: Instantiated 1353 Hosts.
    information/ConfigItem: Instantiated 600 HostGroups.
    information/ConfigItem: Instantiated 2 Downtimes.
    information/ConfigItem: Instantiated 2 ServiceGroups.
    information/ConfigItem: Instantiated 5681 Services.
    information/ConfigItem: Instantiated 1 CheckerComponent.
    information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
    information/cli: Finished validating the configuration file(s).

  • If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes.
    Object 'osshplpmo06.xxxx.de' of type 'Endpoint':
    % declared in '/etc/icinga2/zones.conf', lines 28:1-28:41

    • __name = "osshplpmo06.xxxx.de"
    • host = ""
    • log_duration = 86400
    • name = "osshplpmo06.xxxx.de"
    • package = "_etc"
    • port = "5665"
    • source_location
      • first_column = 1
      • first_line = 28
      • last_column = 41
      • last_line = 28
      • path = "/etc/icinga2/zones.conf"
    • templates = [ "osshplpmo06.xxxx.de" ]
      % = modified in '/etc/icinga2/zones.conf', lines 28:1-28:41
    • type = "Endpoint"
    • zone = ""

Object 'ossnplpmo03.xxxx.de' of type 'Endpoint':
% declared in '/etc/icinga2/zones.conf', lines 11:1-11:41

  • __name = "ossnplpmo03.xxxx.de"
  • host = "46.254.124.61"
    % = modified in '/etc/icinga2/zones.conf', lines 12:9-12:30
  • log_duration = 86400
  • name = "ossnplpmo03.xxxx.de"
  • package = "_etc"
  • port = "5665"
    % = modified in '/etc/icinga2/zones.conf', lines 13:9-13:21
  • source_location
    • first_column = 1
    • first_line = 11
    • last_column = 41
    • last_line = 11
    • path = "/etc/icinga2/zones.conf"
  • templates = [ "ossnplpmo03.xxxx.de" ]
    % = modified in '/etc/icinga2/zones.conf', lines 11:1-11:41
  • type = "Endpoint"
  • zone = ""

Object 'osszplpmo02.xxxx.de' of type 'Endpoint':
% declared in '/etc/icinga2/zones.conf', lines 6:1-6:41

  • __name = "osszplpmo02.xxxx.de"
  • host = "46.254.124.37"
    % = modified in '/etc/icinga2/zones.conf', lines 7:2-7:23
  • log_duration = 86400
  • name = "osszplpmo02.xxxx.de"
  • package = "_etc"
  • port = "5665"
    % = modified in '/etc/icinga2/zones.conf', lines 8:2-8:14
  • source_location
    • first_column = 1
    • first_line = 6
    • last_column = 41
    • last_line = 6
    • path = "/etc/icinga2/zones.conf"
  • templates = [ "osszplpmo02.xxxx.de" ]
    % = modified in '/etc/icinga2/zones.conf', lines 6:1-6:41
  • type = "Endpoint"
  • zone = ""
@dnsmichi
Contributor

dnsmichi commented Apr 9, 2018

You can set these limits in the sysconfig file. See the "advanced" table in this chapter: https://www.icinga.com/docs/icinga2/latest/doc/17-language-reference/#constants

@dnsmichi dnsmichi added area/configuration DSL, parser, compiler, error handling needs feedback We'll only proceed once we hear from you again labels Apr 9, 2018
@Crunsher Crunsher added the area/setup Installation, systemd, sample files label Apr 10, 2018
@Crunsher
Contributor

Our way of doing this may not be standard. For RHEL-specific changes to init scripts and the like, please see https://github.com/Icinga/rpm-icinga2

@akqopensystems
Author

akqopensystems commented Apr 10, 2018

Thanks for the clarification! I think these options would be better documented in the configuration chapter, maybe in a topic "Advanced configuration": https://www.icinga.com/docs/icinga2/latest/doc/04-configuring-icinga-2/
Unfortunately, this does not seem to work as expected. On a test system with only slight differences to production:

[root@ossztlvmo12 icinga2]# cat /etc/sysconfig/icinga2 
DAEMON=/usr/sbin/icinga2
ICINGA2_CONFIG_FILE=/etc/icinga2/icinga2.conf
ICINGA2_RUN_DIR=/run
ICINGA2_STATE_DIR=/var
ICINGA2_PID_FILE=$ICINGA2_RUN_DIR/icinga2/icinga2.pid
ICINGA2_LOG_DIR=/var/log/icinga2
ICINGA2_ERROR_LOG=$ICINGA2_LOG_DIR/error.log
ICINGA2_STARTUP_LOG=$ICINGA2_LOG_DIR/startup.log
ICINGA2_LOG=$ICINGA2_LOG_DIR/icinga2.log
ICINGA2_CACHE_DIR=$ICINGA2_STATE_DIR/cache/icinga2
ICINGA2_USER=icinga
ICINGA2_GROUP=icinga
ICINGA2_COMMAND_GROUP=icingacmd
ICINGA2_RLIMIT_FILES=50000
ICINGA2_RLIMIT_PROCESSES=62883
[root@ossztlvmo12 icinga2]# systemctl cat icinga2
# /usr/lib/systemd/system/icinga2.service
[Unit]
Description=Icinga host/service/network monitoring system
After=syslog.target network-online.target postgresql.service mariadb.service carbon-cache.service carbon-relay.service

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/icinga2
ExecStartPre=/usr/lib/icinga2/prepare-dirs /etc/sysconfig/icinga2
ExecStart=/usr/sbin/icinga2 daemon -d -e ${ICINGA2_ERROR_LOG}
PIDFile=/run/icinga2/icinga2.pid
ExecReload=/usr/lib/icinga2/safe-reload /etc/sysconfig/icinga2
TimeoutStartSec=30m

# Systemd >228 enforces a lower process number for services.
# Depending on the distribution and Systemd version, this must
# be explicitly raised. Packages will set the needed values
# into /etc/systemd/system/icinga2.service.d/limits.conf
#
# Please check the troubleshooting documentation for further details.
# The values below can be used as examples for customized service files.

#TasksMax=infinity
#LimitNPROC=62883

[Install]
WantedBy=multi-user.target
[root@ossztlvmo12 icinga2]# systemctl show icinga2
[...]
LimitCPU=18446744073709551615
LimitFSIZE=18446744073709551615
LimitDATA=18446744073709551615
LimitSTACK=18446744073709551615
LimitCORE=18446744073709551615
LimitRSS=18446744073709551615
LimitNOFILE=4096
[...]
[root@ossztlvmo12 icinga2]# systemctl stop icinga2
[root@ossztlvmo12 icinga2]# systemctl start icinga2
[root@ossztlvmo12 icinga2]# ps -ef |grep icinga2|grep -v plugin
icinga   14980     1  0 12:51 ?        00:00:00 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e $ICINGA2_LOG_DIR/error.log
icinga   14985     1 66 12:51 ?        00:00:04 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e $ICINGA2_LOG_DIR/error.log
[root@ossztlvmo12 icinga2]# cat /proc/14980/limits |grep "Max open"
Max open files            16384                16384                files     
[root@ossztlvmo12 icinga2]# cat /proc/14985/limits |grep "Max open"
Max open files            16384                16384                files

We are using the Icinga2 rpm repository.

@dnsmichi
Contributor

You cannot go lower than the default of 16k open files. That is a sane default and the minimum Icinga 2 requires to run and work.

Read-write. Defines the resource limit for RLIMIT_NOFILE that should be set at start-up. Value cannot be set lower than the default 16 * 1024. 0 disables the setting. Set in Icinga 2 sysconfig.
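
The documented rule quoted above can be illustrated with a short sketch. This only mirrors the wording of the documentation (a floor of 16 * 1024, with 0 disabling the setting); it is not Icinga 2's actual code, and `effective_rlimit_files` is a hypothetical name.

```shell
#!/bin/sh
# Apply the documented RLimitFiles rule to a requested value:
#   0               -> the setting is disabled
#   below 16 * 1024 -> raised to the default of 16 * 1024
#   anything else   -> used as requested
effective_rlimit_files() {
    requested=$1
    default_limit=$((16 * 1024))
    if [ "$requested" -eq 0 ]; then
        echo "disabled"
    elif [ "$requested" -lt "$default_limit" ]; then
        echo "$default_limit"
    else
        echo "$requested"
    fi
}
```

Under this reading, a request of 50000 is above the floor and should be applied unchanged.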

@Mikesch-mp
Contributor

He is referring to ICINGA2_RLIMIT_FILES=50000 set in /etc/sysconfig/icinga2. It should raise the max open files limit to 50k, but it is still at 16k, so icinga2 is ignoring it. I have the same problem on SLES: icinga2 ignores the settings. Even in older versions (2.7.2), setting RLimitFiles in init.conf does not work.

@akqopensystems
Author

Thanks, @Mikesch-mp . Yes, the problem is that we want to increase the maximum number of open files to 50000, but the icinga2 processes ignore this change and stay at the default of 16 * 1024.

@dnsmichi
Contributor

Ah ok, thanks. I shouldn't comment here when I am tired after giving a training. In that case I am out of ideas, and someone needs to reproduce the problem.

@akqopensystems
Author

At the moment, our checks are increasingly late because ICINGA2_RLIMIT_FILES stays fixed at its default. On all checking systems, we run into the file limit from time to time, with service checks delayed by as much as 10 minutes (on a 5-minute schedule).
[image]
The check above was eventually executed with a 9-minute delay. Also, the icinga2 Graphite writer is not able to send the performance metrics to Graphite in this situation:

cat /var/log/icinga2/icinga2.log
[2018-04-12 16:54:50 +0200] critical/GraphiteWriter: Cannot write to TCP socket on host '127.0.0.1' port '2013'.
[2018-04-12 16:55:00 +0200] critical/GraphiteWriter: Cannot write to TCP socket on host '127.0.0.1' port '2013'.
[2018-04-12 16:55:09 +0200] critical/GraphiteWriter: Cannot write to TCP socket on host '127.0.0.1' port '2013'.
[2018-04-12 16:55:20 +0200] critical/GraphiteWriter: Cannot write to TCP socket on host '127.0.0.1' port '2013'.

This leads to large gaps in the Graphite performance graphs:
[image]

Here's a sample number of open files from an icinga2 satellite at the time the checks are late:

[root@xxxxxmo03 ~]# lsof |grep -c icinga
17771
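
A note on measuring this: lsof output can include entries that are not file descriptors (for example memory-mapped files and the working directory), so its line count may overstate usage against RLIMIT_NOFILE, which applies per process. A minimal sketch for counting descriptors per process follows, assuming a Linux /proc layout; `count_open_fds` is a hypothetical helper, and its second argument exists only so the function can be pointed at a scratch directory.

```shell
#!/bin/sh
# Count the file descriptors a process holds by listing /proc/PID/fd.
# $1: PID
# $2: proc root (assumption for testing; defaults to /proc)
count_open_fds() {
    fd_dir="${2:-/proc}/$1/fd"
    if [ ! -d "$fd_dir" ]; then
        echo "no such process: $1" >&2
        return 1
    fi
    # One directory entry per open descriptor.
    ls -A "$fd_dir" | wc -l | tr -d ' '
}

# Example usage (run as root or as the icinga user):
#   for p in $(pidof icinga2); do echo "$p: $(count_open_fds "$p")"; done
```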

@dnsmichi
Contributor

Confirmed, it is a bug. Tested inside the Icinga Vagrant box standalone.

[root@icinga2 ~]# grep -ri files /etc/sysconfig/icinga2
ICINGA2_RLIMIT_FILES=50000

[root@icinga2 ~]# systemctl restart icinga2

[root@icinga2 ~]# for p in $(pidof icinga2); do cat /proc/$p/limits | grep "Max open"; done
Max open files            16384                16384                files
Max open files            16384                16384                files

[root@icinga2 ~]# icinga2 console --connect 'https://root:icinga@localhost:5665/' --eval 'RLimitFiles'
16384.0

@dnsmichi dnsmichi added bug Something isn't working and removed needs feedback We'll only proceed once we hear from you again labels Apr 13, 2018
@pogii123

It seems that none of the changes made in /etc/sysconfig/icinga2 take effect, even if I put random characters in there. Nothing works.

[root@icinga2 ]# grep -i user /etc/sysconfig/icinga2
ICINGA2_USER=ici12345nga
[root@icinga2 ]# systemctl restart icinga2
[root@icinga2 ]# icinga2 variable get RunAsUser
icinga
[root@icinga2 ]# ps -ef | grep icinga2
icinga   13628     1  0 17:47 ?        00:00:00 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e $ICINGA2_LOG_DIR/error.log
icinga   13633     1  0 17:47 ?        00:00:00 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e $ICINGA2_LOG_DIR/error.log

@Crunsher
Contributor

We narrowed down the bug to the file not being read at all, i.e. the path in some instances is not set.

@dnsmichi dnsmichi self-assigned this Apr 19, 2018
@dnsmichi dnsmichi added this to the 2.8.3 milestone Apr 19, 2018
@dnsmichi dnsmichi changed the title Process limits for icinga2 processes on RHEL Sysconfig limits and settings are not respected Apr 19, 2018
@dnsmichi
Contributor

The path is compiled into the binary, and under specific circumstances it is an empty string. We've observed this with builds inside Docker and other variants. The patch requires a new tagged release including proper tests.

dnsmichi pushed a commit to Icinga/deb-icinga2 that referenced this issue Apr 19, 2018
dnsmichi pushed a commit to Icinga/rpm-icinga2 that referenced this issue Apr 19, 2018
@dnsmichi
Contributor

The referenced package fixes are not part of this ticket; they are only related by topic and also target 2.8.3.

@dnsmichi
Contributor

dnsmichi commented Apr 19, 2018

@Crunsher I've cherry-picked 11853cb36339920729bcaa5fcd461b5a288ba4cb for better logging into the coming PR for this issue. feature/rlimit-errno can be deleted.

@dnsmichi
Contributor

mbmif /usr/local/icinga2 (master *) # icinga2 daemon
[2018-04-19 10:07:12 +0200] warning/icinga-app: Sysconfig file '/usr/local/icinga2/etc/sysconfig/icinga2' cannot be read. Using default values.
[2018-04-19 10:07:12 +0200] information/cli: Icinga application loader (version: v2.8.2-637-g081988a0d; debug)
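
The fallback described by the warning above (unreadable sysconfig file, default values used) can be sketched as follows. This is an illustration under stated assumptions, not the code from the fix: `read_rlimit_files` is a hypothetical name, and the parsing assumes plain KEY=VALUE lines as in the sysconfig file shown earlier in this thread.

```shell
#!/bin/sh
# Print the ICINGA2_RLIMIT_FILES value from a sysconfig-style file.
# Falls back to the 16 * 1024 default, with a warning on stderr, when the
# path is empty or the file cannot be read.
read_rlimit_files() {
    sysconfig_file=$1
    default_limit=$((16 * 1024))
    if [ -z "$sysconfig_file" ] || [ ! -r "$sysconfig_file" ]; then
        echo "warning: sysconfig file '$sysconfig_file' cannot be read, using default values" >&2
        echo "$default_limit"
        return 0
    fi
    # Take the last assignment if the key appears more than once.
    value=$(sed -n 's/^ICINGA2_RLIMIT_FILES=//p' "$sysconfig_file" | tail -n 1)
    echo "${value:-$default_limit}"
}
```

An empty path takes the same branch as a missing file, so the empty-string case reported in this issue surfaces as a visible warning instead of being silently ignored.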

dnsmichi pushed a commit that referenced this issue Apr 19, 2018
It may happen that the variable is not properly initialized
and we'll have an empty string. Observed on macOS and inside
Docker.

refs #6215
dnsmichi pushed a commit that referenced this issue Apr 19, 2018
It may happen that the variable is not properly initialized
and we'll have an empty string. Observed on macOS and inside
Docker.

refs #6215

refs #6241
@dnsmichi
Contributor

[root@icinga2-elastic ~]# vim /etc/sysconfig/icinga2
[root@icinga2-elastic ~]# systemctl restart icinga2
[root@icinga2-elastic ~]# for p in $(pidof icinga2); do cat /proc/$p/limits | grep "Max open"; done
Max open files            50000                50000                files
Max open files            50000                50000                files
[root@icinga2-elastic ~]# icinga2 console --connect 'https://root:icinga@localhost:5665/' --eval 'RLimitFiles'
50000.0

@sebastic
Contributor

sebastic commented May 15, 2018

/etc/sysconfig is not applicable for the Debian family, which uses /etc/default for init script variables. The warnings cause users to file bugs like Debian Bug #898703.

Ideally the sysconfig directory is not checked for the Debian family, or /etc/default is checked instead.
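
The second suggestion could look roughly like the sketch below. It is purely illustrative and makes no claim about how Icinga 2 resolves the file: `find_icinga2_env_file` is a hypothetical helper, and the optional root prefix is an assumption added so the lookup can be tried against a scratch directory.

```shell
#!/bin/sh
# Print the first readable environment file for icinga2, checking the
# Debian-style location before the RHEL/SUSE-style one.
# $1: optional root prefix (assumption for testing; defaults to empty)
find_icinga2_env_file() {
    root=${1:-}
    for candidate in "$root/etc/default/icinga2" "$root/etc/sysconfig/icinga2"; do
        if [ -r "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    # Neither location exists: the caller decides whether to warn.
    return 1
}
```

Returning a non-zero status instead of printing a warning leaves it to the caller to decide whether a missing file is worth reporting on a given platform.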

@dnsmichi
Contributor

We're dealing with this in #6255, scheduled for calendar week 21.
