files hanging around in checkresults directory
Posted: Mon Aug 05, 2013 8:32 am
I am getting hundreds of check result files left behind in /usr/local/nagios/var/spool/checkresults. I changed max_check_result_file_age to 600 hoping that would help, but it hasn't. Right now I have over 800 files there, the oldest of which is more than 48 hours old. I end up having to stop nagios, delete all the files, and then restart nagios every couple of days. Interestingly, my max open files is set to almost 800K (see the fs.file-nr output below), yet nagios still logs "Too many open files".
max_check_result_file_age=600
max_check_result_reaper_time=30
[root@lpnagv03 log]# ps -ef |grep 24916
nagios 24916 1 1 Aug02 ? 00:55:06 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
Aug 5 08:56:07 lpnagv03 rsyslogd-2177: imuxsock lost 106 messages from pid 24916 due to rate-limiting
Aug 5 08:56:59 lpnagv03 rsyslogd-2177: imuxsock begins to drop messages from pid 24916 due to rate-limiting
Aug 5 08:57:07 lpnagv03 rsyslogd-2177: imuxsock lost 62 messages from pid 24916 due to rate-limiting
Aug 5 08:57:59 lpnagv03 rsyslogd-2177: imuxsock begins to drop messages from pid 24916 due to rate-limiting
Aug 5 08:58:07 lpnagv03 rsyslogd-2177: imuxsock lost 457 messages from pid 24916 due to rate-limiting
Aug 5 08:59:12 lpnagv03 nagios: Error: Could not open check result queue directory '/usr/local/nagios/var/spool/checkresults' for reading.
Aug 5 08:59:17 lpnagv03 nagios: Error: Unable to create temp file for writing status data: Too many open files
Aug 5 08:59:17 lpnagv03 nagios: Error: Could not open check result queue directory '/usr/local/nagios/var/spool/checkresults' for reading.
Aug 5 08:59:22 lpnagv03 nagios: Error: Could not open check result queue directory '/usr/local/nagios/var/spool/checkresults' for reading.
Aug 5 08:59:27 lpnagv03 nagios: Error: Unable to create temp file for writing status data: Too many open files
Aug 5 08:59:27 lpnagv03 nagios: Error: Could not open check result queue directory '/usr/local/nagios/var/spool/checkresults' for reading.
Aug 5 08:59:32 lpnagv03 nagios: Error: Could not open check result queue directory '/usr/local/nagios/var/spool/checkresults' for reading.
Aug 5 08:59:37 lpnagv03 nagios: Error: Unable to create temp file for writing status data: Too many open files
Aug 5 08:59:37 lpnagv03 nagios: Error: Could not open check result queue directory '/usr/local/nagios/var/spool/checkresults' for reading.
Aug 5 08:59:42 lpnagv03 nagios: Error: Could not open check result queue directory '/usr/local/nagios/var/spool/checkresults' for reading.
-rw------- 1 nagios users 401 Aug 3 06:18 cqGN9TO
-rw------- 1 nagios users 0 Aug 3 06:18 cqGN9TO.ok
[root@lpnagv03 log]# ls -lt /usr/local/nagios/var/spool/checkresults |wc -l
811
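To see how many of those files are actually past the configured age, a small script can list them oldest first. This is only a sketch: the function name `stale_check_results` is made up for illustration, and it assumes max_check_result_file_age is measured in seconds and compared against each file's modification time.

```python
import os
import time


def stale_check_results(spool_dir, max_age_seconds, now=None):
    """Return (name, age_in_seconds) for regular files older than the limit, oldest first."""
    now = time.time() if now is None else now
    stale = []
    with os.scandir(spool_dir) as entries:
        for entry in entries:
            # Skip subdirectories and symlinks; only plain files are counted.
            if not entry.is_file(follow_symlinks=False):
                continue
            age = now - entry.stat(follow_symlinks=False).st_mtime
            if age > max_age_seconds:
                stale.append((entry.name, int(age)))
    stale.sort(key=lambda item: item[1], reverse=True)
    return stale


if __name__ == "__main__":
    # Path and age taken from the post; adjust for other installs.
    spool = "/usr/local/nagios/var/spool/checkresults"
    if os.path.isdir(spool):
        old = stale_check_results(spool, 600)
        print(f"{len(old)} files older than 600 seconds")
        for name, age in old[:10]:
            print(f"{age:>8}s  {name}")
```

If nearly everything in the directory shows up as stale, that would suggest the files are not being reaped at all, rather than the age setting being too high.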
[root@lpnagv03 log]# lsof |wc -l
7568
[root@lpnagv03 log]# sysctl fs.file-nr
fs.file-nr = 3200 0 793765
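One thing that may be worth checking: fs.file-nr reports system-wide file handles, while "Too many open files" is the message for hitting a per-process descriptor limit. The sketch below compares the descriptors a process currently holds with its own "Max open files" limit. It assumes Linux with /proc mounted and enough privilege to read the target process's entries; the helper names are hypothetical.

```python
import os
import sys


def _to_limit(value):
    """Convert a limits field to int, or None when it reads 'unlimited'."""
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Row layout: Max open files <soft> <hard> files
            return _to_limit(fields[3]), _to_limit(fields[4])
    return None


def open_fd_report(pid):
    """Return (open_fd_count, soft_limit, hard_limit) for a running process."""
    with open(f"/proc/{pid}/limits") as handle:
        limits = parse_max_open_files(handle.read())
    soft, hard = limits if limits else (None, None)
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, soft, hard


if __name__ == "__main__":
    # Pass the nagios PID (24916 in the post); defaults to this script's own PID.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    if os.path.exists(f"/proc/{pid}/limits"):
        count, soft, hard = open_fd_report(pid)
        print(f"pid {pid}: {count} open fds, soft limit {soft}, hard limit {hard}")
```

If the open count sits at or near the soft limit, the system-wide value of almost 800K would not be the constraint that matters for this process.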