It seems that every time I'm on holiday something goes wrong with the performance graphs; see the attached screenshot. This is very annoying, as it affects data such as averages. Please advise me how to:
1) find out why our graphs sometimes stop working
2) best monitor this, so that we can act ASAP if the issue reappears
The issue was solved both times by rebooting the Nagios server.
cat /proc/`cat /usr/local/nagios/var/nagios.lock`/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             86983                86983                processes
Max open files            4096                 4096                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       86983                86983                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Create some localhost services to monitor the important bits. I created these feature requests, which I think would be useful in an XI system; the requests explain how to set up the services.
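As a rough illustration of such a localhost service, here is a minimal sketch of a check that warns when perfdata files pile up in the spool directory, which would suggest npcd has stopped processing them. The function name and the thresholds are hypothetical and not taken from the feature requests; the spool path in the usage note is the one that appears in the perfdata.log output in this thread. The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/bash
# Hypothetical plugin sketch: alert when perfdata files pile up in the spool.
# Usage: check_perfdata_spool <spool_dir> <warn_count> <crit_count>
check_perfdata_spool() {
    local dir="$1" warn="$2" crit="$3" count
    if [ ! -d "$dir" ]; then
        echo "UNKNOWN - $dir is not a directory"
        return 3
    fi
    # Count regular files sitting directly in the spool directory.
    count=$(find "$dir" -maxdepth 1 -type f | wc -l)
    count=$((count))  # normalise any padding added by wc
    if [ "$count" -ge "$crit" ]; then
        echo "CRITICAL - $count perfdata files waiting | files=$count;$warn;$crit"
        return 2
    elif [ "$count" -ge "$warn" ]; then
        echo "WARNING - $count perfdata files waiting | files=$count;$warn;$crit"
        return 1
    fi
    echo "OK - $count perfdata files waiting | files=$count;$warn;$crit"
    return 0
}
```

To run it as a plugin, the script could end with `check_perfdata_spool "$@"; exit $?` and be called with something like `/var/nagiosramdisk/spool/perfdata 50 200`; those two thresholds are placeholders and would need tuning to the normal backlog on your system.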
I suspect your npcd load limit is being reached and npcd is stopping. If it is, it should have been logged in /usr/local/nagios/var/npcd.log or /usr/local/nagios/var/perfdata.log
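To see at a glance on which days the problem occurred, the log can be summarised per day. This is only a sketch: it assumes the log lines look like the perfdata.log output posted in this thread (date in the first field, one "terminated on signal ALRM" line per timed-out run), and the function name is made up for illustration.

```shell
#!/bin/bash
# Count process_perfdata.pl timeouts per day in a PNP log file.
# Usage: count_perfdata_timeouts [logfile]
count_perfdata_timeouts() {
    local log="${1:-/usr/local/nagios/var/perfdata.log}"
    # Each timed-out run logs exactly one "terminated on signal ALRM" line,
    # so counting those lines counts timeouts. Field 1 is the date.
    awk '/terminated on signal ALRM/ { n[$1]++ }
         END { for (d in n) print d, n[d] }' "$log" | sort
}
```

Calling `count_perfdata_timeouts` without arguments reads the default log path mentioned above and prints one line per day in the form `date count`.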
tail -250 /usr/local/nagios/var/perfdata.log
2015-07-27 15:00:10 [8786] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:00:10 [8786] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:00:10 [8786] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:00:10 [8786] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438001987.perfdata.service-PID-8786 deleted
2015-07-27 15:00:10 [8786] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:00:35 [9595] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:00:35 [9595] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:00:35 [9595] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:00:35 [9595] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002003.perfdata.service-PID-9595 deleted
2015-07-27 15:00:35 [9595] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:00:35 [9598] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:00:35 [9598] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:00:35 [9598] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:00:35 [9598] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002017.perfdata.service-PID-9598 deleted
2015-07-27 15:00:35 [9598] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:01:00 [10200] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:01:00 [10200] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:01:00 [10200] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:01:00 [10200] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002033.perfdata.service-PID-10200 deleted
2015-07-27 15:01:00 [10200] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:01:25 [10656] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:01:25 [10656] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:01:25 [10656] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:01:25 [10656] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002047.perfdata.service-PID-10656 deleted
2015-07-27 15:01:25 [10656] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:01:25 [10658] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:01:25 [10658] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:01:25 [10658] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:01:25 [10658] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002063.perfdata.service-PID-10658 deleted
2015-07-27 15:01:25 [10658] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:02:57 [13130] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:02:57 [13130] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:02:57 [13130] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:02:57 [13130] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002153.perfdata.service-PID-13130 deleted
2015-07-27 15:02:57 [13130] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:03:23 [14102] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:03:23 [14102] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:03:23 [14102] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:03:23 [14102] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002183.perfdata.service-PID-14102 deleted
2015-07-27 15:03:23 [14102] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:03:23 [14099] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:03:23 [14099] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:03:23 [14099] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:03:23 [14099] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002167.perfdata.service-PID-14099 deleted
2015-07-27 15:03:23 [14099] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:03:48 [14926] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:03:48 [14926] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:03:48 [14926] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:03:48 [14926] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002197.perfdata.service-PID-14926 deleted
2015-07-27 15:03:48 [14926] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:04:13 [15704] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:04:13 [15704] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:04:13 [15704] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:04:13 [15704] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002227.perfdata.service-PID-15704 deleted
2015-07-27 15:04:13 [15704] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:04:13 [15701] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:04:13 [15701] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:04:13 [15701] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:04:13 [15701] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002213.perfdata.service-PID-15701 deleted
2015-07-27 15:04:13 [15701] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:04:38 [16690] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:04:38 [16690] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:04:38 [16690] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:04:38 [16690] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002243.perfdata.service-PID-16690 deleted
2015-07-27 15:04:38 [16690] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:04:38 [16694] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:04:38 [16694] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:04:38 [16694] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:04:38 [16694] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002257.perfdata.service-PID-16694 deleted
2015-07-27 15:04:38 [16694] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:05:03 [17519] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:05:03 [17519] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:05:03 [17519] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:05:03 [17519] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002273.perfdata.service-PID-17519 deleted
2015-07-27 15:05:03 [17519] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:05:03 [17521] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:05:03 [17521] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:05:03 [17521] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:05:03 [17521] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002287.perfdata.service-PID-17521 deleted
2015-07-27 15:05:03 [17521] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:05:28 [18332] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:05:28 [18332] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:05:28 [18332] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:05:28 [18332] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002303.perfdata.service-PID-18332 deleted
2015-07-27 15:05:28 [18332] [0] *** process_perfdata.pl terminated on signal ALRM
2015-07-27 15:05:53 [18937] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-07-27 15:05:53 [18937] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-07-27 15:05:53 [18937] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-07-27 15:05:53 [18937] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438002317.perfdata.service-PID-18937 deleted
2015-07-27 15:05:53 [18937] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-01 00:16:25 [22064] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-01 00:16:25 [22064] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-01 00:16:25 [22064] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-01 00:16:25 [22064] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1438380952.perfdata.service-PID-22064 deleted
2015-08-01 00:16:25 [22064] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 08:51:18 [59148] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 08:51:18 [59147] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 08:51:18 [59148] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 08:51:18 [59147] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 08:51:18 [59147] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 08:51:18 [59148] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 08:51:18 [59147] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441003853.perfdata.service-PID-59147 deleted
2015-08-31 08:51:18 [59148] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441003853.perfdata.host-PID-59148 deleted
2015-08-31 08:51:18 [59147] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 08:51:18 [59148] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:35:05 [10482] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:35:05 [10482] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:35:05 [10482] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:35:05 [10482] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006480.perfdata.service-PID-10482 deleted
2015-08-31 09:35:05 [10482] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:35:30 [11422] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:35:30 [11422] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:35:30 [11422] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:35:30 [11422] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006510.perfdata.service-PID-11422 deleted
2015-08-31 09:35:30 [11422] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:35:30 [11420] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:35:30 [11420] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:35:30 [11420] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:35:30 [11420] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006496.perfdata.service-PID-11420 deleted
2015-08-31 09:35:30 [11420] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:35:55 [12211] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:35:55 [12211] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:35:55 [12211] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:35:55 [12211] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006540.perfdata.service-PID-12211 deleted
2015-08-31 09:35:55 [12211] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:35:55 [12208] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:35:55 [12208] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:35:55 [12208] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:35:55 [12208] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006526.perfdata.service-PID-12208 deleted
2015-08-31 09:35:55 [12208] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:36:20 [13278] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:36:20 [13278] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:36:20 [13278] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:36:20 [13278] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006556.perfdata.service-PID-13278 deleted
2015-08-31 09:36:20 [13278] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:36:45 [14225] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:36:45 [14225] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:36:45 [14225] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:36:45 [14225] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006586.perfdata.service-PID-14225 deleted
2015-08-31 09:36:45 [14225] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:36:45 [14222] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:36:45 [14222] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:36:45 [14222] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:36:45 [14222] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006570.perfdata.service-PID-14222 deleted
2015-08-31 09:36:45 [14222] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:38:26 [3532] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:38:26 [3532] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:38:26 [3532] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:38:26 [3532] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006684.perfdata.service-PID-3532 deleted
2015-08-31 09:38:26 [3532] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:38:51 [4727] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:38:51 [4727] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:38:51 [4727] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:38:51 [4727] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006699.perfdata.service-PID-4727 deleted
2015-08-31 09:38:51 [4727] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:39:16 [5570] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:39:16 [5570] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:39:16 [5570] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:39:16 [5570] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006729.perfdata.service-PID-5570 deleted
2015-08-31 09:39:16 [5570] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:39:16 [5566] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:39:16 [5566] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:39:16 [5566] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:39:16 [5566] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006714.perfdata.service-PID-5566 deleted
2015-08-31 09:39:16 [5566] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:39:42 [6551] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:39:42 [6551] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:39:42 [6551] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:39:42 [6551] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006744.perfdata.service-PID-6551 deleted
2015-08-31 09:39:42 [6551] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:39:42 [6554] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:39:42 [6554] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:39:42 [6554] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:39:42 [6554] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006759.perfdata.service-PID-6554 deleted
2015-08-31 09:39:42 [6554] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:40:07 [7446] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:40:07 [7446] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:40:07 [7446] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:40:07 [7446] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006774.perfdata.service-PID-7446 deleted
2015-08-31 09:40:07 [7446] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:40:07 [7449] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:40:07 [7449] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:40:07 [7449] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:40:07 [7449] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006790.perfdata.service-PID-7449 deleted
2015-08-31 09:40:07 [7449] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:40:32 [8601] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:40:32 [8601] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:40:32 [8601] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:40:32 [8601] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006804.perfdata.service-PID-8601 deleted
2015-08-31 09:40:32 [8601] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:40:32 [8604] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:40:32 [8604] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:40:32 [8604] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:40:32 [8604] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006820.perfdata.service-PID-8604 deleted
2015-08-31 09:40:32 [8604] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:40:57 [9781] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:40:57 [9781] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:40:57 [9781] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:40:57 [9781] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006834.perfdata.service-PID-9781 deleted
2015-08-31 09:40:57 [9781] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:41:22 [11037] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:41:22 [11037] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:41:22 [11037] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:41:22 [11037] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006850.perfdata.service-PID-11037 deleted
2015-08-31 09:41:22 [11037] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:41:22 [11040] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:41:22 [11040] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:41:22 [11040] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:41:22 [11040] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006864.perfdata.service-PID-11040 deleted
2015-08-31 09:41:22 [11040] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:41:47 [11728] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:41:47 [11728] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:41:47 [11728] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:41:47 [11728] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006880.perfdata.service-PID-11728 deleted
2015-08-31 09:41:47 [11728] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:42:12 [12342] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:42:12 [12342] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:42:12 [12342] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:42:12 [12342] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006899.perfdata.service-PID-12342 deleted
2015-08-31 09:42:12 [12342] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:42:59 [13445] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:42:59 [13445] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:42:59 [13445] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:42:59 [13445] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441006959.perfdata.service-PID-13445 deleted
2015-08-31 09:42:59 [13445] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:43:48 [15247] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:43:48 [15247] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:43:48 [15247] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:43:48 [15247] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441007005.perfdata.service-PID-15247 deleted
2015-08-31 09:43:48 [15247] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:44:13 [16112] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:44:13 [16112] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:44:13 [16112] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:44:13 [16112] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441007035.perfdata.service-PID-16112 deleted
2015-08-31 09:44:13 [16112] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:44:13 [16111] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:44:13 [16111] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:44:13 [16111] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:44:13 [16111] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441007019.perfdata.service-PID-16111 deleted
2015-08-31 09:44:13 [16111] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:44:38 [16946] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:44:38 [16946] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:44:38 [16946] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:44:38 [16946] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441007049.perfdata.service-PID-16946 deleted
2015-08-31 09:44:38 [16946] [0] *** process_perfdata.pl terminated on signal ALRM
2015-08-31 09:45:24 [18622] [0] *** TIMEOUT: Timeout after 10 Sec. ****
2015-08-31 09:45:24 [18622] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-08-31 09:45:24 [18622] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-08-31 09:45:24 [18622] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1441007109.perfdata.service-PID-18622 deleted
2015-08-31 09:45:24 [18622] [0] *** process_perfdata.pl terminated on signal ALRM
Do these logs confirm your suspicion that my npcd load limit is being reached? I also found that RAM / swap usage seems to rise suddenly during the periods when I'm experiencing issues, and keeps rising until the server is rebooted. The question is why this only happens during my holidays. I'll monitor this more closely in the coming weeks. I'm not seeing high CPU spikes during the issue.
Is the NPCD load limit memory or cpu related? Where can I change this?
Grtz
Willem
I changed the load threshold and the timeout. Please leave this thread open for a while. Let's hope this issue is solved now. I also implemented Troy's tips, so I should be able to act more quickly if the issue reappears.
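For anyone wanting to check the same two settings, here is a small sketch that prints them. In PNP4Nagios the load threshold is the `load_threshold` option in npcd.cfg and the timeout is the `TIMEOUT` option in process_perfdata.cfg. The default paths below are an assumption about where a Nagios XI install keeps these files, so pass your own paths if they differ; the function name is hypothetical.

```shell
#!/bin/bash
# Print the NPCD load threshold and the process_perfdata.pl timeout.
# Usage: show_pnp_limits [npcd.cfg] [process_perfdata.cfg]
show_pnp_limits() {
    # Default locations are assumed, not confirmed in this thread.
    local npcd_cfg="${1:-/usr/local/nagios/etc/pnp/npcd.cfg}"
    local pp_cfg="${2:-/usr/local/nagios/etc/pnp/process_perfdata.cfg}"
    # Match only active (uncommented) option lines.
    grep -E '^[[:space:]]*load_threshold[[:space:]]*=' "$npcd_cfg"
    grep -E '^[[:space:]]*TIMEOUT[[:space:]]*=' "$pp_cfg"
}
```

After editing either file, npcd normally needs a restart before a new `load_threshold` takes effect.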
First of all best wishes to all of the Nagios support team for 2016!
Again during my holiday our graphs stopped working. Apparently the things I changed to prevent this didn't solve the problem.
I have been troubleshooting this a bit, and it seems the Nagios process has some sort of memory leak: it keeps using a larger and larger share of the total available memory until it starts causing issues. This is a big problem imho, as it causes gaps in our graphs. Please check the attached screenshot and pay close attention to the memory usage of the Nagios process. The server was rebooted on 28/10 and started having issues with graphs on 03/01 at 10:00, which lasted until Monday morning, when I rebooted the server.
After a reboot the Nagios process memory usage is stable for some time and then starts climbing, as is clearly visible in the 7-day graph. The last time the process started leaking memory was after work hours, so no one was doing anything in the GUI. Every 1:45 the nagios processes consume 1% more memory.
Please let me know how I can further troubleshoot and solve this memory leak and graphing problem.
Willem
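One way to pin down when the growth starts is to record the memory of the nagios processes at a fixed interval and compare the timestamps with the Nagios logs. The following is a minimal sketch, not a Nagios-provided tool: the function names and the output file are hypothetical, and it assumes a Linux /proc filesystem and that the daemon's process name is exactly `nagios`.

```shell
#!/bin/bash
# Sum the resident memory (VmRSS, in kB) of the given PIDs, read from /proc.
rss_kb_total() {
    local total=0 pid kb
    for pid in "$@"; do
        # Processes that have exited (or have no VmRSS line) count as 0.
        kb=$(awk '/^VmRSS:/ { print $2 }' "/proc/$pid/status" 2>/dev/null)
        total=$((total + ${kb:-0}))
    done
    echo "$total"
}

# Append one timestamped sample covering all processes named "nagios".
# Usage: log_nagios_rss [output_file]
log_nagios_rss() {
    local out="${1:-/tmp/nagios_rss.log}"
    echo "$(date '+%Y-%m-%d %H:%M:%S') $(rss_kb_total $(pgrep -x nagios))" >> "$out"
}
```

Calling `log_nagios_rss` from cron every few minutes gives a plain-text series that shows when the growth begins and how fast it is.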