Not receiving performance graphs
Posted: Thu Apr 20, 2017 10:20 am
Good afternoon!
After a reboot of the Nagios XI server we are no longer able to see any data on our performance graphs. It says "No data to display". I have tried several troubleshooting steps myself and found several issues:
- The system time was set to 11 May 2017. I used "ntpdate <ip address>" to sync the time with our domain controller (output below).
- The directory /var/lock/mrtg was missing. I've manually recreated it; the current permissions are shown in the "ll /var/lock/" listing below.
Some log file checks are included below (in npcd.log you can clearly see the time jump to 11-05-2017, logged there as 05-11-2017).
- Then I found that the npcd service cannot read the PID file (see the "systemctl status npcd" output below).
The PID file does exist; its permissions are shown in the final listing below.
I tried changing the owner and group of the PID file to nagios nagios, but after an npcd service restart it is automatically set back to root root, so I'm guessing that is as it should be. So the current problem seems to have something to do with the PID file and is most likely caused by the time sync issues. I've tried restarting our SQL, Nagios, NDO2DB, NPCD & HTTPD services.
I also just noticed that we cannot perform actions such as "apply configuration". We get an error message when using "force an immediate check" saying that the request was not processed in a timely manner. However, the check is still executed instantly.
Any suggestions?
Kind regards,
Dennis Lans
Output of the ntpdate time sync:
Code:
20 Apr 17:19:00 ntpdate[8244]: step time server x.x.x.x offset -1991168.028776 sec
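For scale, the offset in the ntpdate line above can be converted to days. This is only a rough sketch: the function name describe_offset is made up for illustration, and it assumes the usual reading that a negative offset means the local clock was ahead of the time server.

```shell
#!/bin/sh
# Hypothetical helper: turn an ntpdate offset (whole seconds, sign dropped)
# into whole days plus leftover whole hours.
describe_offset() {
    offset_sec=$1
    days=$(( offset_sec / 86400 ))
    hours=$(( (offset_sec % 86400) / 3600 ))
    echo "${days} days ${hours} hours"
}

# Offset from the ntpdate line above: -1991168.028776 sec, fraction dropped.
describe_offset 1991168
```

For the value above this works out to roughly 23 days, which fits a clock that had jumped from April into May.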
Current permissions of /var/lock/mrtg:
Code:
# ll /var/lock/
drwxr-xr-x 2 root root 40 Apr 20 13:50 mrtg
Log file checks:
Code:
# tail -75 /usr/local/nagios/var/npcd.log
[03-26-2017 14:20:19] NPCD: Caught Termination Signal - Hasta la vista... baby
[03-26-2017 16:37:42] NPCD: npcd Daemon (0.4.14) started with PID=1006
[03-26-2017 16:37:42] NPCD: Please have a look at 'npcd -V' to get license information
[03-26-2017 16:37:42] NPCD: HINT: load_threshold is enabled - ('10.000000')
[04-18-2017 17:51:14] NPCD: Caught Termination Signal - Hasta la vista... baby
[05-11-2017 18:58:01] NPCD: npcd Daemon (0.4.14) started with PID=1002
[05-11-2017 18:58:01] NPCD: Please have a look at 'npcd -V' to get license information
[05-11-2017 18:58:01] NPCD: HINT: load_threshold is enabled - ('10.000000')
[04-20-2017 16:09:52] NPCD: Caught Termination Signal - Hasta la vista... baby
[04-20-2017 16:09:52] NPCD: npcd Daemon (0.4.14) started with PID=21932
[04-20-2017 16:09:52] NPCD: Please have a look at 'npcd -V' to get license information
[04-20-2017 16:09:52] NPCD: HINT: load_threshold is enabled - ('10.000000')
[04-20-2017 16:42:01] NPCD: Caught Termination Signal - Hasta la vista... baby
[04-20-2017 16:42:01] NPCD: npcd Daemon (0.4.14) started with PID=61062
[04-20-2017 16:42:01] NPCD: Please have a look at 'npcd -V' to get license information
[04-20-2017 16:42:01] NPCD: HINT: load_threshold is enabled - ('10.000000')
[04-20-2017 16:42:13] NPCD: Caught Termination Signal - Hasta la vista... baby
[04-20-2017 16:42:13] NPCD: npcd Daemon (0.4.14) started with PID=61244
[04-20-2017 16:42:13] NPCD: Please have a look at 'npcd -V' to get license information
[04-20-2017 16:42:13] NPCD: HINT: load_threshold is enabled - ('10.000000')
[04-20-2017 16:46:26] NPCD: Caught Termination Signal - Hasta la vista... baby
[04-20-2017 16:46:26] NPCD: npcd Daemon (0.4.14) started with PID=1411
[04-20-2017 16:46:26] NPCD: Please have a look at 'npcd -V' to get license information
[04-20-2017 16:46:26] NPCD: HINT: load_threshold is enabled - ('10.000000')
[04-20-2017 16:48:21] NPCD: npcd Daemon (0.4.14) started with PID=3348
[04-20-2017 16:48:21] NPCD: Please have a look at 'npcd -V' to get license information
[04-20-2017 16:48:21] NPCD: HINT: load_threshold is enabled - ('10.000000')
[04-20-2017 16:59:37] NPCD: Caught Termination Signal - Hasta la vista... baby
[04-20-2017 16:59:37] NPCD: npcd Daemon (0.4.14) started with PID=18070
[04-20-2017 16:59:37] NPCD: Please have a look at 'npcd -V' to get license information
[04-20-2017 16:59:37] NPCD: HINT: load_threshold is enabled - ('10.000000')
[04-20-2017 17:01:45] NPCD: Caught Termination Signal - Hasta la vista... baby
[04-20-2017 17:01:45] NPCD: npcd Daemon (0.4.14) started with PID=20506
[04-20-2017 17:01:45] NPCD: Please have a look at 'npcd -V' to get license information
[04-20-2017 17:01:45] NPCD: HINT: load_threshold is enabled - ('10.000000')
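The time jump in the log above can also be located mechanically. A minimal sketch, assuming every relevant line starts with a [MM-DD-YYYY HH:MM:SS] stamp as in the excerpt; the function name find_time_jumps is hypothetical. It flags the places where timestamps step backwards, which is where the clock was corrected after having run ahead.

```shell
#!/bin/sh
# Hypothetical helper: report log lines whose timestamp is earlier than the
# timestamp of the previous stamped line.
find_time_jumps() {
    awk '
    /^\[[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]\]/ {
        ts  = substr($0, 2, 19)                                   # "MM-DD-YYYY HH:MM:SS"
        ymd = substr(ts, 7, 4) substr(ts, 1, 2) substr(ts, 4, 2)  # YYYYMMDD
        hms = substr(ts, 12, 2) substr(ts, 15, 2) substr(ts, 18, 2)
        key = ymd hms                                             # sortable fixed-width key
        if (prev != "" && key < prev)
            print "line " NR ": clock went backwards to " ts
        prev = key
    }' "$1"
}

# Example call (path taken from the tail command above):
#   find_time_jumps /usr/local/nagios/var/npcd.log
```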
Code:
# tail -f /usr/local/nagios/var/perfdata.log
2016-06-20 17:25:43 [59268] [0] *** TIMEOUT: Please check your npcd.cfg
2016-06-20 17:25:43 [59268] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1466436330.perfdata.host-PID-59268 deleted
2016-06-20 17:25:43 [59268] [0] *** Timeout while processing Host: "nloospr1.dbgroup.local" Service: "_HOST_"
2016-06-20 17:25:43 [59268] [0] *** process_perfdata.pl terminated on signal ALRM
2016-06-21 19:12:01 [20078] [0] *** TIMEOUT: Timeout after 5 secs. ***
2016-06-21 19:12:01 [20078] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2016-06-21 19:12:01 [20078] [0] *** TIMEOUT: Please check your npcd.cfg
2016-06-21 19:12:01 [20078] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1466529105.perfdata.service-PID-20078 deleted
2016-06-21 19:12:01 [20078] [0] *** Timeout while processing Host: "nloosvmm.dbgroup.local" Service: "F_Schijf"
2016-06-21 19:12:01 [20078] [0] *** process_perfdata.pl terminated on signal ALRM
- Then I found that the npcd service cannot read the PID file:
Code:
# systemctl status npcd
● npcd.service - SYSV: Visit the Website at http://sourceforge.net/projects/pnp4nagios/
Loaded: loaded (/etc/rc.d/init.d/npcd; bad; vendor preset: disabled)
Active: active (running) since Thu 2017-04-20 16:48:21 CEST; 14s ago
Docs: man:systemd-sysv-generator(8)
Process: 1401 ExecStop=/etc/rc.d/init.d/npcd stop (code=exited, status=0/SUCCESS)
Process: 3345 ExecStart=/etc/rc.d/init.d/npcd start (code=exited, status=0/SUCCESS)
Main PID: 3348 (npcd)
CGroup: /system.slice/npcd.service
├─1411 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
└─3348 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
Apr 20 16:48:21 nlgrpngs.dbgroup.local systemd[1]: Starting SYSV: Visit the Website at http://sourceforge.net/projects/pnp4nagios/...
Apr 20 16:48:21 nlgrpngs.dbgroup.local npcd[3345]: NPCD started.
Apr 20 16:48:21 nlgrpngs.dbgroup.local systemd[1]: Failed to read PID from file /usr/local/nagiosxi/var/subsys/npcd.pid: Invalid argument
Apr 20 16:48:21 nlgrpngs.dbgroup.local systemd[1]: Started SYSV: Visit the Website at http://sourceforge.net/projects/pnp4nagios/.
Permissions of the PID file:
Code:
# ll /usr/local/nagiosxi/var/subsys/
total 8
-rw-r--r-- 1 root root 0 Apr 20 15:42 nagios
-rw-r--r-- 1 nagios nagios 0 Dec 11 15:14 ndo2db
-rw-r--r-- 1 root root 4 Apr 20 16:48 npcd.pid
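Since systemd complains about the PID file, one way to cross-check it is to compare its contents with a live process. This is a hedged sketch only: check_pidfile is a hypothetical helper, not part of Nagios XI or PNP4Nagios, and kill -0 only reports reliably when run as root or as the owner of the process.

```shell
#!/bin/sh
# Hypothetical helper: read a PID file, make sure it holds a plain number,
# and test whether a process with that PID currently exists.
check_pidfile() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        echo "unreadable"
        return 1
    fi
    # Strip whitespace and newlines so a missing or extra newline is harmless.
    pid=$(tr -d '[:space:]' < "$pidfile")
    case $pid in
        ''|*[!0-9]*)
            echo "invalid"
            return 1
            ;;
    esac
    # Signal 0 sends nothing; it only checks that the process can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running $pid"
    else
        echo "stale $pid"
        return 1
    fi
}

# Example call (path taken from the systemd message above):
#   check_pidfile /usr/local/nagiosxi/var/subsys/npcd.pid
```

The status output above lists two npcd processes in the service's cgroup (1411 and 3348), so it may be worth comparing the reported PID against both.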