Perfdata not working anymore

Posted: Fri Nov 15, 2013 3:31 am
by WillemDH
Hello,

Following the issue I had after yesterday's reboot, it seems our graphs no longer get any data!

tail -25 /usr/local/nagios/var/perfdata.log

2013-11-14 07:00:32 [23456] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-11-14 07:00:32 [23456] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-11-14 07:00:32 [23456] [0] *** TIMEOUT: Please check your npcd.cfg
2013-11-14 07:00:32 [23455] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-11-14 07:00:32 [23459] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-11-14 07:00:32 [23455] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-11-14 07:00:32 [23455] [0] *** TIMEOUT: Please check your npcd.cfg
2013-11-14 07:00:32 [23459] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-11-14 07:00:32 [23459] [0] *** TIMEOUT: Please check your npcd.cfg
2013-11-14 07:00:32 [23456] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1384408803.perfdata.service-PID-23456 deleted
2013-11-14 07:00:32 [23456] [0] *** Timeout while processing Host: "hostname" Service: "SRV_Ping"
2013-11-14 07:00:32 [23456] [0] *** process_perfdata.pl terminated on signal ALRM
2013-11-14 07:00:32 [23455] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1384408803.perfdata.host-PID-23455 deleted
2013-11-14 07:00:32 [23455] [0] *** Timeout while processing Host: "hostname" Service: "_HOST_"
2013-11-14 07:00:32 [23459] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1384408818.perfdata.service-PID-23459 deleted
2013-11-14 07:00:32 [23455] [0] *** process_perfdata.pl terminated on signal ALRM
2013-11-14 07:00:32 [23459] [0] *** Timeout while processing Host: "hostname" Service: "SRV_CPU"
2013-11-14 07:00:32 [23459] [0] *** process_perfdata.pl terminated on signal ALRM

There seems to be no activity after 07:00...
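
To see at a glance which checks are hitting the timeout, the log can be grouped by host and service. This is only a minimal sketch that assumes the "Timeout while processing Host: ... Service: ..." line format shown above; the function name timeout_summary is made up for illustration.

```shell
# Summarize which Host/Service pairs hit the process_perfdata.pl timeout.
# Assumes log lines of the form:
#   ... *** Timeout while processing Host: "h" Service: "s"
# timeout_summary is a hypothetical helper name, not part of PNP4Nagios.
timeout_summary() {
    grep 'Timeout while processing' "$1" \
        | sed 's/.*Host: "\([^"]*\)" Service: "\([^"]*\)".*/\1 \2/' \
        | sort | uniq -c | sort -rn
}
```

Called as timeout_summary /usr/local/nagios/var/perfdata.log, it prints one line per host/service pair with the number of timeouts in front, most frequent first.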

ls /usr/local/nagios/var/spool/checkresults | wc -l
9

ls /usr/local/nagios/var/spool/perfdata | wc -l
5907

ls /usr/local/nagios/var/spool/xidpe | wc -l
0
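
The raw counts above do not show whether the backlog is old or still growing. A rough sketch for counting only regular files, and those older than a given number of minutes, could look like this. The helper names spool_count and spool_stale are invented for this example, and it assumes a find that supports -maxdepth and -mmin (GNU and BSD find both do).

```shell
# Count regular files directly inside a spool directory (subdirectories ignored).
# spool_count and spool_stale are hypothetical helper names.
spool_count() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Count regular files in directory $1 last modified more than $2 minutes ago.
spool_stale() {
    find "$1" -maxdepth 1 -type f -mmin +"$2" | wc -l | tr -d ' '
}
```

For example, spool_stale /usr/local/nagios/var/spool/perfdata 60 would report how many perfdata files have been waiting for more than an hour. If that number keeps growing, npcd is probably not working through the spool.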

Any help is welcome!

I already restarted npcd:

service npcd restart

[root@nagios perfdata]# df -i
Filesystem                                    Inodes     IUsed    IFree      IUse%  Mounted on
/dev/mapper/VolGroup00-LogVol00               1286144    111734   1174410    9%     /
tmpfs                                         490139     1        490138     1%     /dev/shm
/dev/sda1                                     25688      39       25649      1%     /boot
cifs:/root_vdm_2/cifs402/Backups/SRVNAGIOS01  257949694  3859915  254089779  2%     /var/Digipolis/Backup
[root@nagios perfdata]# df -h
Filesystem                                    Size  Used  Avail  Use%  Mounted on
/dev/mapper/VolGroup00-LogVol00               20G   7.4G  11G    41%   /
tmpfs                                         1.9G  0     1.9G   0%    /dev/shm
/dev/sda1                                     97M   28M   65M    31%   /boot
cifs:/root_vdm_2/cifs402/Backups/SRVNAGIOS01  4.9T  2.9T  2.0T   60%   /var/Digipolis/Backup

cat /usr/local/nagios/etc/nagios.cfg | grep check_result
check_result_path=/usr/local/nagios/var/spool/checkresults
check_result_reaper_frequency=10
max_check_result_file_age=3600
max_check_result_reaper_time=30

nano /usr/local/nagios/etc/pnp/process_perfdata.cfg
Changed TIMEOUT to 10 and restarted NPCD service.


Tried to gather as much info as I could find.

Re: Perfdata not working anymore

Posted: Fri Nov 15, 2013 4:06 am
by WillemDH
Ok, I think the problem is (at least partially) solved. Following http://support.nagios.com/forum/viewtop ... g&start=10 , I did a Shift+refresh in the browser and suddenly I saw data again. It could of course also be the service npcd restart I did a few minutes ago.
I'm not quite sure, though, whether everything is working normally again.

When I do tail -25 /usr/local/nagios/var/perfdata.log

There seem to be no new log entries since 07:00 this morning... Or is this considered a good thing?

Also tail -50 /usr/local/nagios/var/npcd.log doesn't seem to get any new logs after I restarted the service.
These are the last logs:
[11-14-2013 10:56:58] NPCD: Caught Termination Signal - Hasta la vista... baby
[11-15-2013 09:25:55] NPCD: npcd Daemon (0.4.14) started with PID=6576
[11-15-2013 09:25:55] NPCD: Please have a look at 'npcd -V' to get license information
[11-15-2013 09:25:55] NPCD: HINT: load_threshold is enabled - ('10.000000')

Omg, what a reboot of a server can cause.... Next time I'll certainly try

service ndo2db stop
service mysqld stop
service postgresql stop
service nagios stop
shutdown -h now
instead of

shutdown -h now

Re: Perfdata not working anymore

Posted: Fri Nov 15, 2013 10:24 am
by lmiltchev
Having too many files in the "/usr/local/nagios/var/spool/perfdata/" directory may indicate that npcd is not running or is not processing these files. Make sure the load on the server is not too high (exceeding the load_threshold value set in the "/usr/local/nagios/etc/pnp/npcd.cfg" file). Also, make sure npcd is set to autostart (after reboot):

chkconfig --list | grep npcd
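
The load check mentioned above can also be scripted. The following is only a sketch: it assumes npcd.cfg contains a line of the form "load_threshold = <number>", that a value of 0 means the check is disabled, and that the current load is read from /proc/loadavg on Linux. The function names npcd_threshold and load_exceeds are hypothetical.

```shell
# Print the load_threshold value from an npcd.cfg-style file.
# Assumes a "load_threshold = <number>" line; commented-out lines are skipped.
npcd_threshold() {
    awk -F= '/^[ \t]*load_threshold[ \t]*=/ { gsub(/[ \t\r]/, "", $2); print $2; exit }' "$1"
}

# Return 0 (true) if load $1 is above threshold $2.
# A threshold of 0 is treated as "disabled" (an assumption) and never matches.
load_exceeds() {
    awk -v l="$1" -v t="$2" 'BEGIN { exit !(t > 0 && l > t) }'
}
```

On the Nagios server this could be used as load_exceeds "$(cut -d' ' -f1 /proc/loadavg)" "$(npcd_threshold /usr/local/nagios/etc/pnp/npcd.cfg)" && echo "load is above load_threshold". If that prints the message, the load may be what is keeping npcd from processing the spool.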

Re: Perfdata not working anymore

Posted: Sat May 30, 2015 12:49 pm
by WillemDH
I have not had any issues with npcd since this one, so this thread can be closed. Grtz and tx.