Perfdata not working anymore

Post by **WillemDH** » Fri Nov 15, 2013 3:31 am

Hello,

Following the issue I had after yesterday's reboot, it seems our graphs are no longer getting any data!

tail -25 /usr/local/nagios/var/perfdata.log

2013-11-14 07:00:32 [23456] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-11-14 07:00:32 [23456] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-11-14 07:00:32 [23456] [0] *** TIMEOUT: Please check your npcd.cfg
2013-11-14 07:00:32 [23455] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-11-14 07:00:32 [23459] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-11-14 07:00:32 [23455] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-11-14 07:00:32 [23455] [0] *** TIMEOUT: Please check your npcd.cfg
2013-11-14 07:00:32 [23459] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-11-14 07:00:32 [23459] [0] *** TIMEOUT: Please check your npcd.cfg
2013-11-14 07:00:32 [23456] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1384408803.perfdata.service-PID-23456 deleted
2013-11-14 07:00:32 [23456] [0] *** Timeout while processing Host: "hostname" Service: "SRV_Ping"
2013-11-14 07:00:32 [23456] [0] *** process_perfdata.pl terminated on signal ALRM
2013-11-14 07:00:32 [23455] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1384408803.perfdata.host-PID-23455 deleted
2013-11-14 07:00:32 [23455] [0] *** Timeout while processing Host: "hostname" Service: "_HOST_"
2013-11-14 07:00:32 [23459] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1384408818.perfdata.service-PID-23459 deleted
2013-11-14 07:00:32 [23455] [0] *** process_perfdata.pl terminated on signal ALRM
2013-11-14 07:00:32 [23459] [0] *** Timeout while processing Host: "hostname" Service: "SRV_CPU"
2013-11-14 07:00:32 [23459] [0] *** process_perfdata.pl terminated on signal ALRM

There seems to be no activity after 07:00...

ls /usr/local/nagios/var/spool/checkresults | wc -l
9

ls /usr/local/nagios/var/spool/perfdata | wc -l
5907

ls /usr/local/nagios/var/spool/xidpe | wc -l
0
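
For anyone running the same check: a minimal sketch that counts the spool backlog with find instead of ls, and also reports how many files are older than a cutoff. The function name spool_backlog and the 10-minute default are assumptions for illustration, not npcd settings, and -maxdepth / -mmin assume GNU or BSD find.

```shell
# Count regular files directly inside a spool directory, and how many of
# them were last modified more than a given number of minutes ago.
spool_backlog() {
    dir="$1"
    minutes="${2:-10}"   # assumed cutoff, chosen only for illustration
    total=$(find "$dir" -maxdepth 1 -type f | wc -l)
    stale=$(find "$dir" -maxdepth 1 -type f -mmin +"$minutes" | wc -l)
    # $((...)) strips any padding that wc adds on some systems
    echo "total=$((total)) stale=$((stale))"
}
```

Called as spool_backlog /usr/local/nagios/var/spool/perfdata 10, a large and growing "stale" count would suggest the files are not being picked up at all, rather than just arriving in a burst.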

Any help is welcome!

I already restarted npcd:

service npcd restart

[root@nagios perfdata]# df -i
Filesystem                                    Inodes     IUsed    IFree      IUse%  Mounted on
/dev/mapper/VolGroup00-LogVol00               1286144    111734   1174410    9%     /
tmpfs                                         490139     1        490138     1%     /dev/shm
/dev/sda1                                     25688      39       25649      1%     /boot
cifs:/root_vdm_2/cifs402/Backups/SRVNAGIOS01  257949694  3859915  254089779  2%     /var/Digipolis/Backup
[root@nagios perfdata]# df -h
Filesystem                                    Size  Used  Avail  Use%  Mounted on
/dev/mapper/VolGroup00-LogVol00               20G   7.4G  11G    41%   /
tmpfs                                         1.9G  0     1.9G   0%    /dev/shm
/dev/sda1                                     97M   28M   65M    31%   /boot
cifs:/root_vdm_2/cifs402/Backups/SRVNAGIOS01  4.9T  2.9T  2.0T   60%   /var/Digipolis/Backup

cat /usr/local/nagios/etc/nagios.cfg | grep check_result
check_result_path=/usr/local/nagios/var/spool/checkresults
check_result_reaper_frequency=10
max_check_result_file_age=3600
max_check_result_reaper_time=30

nano /usr/local/nagios/etc/pnp/process_perfdata.cfg
I changed TIMEOUT to 10 and restarted the npcd service.
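
The same TIMEOUT change can be made without an interactive editor. This is only a sketch: it assumes process_perfdata.cfg holds the setting as a single "TIMEOUT = <seconds>" line, and set_pnp_timeout is a hypothetical helper name, not something shipped with PNP.

```shell
# Rewrite the TIMEOUT line of a process_perfdata.cfg-style file.
# Assumes the file contains a line of the form "TIMEOUT = <seconds>".
set_pnp_timeout() {
    cfg="$1"
    secs="$2"
    # Keep a backup copy, then rewrite the original in place so its
    # ownership and permissions are left untouched.
    cp -p "$cfg" "$cfg.bak" &&
    sed "s/^[[:space:]]*TIMEOUT[[:space:]]*=.*/TIMEOUT = $secs/" "$cfg.bak" > "$cfg"
}
```

For example: set_pnp_timeout /usr/local/nagios/etc/pnp/process_perfdata.cfg 10 && service npcd restart. The original file stays available as process_perfdata.cfg.bak if the change needs to be rolled back.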

Tried to gather as much info as I could find.

Post by **WillemDH** » Fri Nov 15, 2013 4:06 am

Ok, I think the problem is (at least partially) solved. After reading http://support.nagios.com/forum/viewtop ... g&start=10 , I did a shift-refresh and suddenly I saw data again. It could of course also be the service npcd restart I did a few minutes ago.
I'm not quite sure, though, whether everything is working normally again.

When I do tail -25 /usr/local/nagios/var/perfdata.log

There seem to be no new log entries since 07:00 this morning... Or is this considered a good thing?

Also tail -50 /usr/local/nagios/var/npcd.log doesn't seem to get any new logs after I restarted the service.
These are the last logs:
[11-14-2013 10:56:58] NPCD: Caught Termination Signal - Hasta la vista... baby
[11-15-2013 09:25:55] NPCD: npcd Daemon (0.4.14) started with PID=6576
[11-15-2013 09:25:55] NPCD: Please have a look at 'npcd -V' to get license information
[11-15-2013 09:25:55] NPCD: HINT: load_threshold is enabled - ('10.000000')

Omg, what a reboot of a server can cause.... Next time I'll certainly try

Code: Select all

service ndo2db stop
service mysqld stop
service postgresql stop
service nagios stop
shutdown -h now

instead of

Code: Select all

shutdown -h now

Post by **lmiltchev** » Fri Nov 15, 2013 10:24 am

Having too many files in "/usr/local/nagios/var/spool/perfdata/" directory may indicate that npcd is not running and not processing these files. Make sure the load on the server is not too high (exceeding the load_threshold value, set in the "/usr/local/nagios/etc/pnp/npcd.cfg" file). Also, make sure npcd is set to autostart (after reboot):

Code: Select all

chkconfig --list | grep npcd
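
The load check mentioned above can be sketched roughly as follows. It assumes npcd.cfg stores the setting as a "load_threshold = <value>" line and treats a value of 0 as "check disabled"; both are assumptions about the config format, so verify them against your own file. The function names are hypothetical helpers, and /proc/loadavg is Linux only.

```shell
# Print the load_threshold value from an npcd.cfg-style file (key = value).
get_load_threshold() {
    awk -F= '/^[[:space:]]*load_threshold[[:space:]]*=/ {
        gsub(/[[:space:]]/, "", $2); print $2; exit
    }' "$1"
}

# Succeed (exit 0) when number $1 is greater than number $2.
load_exceeds() {
    awk -v l="$1" -v t="$2" 'BEGIN { exit !(l + 0 > t + 0) }'
}

# Compare the current 1-minute load average with the configured threshold.
check_npcd_load() {
    threshold=$(get_load_threshold "${1:-/usr/local/nagios/etc/pnp/npcd.cfg}")
    load=$(cut -d' ' -f1 /proc/loadavg)
    if [ -z "$threshold" ]; then
        echo "load_threshold not found"
    elif ! load_exceeds "$threshold" 0; then
        echo "load_threshold is 0 (treated here as disabled)"
    elif load_exceeds "$load" "$threshold"; then
        echo "load $load exceeds load_threshold $threshold"
    else
        echo "load $load is within load_threshold $threshold"
    fi
}
```

Running check_npcd_load with no argument reads the npcd.cfg path quoted above; pass another path as the first argument if your install differs.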

Post by **WillemDH** » Sat May 30, 2015 12:49 pm

I have not had any issues with npcd since this one, so this thread can be closed. Grtz and tx.
