Re: rrd graphs showing zero data for given interval

Posted: Wed Mar 26, 2014 4:07 pm
by jericho_g
Here you go:

2014-03-26 11:09:50 [3607] [0] *** process_perfdata.pl terminated on signal ALRM
2014-03-26 13:34:56 [43308] [0] *** TIMEOUT: Timeout after 30 secs. ***
2014-03-26 13:34:56 [43308] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-03-26 13:34:56 [43308] [0] *** TIMEOUT: Please check your npcd.cfg
2014-03-26 13:34:56 [43308] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1395855251.perfdata.service-PID-43308 deleted
2014-03-26 13:34:56 [43308] [0] *** Timeout while processing Host: "cleswebwin01.advanstar.com" Service: "windows_Virtual_Memory_Prd"
2014-03-26 13:34:56 [43308] [0] *** process_perfdata.pl terminated on signal ALRM
2014-03-26 14:04:38 [46977] [0] *** TIMEOUT: Timeout after 30 secs. ***
2014-03-26 14:04:38 [46977] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-03-26 14:04:38 [46977] [0] *** TIMEOUT: Please check your npcd.cfg
2014-03-26 14:04:38 [46977] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1395857036.perfdata.service-PID-46977 deleted
2014-03-26 14:04:38 [46977] [0] *** Timeout while processing Host: "clerallasa01.advanstar.com" Service: "Port_10_Bandwidth"
2014-03-26 14:04:38 [46977] [0] *** process_perfdata.pl terminated on signal ALRM
2014-03-26 14:05:18 [49795] [0] *** TIMEOUT: Timeout after 30 secs. ***
2014-03-26 14:05:18 [49795] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-03-26 14:05:18 [49795] [0] *** TIMEOUT: Please check your npcd.cfg
2014-03-26 14:05:19 [49795] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1395857051.perfdata.service-PID-49795 deleted
2014-03-26 14:05:19 [49795] [0] *** Timeout while processing Host: "itswallcore1.advanstar.com" Service: "GigabitEthernet0_6_Bandwidth"
2014-03-26 14:05:19 [49795] [0] *** process_perfdata.pl terminated on signal ALRM
2014-03-26 14:09:57 [38398] [0] *** TIMEOUT: Timeout after 30 secs. ***
2014-03-26 14:09:57 [38398] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-03-26 14:09:57 [38398] [0] *** TIMEOUT: Please check your npcd.cfg
2014-03-26 14:09:57 [38398] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1395857351.perfdata.service-PID-38398 deleted
2014-03-26 14:09:57 [38398] [0] *** Timeout while processing Host: "3east5callctr.dul.advanstar.com" Service: "FastEthernet0_14_Bandwidth"
2014-03-26 14:09:57 [38398] [0] *** process_perfdata.pl terminated on signal ALRM
[root@clesitonag1 ~]# tail -25 /usr/local/nagios/var/npcd.log
[03-26-2014 17:06:45] NPCD: Found 4 files in /usr/local/nagios/var/spool/perfdata/
[03-26-2014 17:06:45] NPCD: DEBUG: load 2.330000/40.000000
[03-26-2014 17:06:45] NPCD: ThreadCounter 0/4 File is .
[03-26-2014 17:06:45] NPCD: DEBUG: load 2.330000/40.000000
[03-26-2014 17:06:45] NPCD: ThreadCounter 0/4 File is ..
[03-26-2014 17:06:45] NPCD: DEBUG: load 2.330000/40.000000
[03-26-2014 17:06:45] NPCD: ThreadCounter 0/4 File is 1395868001.perfdata.host
[03-26-2014 17:06:45] NPCD: Regular File: 1395868001.perfdata.host
[03-26-2014 17:06:45] NPCD: A thread was started on thread_counter = 0
[03-26-2014 17:06:45] NPCD: DEBUG: load 2.330000/40.000000
[03-26-2014 17:06:45] NPCD: ThreadCounter 1/4 File is 1395868001.perfdata.service
[03-26-2014 17:06:45] NPCD: Regular File: 1395868001.perfdata.service
[03-26-2014 17:06:45] NPCD: Processing file 1395868001.perfdata.host with ID 140737343776512 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1395868001.perfdata.host
[03-26-2014 17:06:45] NPCD: Processing file '1395868001.perfdata.host'
[03-26-2014 17:06:45] NPCD: A thread was started on thread_counter = 1
[03-26-2014 17:06:45] NPCD: Have to wait: Filecounter = 2 - thread_counter = 2
[03-26-2014 17:06:45] NPCD: Processing file 1395868001.perfdata.service with ID 140737333286656 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1395868001.perfdata.service
[03-26-2014 17:06:45] NPCD: Processing file '1395868001.perfdata.service'
[03-26-2014 17:06:46] NPCD: No more files to process... waiting for 10 seconds
[03-26-2014 17:06:56] NPCD: Found 2 files in /usr/local/nagios/var/spool/perfdata/
[03-26-2014 17:06:56] NPCD: DEBUG: load 4.490000/40.000000
[03-26-2014 17:06:56] NPCD: ThreadCounter 0/4 File is .
[03-26-2014 17:06:56] NPCD: DEBUG: load 4.490000/40.000000
[03-26-2014 17:06:56] NPCD: ThreadCounter 0/4 File is ..
[03-26-2014 17:06:56] NPCD: No more files to process... waiting for 10 seconds
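To see which checks hit the timeout most often, the perfdata.log entries above can be tallied per host and service. This is only a minimal sketch that assumes the log line format shown in this post; the function name timeouts_by_service is made up for illustration.

```shell
# Hypothetical helper: count "Timeout while processing" entries per host/service.
# Assumes the perfdata.log line format shown in the post above.
timeouts_by_service() {
    grep 'Timeout while processing' "$1" \
        | sed -E 's/.*Host: "([^"]*)" Service: "([^"]*)".*/\1 \2/' \
        | sort | uniq -c | sort -rn
}

# Example (log path taken from the thread):
# timeouts_by_service /usr/local/nagios/var/perfdata.log
```

If one host or service dominates the list, that check's RRD update is probably the slow one; an even spread points more toward general disk or load pressure.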

Re: rrd graphs showing zero data for given interval

Posted: Thu Mar 27, 2014 9:52 am
by abrist
You are still hitting the perfdata timeout. Increase the value from 30 to 60, then restart npcd. Have you considered using rrdcached?
http://assets.nagios.com/downloads/nagi ... ios_XI.pdf
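For anyone following along, the change could be scripted roughly as below. This is a sketch, not an official procedure: it assumes the setting is written as "TIMEOUT = <seconds>" in the process_perfdata.cfg file that is grepped later in this thread, and the function name set_perfdata_timeout is hypothetical.

```shell
# Hypothetical helper: set the TIMEOUT value in a PNP config file.
# Assumes a line of the form "TIMEOUT = 30"; keeps a .bak copy of the original.
set_perfdata_timeout() {
    cfg=$1
    secs=$2
    cp "$cfg" "$cfg.bak" || return 1
    # Rewrite from the backup so the original file keeps its owner and mode.
    sed -E "s/^TIMEOUT[[:space:]]*=.*/TIMEOUT = $secs/" "$cfg.bak" > "$cfg"
}

# Example (path as it appears later in this thread):
# set_perfdata_timeout /usr/local/nagios/etc/pnp/process_perfdata.cfg 60
# service npcd restart
```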

Re: rrd graphs showing zero data for given interval

Posted: Thu Mar 27, 2014 11:18 am
by jericho_g
I've increased it and will monitor. I think rrdcached might already be enabled. I see this in /tmp:

[root@clesitonag1 tmp]# ls -alrt|grep rrd
-rw-r--r-- 1 rrdcached rrdcached 0 Mar 27 10:21 rrd.journal.1395930094.537011
-rw-r--r-- 1 rrdcached rrdcached 0 Mar 27 11:21 rrd.journal.1395933694.537018
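Journal files owned by rrdcached suggest the daemon has been running with journaling, but they do not by themselves show that it is running right now or that PNP is actually using it. A small sketch for looking at both; the function name rrd_journal_sizes is made up, and the /tmp location is simply where the files above were found.

```shell
# Hypothetical helper: list rrdcached journal files in a directory with their sizes in bytes.
rrd_journal_sizes() {
    for f in "$1"/rrd.journal.*; do
        [ -e "$f" ] || continue        # glob matched nothing
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$(basename "$f")"
    done
}

# Example:
# rrd_journal_sizes /tmp
# pgrep -l rrdcached    # lists the daemon if it is currently running
```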

Re: rrd graphs showing zero data for given interval

Posted: Thu Mar 27, 2014 11:48 am
by abrist
Great. Keep us informed.

Re: rrd graphs showing zero data for given interval

Posted: Tue Apr 01, 2014 1:49 pm
by jericho_g
Graphs were working smoothly until a few hiccups today. The log output shows timeouts after 30 seconds, but my perfdata config is already set to 60. Any suggestions?

[root@clesitonag1 ~]# tail -25 /usr/local/nagios/var/perfdata.log
2014-03-27 06:04:52 [33208] [0] *** process_perfdata.pl terminated on signal ALRM
2014-03-27 07:04:49 [44156] [0] *** TIMEOUT: Timeout after 30 secs. ***
2014-03-27 07:04:49 [44156] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-03-27 07:04:49 [44156] [0] *** TIMEOUT: Please check your npcd.cfg
2014-03-27 07:04:49 [44156] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1395918251.perfdata.service-PID-44156 deleted
2014-03-27 07:04:49 [44156] [0] *** Timeout while processing Host: "isexallfap01.advanstar.com" Service: "windows_Virtual_Memory_Prd"
2014-03-27 07:04:49 [44156] [0] *** process_perfdata.pl terminated on signal ALRM
2014-03-27 08:04:56 [55417] [0] *** TIMEOUT: Timeout after 30 secs. ***
2014-03-27 08:04:56 [55417] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-03-27 08:04:56 [55417] [0] *** TIMEOUT: Please check your npcd.cfg
2014-03-27 08:04:56 [55417] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1395921851.perfdata.service-PID-55417 deleted
2014-03-27 08:04:56 [55417] [0] *** Timeout while processing Host: "cleswebvmw05.advanstar.com" Service: "CPU_Usage_for_VMHost"
2014-03-27 08:04:56 [55417] [0] *** process_perfdata.pl terminated on signal ALRM
2014-03-27 08:34:50 [60979] [0] *** TIMEOUT: Timeout after 30 secs. ***
2014-03-27 08:34:50 [60979] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-03-27 08:34:50 [60979] [0] *** TIMEOUT: Please check your npcd.cfg
2014-03-27 08:34:50 [60979] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1395923651.perfdata.service-PID-60979 deleted
2014-03-27 08:34:50 [60979] [0] *** Timeout while processing Host: "clesitovmw07" Service: "Memory_for_VMHost"
2014-03-27 08:34:50 [60979] [0] *** process_perfdata.pl terminated on signal ALRM
2014-03-27 12:04:50 [38431] [0] *** TIMEOUT: Timeout after 30 secs. ***
2014-03-27 12:04:50 [38431] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-03-27 12:04:50 [38431] [0] *** TIMEOUT: Please check your npcd.cfg
2014-03-27 12:04:50 [38431] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1395936251.perfdata.service-PID-38431 deleted
2014-03-27 12:04:50 [38431] [0] *** Timeout while processing Host: "cleswebvmw05.advanstar.com" Service: "Input___Output_for_VMHost"
2014-03-27 12:04:50 [38431] [0] *** process_perfdata.pl terminated on signal ALRM
[root@clesitonag1 ~]# tail -25 /usr/local/nagios/var/npcd.log
[04-01-2014 12:22:17] NPCD: Have to wait: Filecounter = 2 - thread_counter = 2
[04-01-2014 12:22:17] NPCD: Processing file 1396369322.perfdata.host with ID 140737343776512 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1396369322.perfdata.host
[04-01-2014 12:22:17] NPCD: Processing file '1396369322.perfdata.host'
[04-01-2014 12:22:17] NPCD: Processing file 1396369322.perfdata.service with ID 140737333286656 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1396369322.perfdata.service
[04-01-2014 12:22:17] NPCD: Processing file '1396369322.perfdata.service'
[04-01-2014 12:22:18] NPCD: No more files to process... waiting for 10 seconds
[04-01-2014 12:22:28] NPCD: Found 4 files in /usr/local/nagios/var/spool/perfdata/
[04-01-2014 12:22:28] NPCD: DEBUG: load 5.850000/40.000000
[04-01-2014 12:22:28] NPCD: ThreadCounter 0/4 File is .
[04-01-2014 12:22:28] NPCD: DEBUG: load 5.850000/40.000000
[04-01-2014 12:22:28] NPCD: ThreadCounter 0/4 File is ..
[04-01-2014 12:22:28] NPCD: DEBUG: load 5.850000/40.000000
[04-01-2014 12:22:28] NPCD: ThreadCounter 0/4 File is 1396369337.perfdata.host
[04-01-2014 12:22:28] NPCD: Regular File: 1396369337.perfdata.host
[04-01-2014 12:22:28] NPCD: A thread was started on thread_counter = 0
[04-01-2014 12:22:28] NPCD: DEBUG: load 5.850000/40.000000
[04-01-2014 12:22:28] NPCD: ThreadCounter 1/4 File is 1396369337.perfdata.service
[04-01-2014 12:22:28] NPCD: Processing file 1396369337.perfdata.host with ID 140737343776512 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1396369337.perfdata.host
[04-01-2014 12:22:28] NPCD: Regular File: 1396369337.perfdata.service
[04-01-2014 12:22:28] NPCD: Processing file '1396369337.perfdata.host'
[04-01-2014 12:22:28] NPCD: A thread was started on thread_counter = 1
[04-01-2014 12:22:28] NPCD: Have to wait: Filecounter = 2 - thread_counter = 2
[04-01-2014 12:22:28] NPCD: Processing file 1396369337.perfdata.service with ID 140737333286656 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1396369337.perfdata.service
[04-01-2014 12:22:28] NPCD: Processing file '1396369337.perfdata.service'
[04-01-2014 12:22:33] NPCD: No more files to process... waiting for 10 seconds
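Note that every perfdata.log entry in the tail output above is dated 2014-03-27, so it is worth checking when the last timeout was actually logged before concluding that the new value is being ignored. A minimal sketch, assuming the log format shown above; last_timeout is a made-up name.

```shell
# Hypothetical helper: print the timestamp and timeout value of the most recent timeout entry.
last_timeout() {
    grep 'TIMEOUT: Timeout after' "$1" | tail -n 1 \
        | sed -E 's/^([0-9-]+ [0-9:]+) .*Timeout after ([0-9]+) secs.*/\1 \2s/'
}

# Example (log path taken from the thread):
# last_timeout /usr/local/nagios/var/perfdata.log
```

If the newest entry predates the config change and restart, the old 30-second lines are just history rather than evidence of a current problem.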

Re: rrd graphs showing zero data for given interval

Posted: Tue Apr 01, 2014 3:40 pm
by abrist
Was npcd restarted after the change to 60?

service npcd restart

Re: rrd graphs showing zero data for given interval

Posted: Tue Apr 01, 2014 3:52 pm
by jericho_g
Yes. I went ahead and restarted it again, to be sure.

Re: rrd graphs showing zero data for given interval

Posted: Tue Apr 01, 2014 4:04 pm
by abrist
Just to be sure:

grep TIMEOUT /usr/local/nagios/etc/pnp/process_perfdata.cfg

Let's kill npcd completely, then restart npcd and nagios:

service npcd stop
killall npcd 
service nagios restart
service npcd start
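The point of the killall is to make sure no stale npcd process is left running with the old settings. A rough sketch of how that could be verified between the stop and start steps above, assuming pgrep is available; wait_until_gone is a hypothetical helper, not part of Nagios or PNP.

```shell
# Hypothetical helper: wait until no process with the given name remains.
# Returns 0 once none is found, 1 if some are still running after the limit.
wait_until_gone() {
    name=$1
    tries=${2:-10}
    while pgrep -x "$name" > /dev/null 2>&1; do
        tries=$((tries - 1))
        [ "$tries" -le 0 ] && return 1
        sleep 1
    done
    return 0
}

# Example, between the stop and start steps:
# service npcd stop; killall npcd
# wait_until_gone npcd 10 || echo "npcd still running"
```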

Re: rrd graphs showing zero data for given interval

Posted: Tue Apr 29, 2014 11:06 am
by jericho_g
Looks good now. No timeouts seen since 3/27. Thanks!

-Jericho

Re: rrd graphs showing zero data for given interval

Posted: Tue Apr 29, 2014 12:16 pm
by tmcdonald
Good to hear! I'll be closing this thread now, but feel free to open another if you need anything in the future!