not showing any graphs after week 10
Posted: Mon Mar 17, 2014 5:07 pm
by srikanth.kallu
For some reason I don't see any graphs after week 10, and I'm not sure what changed after that.
Can I please get help?
Re: not showing any graphs after week 10
Posted: Tue Mar 18, 2014 9:38 am
by abrist
Let's check your performance data logs:
Code: Select all
tail -15 /usr/local/nagios/var/perfdata.log
tail -15 /usr/local/nagios/var/npcd.log
Re: not showing any graphs after week 10
Posted: Tue Mar 18, 2014 11:28 am
by srikanth.kallu
I just noticed that my /usr is full. Is this the reason?
Re: not showing any graphs after week 10
Posted: Tue Mar 18, 2014 2:30 pm
by lmiltchev
Can you run the following commands, as abrist asked, and show the output in code wraps?
Code: Select all
tail -15 /usr/local/nagios/var/perfdata.log
tail -15 /usr/local/nagios/var/npcd.log
Also, run:
Code: Select all
df -h
df -i
du -a /usr | sort -n -r | head -n 10
and show the output.
Re: not showing any graphs after week 10
Posted: Tue Mar 18, 2014 2:38 pm
by srikanth.kallu
[root@nagiosxi ~]# tail -15 /usr/local/nagios/var/perfdata.log
2013-12-17 10:53:43 [23420] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1387299165.perfdata.service-PID-23420 deleted
2013-12-17 10:53:43 [23420] [0] *** Timeout while processing Host: "stedidevdb" Service: "Total_Processes"
2013-12-17 10:53:43 [23420] [0] *** process_perfdata.pl terminated on signal ALRM
2013-12-17 10:53:43 [23419] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-12-17 10:53:43 [23419] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-12-17 10:53:43 [23419] [0] *** TIMEOUT: Please check your npcd.cfg
2013-12-17 10:53:43 [23419] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1387299165.perfdata.host-PID-23419 deleted
2013-12-17 10:53:44 [23419] [0] *** Timeout while processing Host: "uv01" Service: "_HOST_"
2013-12-17 10:53:44 [23419] [0] *** process_perfdata.pl terminated on signal ALRM
2014-02-01 13:40:25 [13638] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-02-01 13:40:25 [13638] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-02-01 13:40:25 [13638] [0] *** TIMEOUT: Please check your npcd.cfg
2014-02-01 13:40:25 [13638] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1391283606.perfdata.service-PID-13638 deleted
2014-02-01 13:40:25 [13638] [0] *** Timeout while processing Host: "b3sl03" Service: "_var_Disk_Usage"
2014-02-01 13:40:25 [13638] [0] *** process_perfdata.pl terminated on signal ALRM
[root@nagiosxi ~]# tail -15 /usr/local/nagios/var/npcd.log
[12-17-2013 10:52:55] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1387299135.perfdata.service'
[12-17-2013 10:53:43] NPCD: ERROR: Executed command exits with return code '7'
[12-17-2013 10:53:43] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1387299165.perfdata.service'
[12-17-2013 10:53:44] NPCD: ERROR: Executed command exits with return code '7'
[12-17-2013 10:53:44] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1387299165.perfdata.host'
[12-17-2013 10:53:59] NPCD: WARN: MAX load reached: load 17.280000/10.000000 at i=0[12-17-2013 10:54:14] NPCD: WARN: MAX load reached: load 14.540000/10.000000 at i=1[12-17-2013 10:54:29] NPCD: WARN: MAX load reached: load 11.320000/10.000000 at i=1[01-27-2014 20:29:59] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-27-2014 20:30:37] NPCD: npcd Daemon (0.4.14) started with PID=5895
[01-27-2014 20:30:37] NPCD: Please have a look at 'npcd -V' to get license information
[01-27-2014 20:30:37] NPCD: HINT: load_threshold is enabled - ('10.000000')
[02-01-2014 13:40:25] NPCD: ERROR: Executed command exits with return code '7'
[02-01-2014 13:40:25] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1391283606.perfdata.service'
[02-01-2014 13:41:11] NPCD: WARN: MAX load reached: load 10.290000/10.000000 at i=0[02-01-2014 13:41:26] NPCD: WARN: MAX load reached: load 12.180000/10.000000 at i=1[02-01-2014 13:41:41] NPCD: WARN: MAX load reached: load 13.410000/10.000000 at i=1[02-01-2014 13:41:56] NPCD: WARN: MAX load reached: load 14.360000/10.000000 at i=1[02-01-2014 13:42:11] NPCD: WARN: MAX load reached: load 11.770000/10.000000 at i=1[03-07-2014 09:53:01] NPCD: Caught Termination Signal - Hasta la vista... baby
[03-18-2014 11:43:56] NPCD: npcd Daemon (0.4.14) started with PID=1651
[03-18-2014 11:43:56] NPCD: Please have a look at 'npcd -V' to get license information
[03-18-2014 11:43:56] NPCD: HINT: load_threshold is enabled - ('10.000000')
[root@nagiosxi ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_nagiosxi-lv_root 4.0G 1.4G 2.4G 37% /
tmpfs 939M 0 939M 0% /dev/shm
/dev/mapper/vg_nagiosxi-lv_apps 5.0G 3.8G 964M 80% /apps
/dev/sda1 485M 94M 366M 21% /boot
/dev/mapper/vg_nagiosxi-lv_home 4.0G 137M 3.7G 4% /home
/dev/mapper/vg_nagiosxi-lv_tmp 2.0G 157M 1.8G 9% /tmp
/dev/mapper/vg_nagiosxi-lv_usr 4.0G 3.0G 842M 79% /usr
/dev/mapper/vg_nagiosxi-lv_var 6.0G 1.7G 4.0G 30% /var
[root@nagiosxi ~]# df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/vg_nagiosxi-lv_root 262144 12531 249613 5% /
tmpfs 240304 1 240303 1% /dev/shm
/dev/mapper/vg_nagiosxi-lv_apps 327680 173 327507 1% /apps
/dev/sda1 128016 53 127963 1% /boot
/dev/mapper/vg_nagiosxi-lv_home 262144 23 262121 1% /home
/dev/mapper/vg_nagiosxi-lv_tmp 131072 22825 108247 18% /tmp
/dev/mapper/vg_nagiosxi-lv_usr 262144 76849 185295 30% /usr
/dev/mapper/vg_nagiosxi-lv_var 393216 7525 385691 2% /var
[root@nagiosxi ~]# du -a /usr | sort -n -r | head -n 10
du: cannot access `/usr/local/nagios/share/perfdata/uvdev/_usr_opt_unishared_Disk_Usage.xml.16665': No such file or directory
2917756 /usr
1138608 /usr/local
949500 /usr/local/nagios
828504 /usr/local/nagios/share
818772 /usr/local/nagios/share/perfdata
699272 /usr/share
571096 /usr/lib64
210484 /usr/lib
176716 /usr/bin
175108 /usr/lib64/valgrind
Re: not showing any graphs after week 10
Posted: Tue Mar 18, 2014 3:59 pm
by sreinhardt
Looks like we have a couple of things going on here. You are hitting the npcd timeout and the npcd load limit, and you are getting pretty close on free drive space. We can fix the first two pretty easily; the last one you will have to look into.
npcd timeout:
Code: Select all
edit process_perfdata.cfg and look for
TIMEOUT = 5
Increase to at least 15-20
npcd max load:
Code: Select all
edit npcd.cfg and look for
load_threshold = 10.0
Increase this to at least 15-20
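If you would rather script both changes than edit by hand, something along these lines should work. This is only a rough sketch: it assumes both files live in /usr/local/nagios/etc/pnp (verify that on your system), that each setting sits on its own "KEY = value" line, and that you have GNU sed for the -i option. The set_cfg_value helper is made up for this example, and 20 is just one value from the suggested range.

```shell
#!/bin/sh
# set_cfg_value FILE KEY VALUE
# Hypothetical helper: rewrites an uncommented "KEY = value" line in place
# and keeps the previous version of the file as FILE.bak.
set_cfg_value() {
    file=$1 key=$2 value=$3
    cp -p "$file" "$file.bak" || return 1
    sed -i "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file"
}

# Assumed config directory; override PNP_ETC if your install differs.
PNP_ETC=${PNP_ETC:-/usr/local/nagios/etc/pnp}

# Only touch the files if they are actually there.
if [ -f "$PNP_ETC/process_perfdata.cfg" ]; then
    set_cfg_value "$PNP_ETC/process_perfdata.cfg" TIMEOUT 20
fi
if [ -f "$PNP_ETC/npcd.cfg" ]; then
    set_cfg_value "$PNP_ETC/npcd.cfg" load_threshold 20.0
fi
```

If anything looks wrong afterwards, the .bak copies let you put the old settings back.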
Restart the npcd service, then check whether this resolves your current issues with processing performance data: look in /usr/local/nagios/var/perfdata.log for new "*** TIMEOUT: Timeout after 5 secs. ***" entries and in /usr/local/nagios/var/npcd.log for new "WARN: MAX load reached" entries.
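Since both logs already contain old entries, it helps to count only what shows up after the restart. Here is a minimal sketch of one way to do that: note the line count of each log before restarting, then count matches past that point. The count_new_matches helper is hypothetical, and it matches on the "Timeout after" prefix so the result does not depend on the seconds value in the message.

```shell
#!/bin/sh
# count_new_matches FILE PATTERN START_LINE
# Hypothetical helper: prints how many times the literal PATTERN occurs in FILE
# after line START_LINE. grep -o emits one line per occurrence, so several
# warnings written on the same log line are each counted.
count_new_matches() {
    tail -n "+$(( $3 + 1 ))" "$1" | grep -o -F -- "$2" | wc -l
}

# Example usage (run by hand around the restart):
#   perf_start=$(wc -l < /usr/local/nagios/var/perfdata.log)
#   npcd_start=$(wc -l < /usr/local/nagios/var/npcd.log)
#   ... restart npcd, wait a few check intervals ...
#   count_new_matches /usr/local/nagios/var/perfdata.log '*** TIMEOUT: Timeout after' "$perf_start"
#   count_new_matches /usr/local/nagios/var/npcd.log 'WARN: MAX load reached' "$npcd_start"
```

Both counts staying at zero after a while is a good sign that the new settings are enough.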
You might still have issues with the process_perfdata command, likely because of the extra / between the path and the file name. Look at perfdata_spool_dir in npcd.cfg and verify that it has only one / at the end.
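A quick way to see what perfdata_spool_dir is currently set to, and how many slashes it ends with, could look like the sketch below. It assumes npcd.cfg is at /usr/local/nagios/etc/pnp/npcd.cfg (check your install) and that the setting uses the "key = value" form; both helper names are invented for this example.

```shell
#!/bin/sh
# spool_dir_value FILE
# Hypothetical helper: prints the value of the last uncommented
# perfdata_spool_dir line in an npcd.cfg-style file.
spool_dir_value() {
    sed -n 's|^[[:space:]]*perfdata_spool_dir[[:space:]]*=[[:space:]]*||p' "$1" \
        | sed 's|[[:space:]]*$||' | tail -n 1
}

# trailing_slashes PATH
# Hypothetical helper: prints how many '/' characters PATH ends with.
trailing_slashes() {
    path=$1
    n=0
    while [ "${path%/}" != "$path" ]; do
        path=${path%/}
        n=$((n + 1))
    done
    echo "$n"
}

# Assumed location of npcd.cfg; override NPCD_CFG if yours is elsewhere.
NPCD_CFG=${NPCD_CFG:-/usr/local/nagios/etc/pnp/npcd.cfg}
if [ -f "$NPCD_CFG" ]; then
    dir=$(spool_dir_value "$NPCD_CFG")
    echo "perfdata_spool_dir = $dir ($(trailing_slashes "$dir") trailing slash(es))"
fi
```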