Re: Blank Performance Reports
Posted: Mon Mar 04, 2013 1:10 pm
by kotterbein
Nothing is being written to /usr/local/nagios/var/perfdata.log.
/usr/local/nagios/var/npcd.log:
Code: Select all
[03-04-2013 11:46:11] NPCD: npcd Daemon (0.4.14) started with PID=3016
[03-04-2013 11:46:11] NPCD: Please have a look at 'npcd -V' to get license information
[03-04-2013 11:46:11] NPCD: HINT: load_threshold is enabled - ('30.000000')
Re: Blank Performance Reports
Posted: Mon Mar 04, 2013 1:48 pm
by abrist
What is your disk usage at?
Re: Blank Performance Reports
Posted: Mon Mar 04, 2013 2:58 pm
by kotterbein
Code: Select all
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      246G   69G  165G  30% /
/dev/cciss/c0d0p1      99M   13M   82M  14% /boot
tmpfs                  12G     0   12G   0% /dev/shm
tmpfs                 200M  182M   19M  91% /ramdisk
ic-nfs01.inf.ise.com:/exports/home
                      497G  396G   76G  85% /home
[root@ spool]# df -i
Filesystem              Inodes   IUsed     IFree IUse% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      66519040  265982  66253058    1% /
/dev/cciss/c0d0p1        26104      34     26070    1% /boot
tmpfs                  3085286       1   3085285    1% /dev/shm
tmpfs                  3085286      12   3085274    1% /ramdisk
nfs.com:/exports/home
                      66076672  429780  65646892    1% /home
Re: Blank Performance Reports
Posted: Mon Mar 04, 2013 3:40 pm
by abrist
Any errors in:
Code: Select all
/usr/local/nagios/var/perfdata.log
/usr/local/nagios/var/npcd.log
/usr/local/nagios/var/nagios.log
Re: Blank Performance Reports
Posted: Mon Mar 04, 2013 3:50 pm
by kotterbein
Nothing new is being written to /usr/local/nagios/var/perfdata.log (note the dates on the entries below):
Code: Select all
2013-02-11 11:23:11 [17152] [0] *** Timeout while processing Host: "pc-lvc05" Service: "ISE.MDCX.LVC2_ODS.03p_Msg_per_Sec"
2013-02-11 11:23:11 [17152] [0] *** process_perfdata.pl terminated on signal ALRM
2013-02-11 11:24:20 [19037] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-02-11 11:24:20 [19037] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-02-11 11:24:20 [19037] [0] *** TIMEOUT: Please check your npcd.cfg
2013-02-11 11:24:20 [19037] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//service-perfdata.1360365780-PID-19037 deleted
2013-02-11 11:24:20 [19037] [0] *** Timeout while processing Host: "pc-mgw17" Service: "CPU_Util"
2013-02-11 11:24:20 [19037] [0] *** process_perfdata.pl terminated on signal ALRM
npcd.log:
Code: Select all
[03-04-2013 11:46:11] NPCD: npcd Daemon (0.4.14) started with PID=3016
[03-04-2013 11:46:11] NPCD: Please have a look at 'npcd -V' to get license information
[03-04-2013 11:46:11] NPCD: HINT: load_threshold is enabled - ('30.000000')
nagios.log:
Code: Select all
[1362429990] Error: Unable to update status data file '/ramdisk/status.dat': No space left on device
[1362430000] Error: my_fcopy() failed to write to '/ramdisk/status.dat': No space left on device
[1362430000] Error: Unable to rename file '/usr/local/nagios/var/nagios.tmpy6ZuJQ' to '/ramdisk/status.dat': No space left on device
[1362430000] Error: Unable to update status data file '/ramdisk/status.dat': No space left on device
I expanded the /ramdisk tmpfs.
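For anyone following along, here is a minimal sketch of checking how full a mount such as /ramdisk is, so the "No space left on device" errors above can be caught earlier. The function names and the 90% threshold are illustrative assumptions, not part of Nagios or of this thread, and the percentage may differ slightly from what `df` prints because `df` rounds and accounts for reserved blocks differently.

```python
import shutil


def percent_used(used, total):
    """Return used space as a percentage of total space."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def mount_usage(path):
    """Return the percentage of space used on the filesystem holding path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return percent_used(usage.used, usage.total)


def nearly_full(path, threshold=90.0):
    """True if the filesystem holding path is at or above threshold percent.

    The 90% default is an arbitrary illustrative value.
    """
    return mount_usage(path) >= threshold
```

With the figures from the `df` output earlier in the thread (182M used of 200M), `percent_used(182, 200)` gives 91.0, which matches the 91% shown for /ramdisk. Calling `nearly_full("/ramdisk")` on the affected host would be the equivalent live check.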
Re: Blank Performance Reports
Posted: Mon Mar 04, 2013 3:57 pm
by abrist
Well, these files are clean. But it looks like the checkresults are not getting reaped from the ramdisk, at least not fast enough.
Do you notice any of the results in the ramdisk disappearing? Or do they constantly eat up more space? (They should be removed after they are reaped.)
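One way to answer that question without eyeballing repeated `ls` runs is to sample the directory a few times and compare the totals. This is only a sketch: the helper names are made up for illustration, and the directory to watch (for example a checkresults directory under the ramdisk) is an assumption you would need to adjust to your own layout.

```python
import os
import time


def snapshot(directory):
    """Return (file_count, total_bytes) for regular files directly in directory."""
    count = 0
    total = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                if entry.is_file(follow_symlinks=False):
                    total += entry.stat(follow_symlinks=False).st_size
                    count += 1
            except FileNotFoundError:
                # The file was reaped between listing and stat; expected here.
                continue
    return count, total


def steadily_growing(samples):
    """True if total bytes strictly increase across every consecutive sample."""
    if len(samples) < 2:
        return False
    return all(later[1] > earlier[1] for earlier, later in zip(samples, samples[1:]))


def watch(directory, rounds=5, interval=2.0):
    """Take `rounds` snapshots of directory, sleeping `interval` seconds between them."""
    samples = []
    for _ in range(rounds):
        samples.append(snapshot(directory))
        time.sleep(interval)
    return samples
```

If `steadily_growing(watch(path))` comes back true over a reasonably long window, results are piling up faster than they are reaped. If the counts rise and then drop back toward zero, reaping is working.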
Re: Blank Performance Reports
Posted: Mon Mar 04, 2013 4:29 pm
by kotterbein
they are getting reaped:
Code: Select all
[root@ checkresults]# ls -al
total 8
drwxrwxr-x 2 nagios nagios 120 Mar 4 16:28 .
drwxr-xr-x 5 nagios nagios 140 Mar 4 12:46 ..
-rwxrwx--- 1 apache nagcmd 261 Mar 4 16:28 coEpBkj
-rw-r--r-- 1 apache apache 0 Mar 4 16:28 coEpBkj.ok
-rwxrwx--- 1 apache nagcmd 376 Mar 4 16:28 cTOVWIb
-rw-r--r-- 1 apache apache 0 Mar 4 16:28 cTOVWIb.ok
[root@ checkresults]# ls -al
total 0
drwxrwxr-x 2 nagios nagios 40 Mar 4 16:28 .
drwxr-xr-x 5 nagios nagios 140 Mar 4 12:46 ..
Re: Blank Performance Reports
Posted: Tue Mar 05, 2013 8:56 am
by scottwilkerson
kotterbein wrote: I have gone back through and repeated the steps from the document cited above... I did have some outstanding configuration issues. One thing I did notice, however, is that it has us change the following commands:
Code: Select all
command_name process-host-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/host-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.host
command_name process-service-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/service-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.service
This is the problem: these commands use /var/nagiosramdisk, but you created /var/ramdisk. If you are going to change to /var/ramdisk, you have to do it EVERYWHERE in the document.
kotterbein wrote:
There are two other commands that I was curious about; do they need to be changed as well?
Code: Select all
command_name process-host-perfdata-file-pnp-bulk
command_line /bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/perfdata/host-perfdata.$TIMET$
command_name process-service-perfdata-file-pnp-bulk
command_line /bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/perfdata/service-perfdata.$TIMET$
No, these aren't being used.
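A quick way to confirm that a ramdisk path was changed everywhere is to scan the configuration text for the old prefix. The sketch below is an illustration only: `find_stale_paths` is a made-up helper, and which config files to feed it is left to you, since the thread does not name them.

```python
def find_stale_paths(config_text, stale_prefix):
    """Return (line_number, stripped_line) for each line that mentions stale_prefix."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        if stale_prefix in line:
            hits.append((lineno, line.strip()))
    return hits


# Example input copied from the commands quoted above.
sample = """\
command_name process-host-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/host-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.host
command_name process-service-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/service-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.service
"""

for lineno, line in find_stale_paths(sample, "/var/nagiosramdisk"):
    print(f"line {lineno}: {line}")
```

On the sample above this reports lines 2 and 4. An empty result for your real config files would mean no references to the old location remain.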
Re: Blank Performance Reports
Posted: Tue Mar 05, 2013 9:21 am
by kotterbein
scottwilkerson wrote: kotterbein wrote: I have gone back through and repeated the steps from the document cited above... I did have some outstanding configuration issues. One thing I did notice, however, is that it has us change the following commands:
Code: Select all
command_name process-host-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/host-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.host
command_name process-service-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/service-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.service
This is the problem: these commands use /var/nagiosramdisk, but you created /var/ramdisk. If you are going to change to /var/ramdisk, you have to do it EVERYWHERE in the document.
Sorry Scott, I pulled these lines from the document, not my config. I have them set properly in my configuration.
Re: Blank Performance Reports
Posted: Tue Mar 05, 2013 9:29 am
by kotterbein
Overnight, the tmpfs (1GB) filled. The files look like they are being reaped, but there is an exceptional amount of data:
Code: Select all
total 971948
drwxr-xr-x 5 nagios nagios 140 Mar 4 12:46 .
drwxrwxrwt 4 nagios nagios 120 Mar 5 09:28 ..
drwxrwxr-x 2 nagios nagios 40 Mar 5 09:28 checkresults
-rw-r--r-- 1 nagios nagios 156223855 Mar 5 09:28 host-perfdata
drwxrwxr-x 2 nagios nagios 60 Mar 5 09:25 perfdata
-rw-r--r-- 1 nagios nagios 837087424 Mar 5 09:28 service-perfdata
drwxrwxr-x 2 nagios nagios 40 Mar 5 09:25 xidpe
I wonder if it is just the amount of data coming in that is a problem.
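To put the listing above in perspective, here is a small sketch that converts byte counts to readable units and lists the largest files in a directory. Both helper names are illustrative assumptions, and the directory you point `largest_files` at is up to you.

```python
import os


def human_size(num_bytes):
    """Format a byte count using binary units (1 KiB = 1024 bytes)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    size = float(num_bytes)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024


def largest_files(directory, top=5):
    """Return up to `top` (size_in_bytes, name) pairs, largest first."""
    sizes = []
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                if entry.is_file(follow_symlinks=False):
                    sizes.append((entry.stat(follow_symlinks=False).st_size, entry.name))
            except FileNotFoundError:
                # File disappeared between listing and stat; skip it.
                continue
    sizes.sort(reverse=True)
    return sizes[:top]
```

Using the sizes from the listing, `human_size(837087424)` gives "798.3 MiB" for service-perfdata and `human_size(156223855)` gives "149.0 MiB" for host-perfdata. Together that is about 947.3 MiB, so these two files alone account for nearly all of the 1GB tmpfs.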