Re: Blank Performance Reports

Posted: Mon Mar 04, 2013 1:10 pm
by kotterbein
nothing is being written to /usr/local/nagios/var/perfdata.log...

/usr/local/nagios/var/npcd.log:

Code:

[03-04-2013 11:46:11] NPCD: npcd Daemon (0.4.14) started with PID=3016
[03-04-2013 11:46:11] NPCD: Please have a look at 'npcd -V' to get license information
[03-04-2013 11:46:11] NPCD: HINT: load_threshold is enabled - ('30.000000')

Re: Blank Performance Reports

Posted: Mon Mar 04, 2013 1:48 pm
by abrist
What is your disk usage at?

Code:

df -h
df -i
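
If the output is long, a small filter can pick out the full filesystems. This is only a rough sketch: `over_threshold` is a made-up helper name, it assumes `df -P` style output (one line per filesystem, percentage in column 5, mount point in column 6), and it will misread mount points that contain spaces.

```shell
# Print mount points whose usage is at or above a threshold (percent).
# Reads df -P style output on stdin, so it works for blocks and inodes alike.
over_threshold() {
    limit="$1"
    awk -v limit="$limit" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)              # strip the trailing percent sign
        if (pct + 0 >= limit) print $6 " " $5
    }'
}
```

It would be used as `df -P | over_threshold 90` for block usage and `df -Pi | over_threshold 90` for inode usage.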

Re: Blank Performance Reports

Posted: Mon Mar 04, 2013 2:58 pm
by kotterbein

Code:

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      246G   69G  165G  30% /
/dev/cciss/c0d0p1      99M   13M   82M  14% /boot
tmpfs                  12G     0   12G   0% /dev/shm
tmpfs                 200M  182M   19M  91% /ramdisk
ic-nfs01.inf.ise.com:/exports/home
                      497G  396G   76G  85% /home
[root@ spool]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/mapper/VolGroup00-LogVol00
                     66519040  265982 66253058    1% /
/dev/cciss/c0d0p1      26104      34   26070    1% /boot
tmpfs                3085286       1 3085285    1% /dev/shm
tmpfs                3085286      12 3085274    1% /ramdisk
nfs.com:/exports/home
                     66076672  429780 65646892    1% /home

Re: Blank Performance Reports

Posted: Mon Mar 04, 2013 3:40 pm
by abrist
Any errors in:

Code:

/usr/local/nagios/var/perfdata.log
/usr/local/nagios/var/npcd.log
/usr/local/nagios/var/nagios.log

Re: Blank Performance Reports

Posted: Mon Mar 04, 2013 3:50 pm
by kotterbein
Nothing is being written to /usr/local/nagios/var/perfdata.log:

Code:

2013-02-11 11:23:11 [17152] [0] *** Timeout while processing Host: "pc-lvc05" Service: "ISE.MDCX.LVC2_ODS.03p_Msg_per_Sec"
2013-02-11 11:23:11 [17152] [0] *** process_perfdata.pl terminated on signal ALRM
2013-02-11 11:24:20 [19037] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-02-11 11:24:20 [19037] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-02-11 11:24:20 [19037] [0] *** TIMEOUT: Please check your npcd.cfg
2013-02-11 11:24:20 [19037] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//service-perfdata.1360365780-PID-19037 deleted
2013-02-11 11:24:20 [19037] [0] *** Timeout while processing Host: "pc-mgw17" Service: "CPU_Util"
2013-02-11 11:24:20 [19037] [0] *** process_perfdata.pl terminated on signal ALRM
npcd.log:

Code:

[03-04-2013 11:46:11] NPCD: npcd Daemon (0.4.14) started with PID=3016
[03-04-2013 11:46:11] NPCD: Please have a look at 'npcd -V' to get license information
[03-04-2013 11:46:11] NPCD: HINT: load_threshold is enabled - ('30.000000')
nagios.log:

Code:

[1362429990] Error: Unable to update status data file '/ramdisk/status.dat': No space left on device
[1362430000] Error: my_fcopy() failed to write to '/ramdisk/status.dat': No space left on device
[1362430000] Error: Unable to rename file '/usr/local/nagios/var/nagios.tmpy6ZuJQ' to '/ramdisk/status.dat': No space left on device
[1362430000] Error: Unable to update status data file '/ramdisk/status.dat': No space left on device
I expanded the /ramdisk tmpfs.

Re: Blank Performance Reports

Posted: Mon Mar 04, 2013 3:57 pm
by abrist
Well, these files are clean. But it looks like the checkresults are not getting reaped from the ramdisk, at least not fast enough.

Do you notice any of the results in the ramdisk disappearing? Or do they constantly eat up more space? (They should be removed after they are reaped.)
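
One way to check this without watching `ls` by hand is to count the result files that have been sitting there for a while. A minimal sketch, assuming a `find` that supports `-maxdepth` and `-mmin` (GNU and BSD both do); `stale_results` is a hypothetical helper name, not something shipped with Nagios:

```shell
# Count regular files in a directory that were last modified more than
# the given number of minutes ago. A count that stays above zero across
# several runs suggests results are not being reaped.
stale_results() {
    dir="$1"
    minutes="$2"
    find "$dir" -maxdepth 1 -type f -mmin +"$minutes" | wc -l | tr -d ' '
}
```

For example, `stale_results /path/to/checkresults 5`, where the path is whatever your check result directory is set to.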

Re: Blank Performance Reports

Posted: Mon Mar 04, 2013 4:29 pm
by kotterbein
they are getting reaped:

Code:

[root@ checkresults]# ls -al
total 8
drwxrwxr-x 2 nagios nagios 120 Mar  4 16:28 .
drwxr-xr-x 5 nagios nagios 140 Mar  4 12:46 ..
-rwxrwx--- 1 apache nagcmd 261 Mar  4 16:28 coEpBkj
-rw-r--r-- 1 apache apache   0 Mar  4 16:28 coEpBkj.ok
-rwxrwx--- 1 apache nagcmd 376 Mar  4 16:28 cTOVWIb
-rw-r--r-- 1 apache apache   0 Mar  4 16:28 cTOVWIb.ok
[root@ checkresults]# ls -al
total 0
drwxrwxr-x 2 nagios nagios  40 Mar  4 16:28 .
drwxr-xr-x 5 nagios nagios 140 Mar  4 12:46 ..

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 8:56 am
by scottwilkerson
kotterbein wrote: I have gone back through and repeated the steps from the document cited above... I did have some outstanding configuration issues. One thing I did notice, however, is that it has us change the following commands:

Code:

command_name process-host-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/host-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.host
command_name process-service-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/service-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.service
This is the problem: these commands are using /var/nagiosramdisk and you created /var/ramdisk. If you are going to change to /var/ramdisk, you have to do it EVERYWHERE in the document.
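
A recursive grep is a quick way to confirm nothing still points at the old path. This is just a sketch: `find_old_path` is a made-up wrapper, and the config directory you pass it is an assumption you should adjust to your install.

```shell
# List every config line that still mentions the old ramdisk path.
# Prints file:line:text for each hit and exits non-zero if there are none.
find_old_path() {
    old_path="$1"
    config_dir="$2"
    grep -rnF -- "$old_path" "$config_dir"
}
```

For example, `find_old_path /var/nagiosramdisk /usr/local/nagios/etc` (assuming your configs live under /usr/local/nagios/etc). No output means every reference has been changed.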

kotterbein wrote: There are two other commands I was curious about, and whether they needed to be augmented as well:

Code:

command_name process-host-perfdata-file-pnp-bulk
command_line /bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/perfdata/host-perfdata.$TIMET$
command_name process-service-perfdata-file-pnp-bulk
command_line /bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/perfdata/service-perfdata.$TIMET$

No, these aren't being used.

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 9:21 am
by kotterbein
scottwilkerson wrote:
kotterbein wrote: I have gone back through and repeated the steps from the document cited above... I did have some outstanding configuration issues. One thing I did notice, however, is that it has us change the following commands:

Code:

command_name process-host-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/host-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.host
command_name process-service-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/service-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.service
This is the problem: these commands are using /var/nagiosramdisk and you created /var/ramdisk. If you are going to change to /var/ramdisk, you have to do it EVERYWHERE in the document.
Sorry Scott, I pulled these lines from the document, not from my config. I have them set properly in my configuration.

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 9:29 am
by kotterbein
Overnight, the tmpfs (1GB) filled up. The files look like they are being reaped, but there is an exceptional amount of data:

Code:

total 971948
drwxr-xr-x 5 nagios nagios       140 Mar  4 12:46 .
drwxrwxrwt 4 nagios nagios       120 Mar  5 09:28 ..
drwxrwxr-x 2 nagios nagios        40 Mar  5 09:28 checkresults
-rw-r--r-- 1 nagios nagios 156223855 Mar  5 09:28 host-perfdata
drwxrwxr-x 2 nagios nagios        60 Mar  5 09:25 perfdata
-rw-r--r-- 1 nagios nagios 837087424 Mar  5 09:28 service-perfdata
drwxrwxr-x 2 nagios nagios        40 Mar  5 09:25 xidpe

I wonder if it is just the amount of data coming in that is a problem.
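
To put a number on that, something like the following could sample how fast one of the perfdata files grows. It is a rough sketch only: `growth_bytes` is a hypothetical helper, and the result will be misleading if the file gets moved or truncated between the two samples.

```shell
# Report how many bytes a file grows over a sampling interval (seconds).
growth_bytes() {
    file="$1"
    interval="$2"
    before=$(wc -c < "$file")
    sleep "$interval"
    after=$(wc -c < "$file")
    echo $((after - before))
}
```

For example, `growth_bytes service-perfdata 60` run from the ramdisk directory would give the bytes written per minute.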