Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 9:46 am
by kotterbein
I think at this point I may back out the changes that were sending perfdata to the ramdisk. They seem to be the underlying cause.

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 11:38 am
by abrist
Are we sure they are being reaped? 900000+ files...

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 3:00 pm
by kotterbein
Things are just getting stranger now. I realized that the mv command was incorrect and was sourcing a file that was not there, so upon changing it to:

process-host-perfdata-file-bulk

Code:

/bin/mv /ramdisk/spool/host-perfdata /ramdisk/spool/xidpe/$TIMET$.perfdata.host

and applying this configuration, files are being created in /ramdisk/spool/xidpe on a regular basis.

But now I see the following issues.

In npcd.log:

Code:

[03-05-2013 14:50:12] NPCD: ERROR: Executed command exits with return code '7'
[03-05-2013 14:50:12] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /ramdisk/spool/perfdata//host-perfdata.1362511776'
[03-05-2013 14:50:42] NPCD: ERROR: Executed command exits with return code '7'
[03-05-2013 14:50:42] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /ramdisk/spool/perfdata//host-perfdata.1362511776'
[03-05-2013 14:51:27] NPCD: ERROR: Executed command exits with return code '7'
[03-05-2013 14:51:27] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /ramdisk/spool/perfdata//host-perfdata.1362511776'
[03-05-2013 14:52:27] NPCD: ERROR: Executed command exits with return code '7'
[03-05-2013 14:52:27] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /ramdisk/spool/perfdata//host-perfdata.1362511776'
[03-05-2013 14:53:27] NPCD: ERROR: Executed command exits with return code '7'
[03-05-2013 14:53:27] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /ramdisk/spool/perfdata//host-perfdata.1362511776'

/ramdisk/spool/perfdata//host-perfdata.1362511776 does not exist.
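When the same command fails every minute like this, it can help to collapse the log into one line per failing command. The following is only a rough sketch: the two regular expressions are inferred from the npcd.log lines quoted above rather than from any NPCD documentation, and `summarize_failures` is a made-up helper name.

```python
import re
from collections import Counter

# Patterns inferred from the log excerpt in this thread (an assumption,
# not an official description of the npcd.log format).
RC_RE = re.compile(
    r"^\[[^\]]+\] NPCD: ERROR: Executed command exits with return code '(\d+)'"
)
CMD_RE = re.compile(r"^\[[^\]]+\] NPCD: ERROR: Command line was '(.*)'\s*$")


def summarize_failures(lines):
    """Count failures per (command line, return code) pair.

    A "return code" line is paired with the "Command line was" line that
    immediately follows it; any other line resets the pairing.
    """
    counts = Counter()
    pending_rc = None
    for line in lines:
        match = RC_RE.match(line)
        if match:
            pending_rc = int(match.group(1))
            continue
        match = CMD_RE.match(line)
        if match and pending_rc is not None:
            counts[(match.group(1), pending_rc)] += 1
        pending_rc = None
    return counts
```

Feeding it the lines of npcd.log (for example from `open(path)`) and printing `counts.most_common()` would show at a glance whether one stuck file accounts for all of the errors.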

Also, all of my service checks are updating, but Nagios lists them as pending when I drill down to the service check, and none of the nodes show their services listed under them.

I've tried a full shutdown and restart of the Nagios processes, but so far that has not brought things back to a stable state.

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 3:08 pm
by kotterbein
Trying to run the process from the command line:

Code:

-bash-3.2$ /usr/local/nagios/libexec/process_perfdata.pl -n -b /ramdisk/spool/perfdata//host-perfdata.1362511776
Use of uninitialized value in concatenation (.) or string at /usr/local/nagios/libexec/process_perfdata.pl line 203, <PDFILE> line 24917.
Use of uninitialized value in pattern match (m//) at /usr/local/nagios/libexec/process_perfdata.pl line 210, <PDFILE> line 24917.
Use of uninitialized value in concatenation (.) or string at /usr/local/nagios/libexec/process_perfdata.pl line 216, <PDFILE> line 24917.
Use of uninitialized value in concatenation (.) or string at /usr/local/nagios/libexec/process_perfdata.pl line 216, <PDFILE> line 24917.
Use of uninitialized value in substitution (s///) at /usr/local/nagios/libexec/process_perfdata.pl line 219, <PDFILE> line 24917.
Use of uninitialized value in substitution (s///) at /usr/local/nagios/libexec/process_perfdata.pl line 220, <PDFILE> line 24917.
Use of uninitialized value in substitution (s///) at /usr/local/nagios/libexec/process_perfdata.pl line 221, <PDFILE> line 24917.

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 3:27 pm
by abrist
kotterbein wrote: [03-05-2013 14:50:12] NPCD: ERROR: Executed command exits with return code '7'
[03-05-2013 14:50:12] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /ramdisk/spool/perfdata//host-perfdata.1362511776'

Check/fix the permissions for the perfdata folder:

Code:

chmod -R +x /usr/local/nagios/share/perfdata

Also, take a look at the /usr/local/nagios/var directory, and make sure everything is owned by nagios.nagios and that permissions are at least 0664 on all files.
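If eyeballing `ls -l` output across a large directory is tedious, the ownership and mode check can be scripted. This is a minimal sketch, not an official Nagios tool: `find_permission_problems` is a hypothetical helper, and "at least 0664" is interpreted here as "every bit in 0664 is set".

```python
import os
import stat


def find_permission_problems(root, uid, gid, min_mode=0o664):
    """Walk root and return (path, reason) tuples for regular files that are
    not owned by uid:gid or that lack any permission bit in min_mode."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished between listing and stat
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and named pipes
            if st.st_uid != uid or st.st_gid != gid:
                problems.append((path, "owner %d:%d" % (st.st_uid, st.st_gid)))
            mode = stat.S_IMODE(st.st_mode)
            if (mode & min_mode) != min_mode:
                problems.append((path, "mode %04o" % mode))
    return problems
```

To run it against /usr/local/nagios/var, the numeric IDs can be looked up with `pwd.getpwnam("nagios").pw_uid` and `grp.getgrnam("nagios").gr_gid`, assuming the user and group are both named nagios as above.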

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 3:44 pm
by kotterbein
Permissions all look fine. I upped the timeout on the process_perfdata.pl script. It looks like it chews on the file for a while and then times out:

Code:

10974 nagios    17   0  127m 5332 2428 R 96.2  0.0   3:05.18 process_perfdat

I upped the timeout to 1600 seconds to see whether it will get through it, but the file it's processing is rather large, so I assume it will take a while:

Code:

-rw-r--r-- 1 nagios nagios 235M Mar  5 15:43 host-perfdata.1362511776

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 4:41 pm
by abrist
Hmmm. That seems excessively large.

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 4:58 pm
by kotterbein
Looks good now. I rebooted, cleared the ramdisk of prior data, and am now seeing data. I think there was just too much for it to handle.

Re: Blank Performance Reports

Posted: Tue Mar 05, 2013 5:15 pm
by abrist
Sounds good. You should watch the number and size of files that collect there. They *should* get reaped fairly quickly.
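One way to keep an eye on that is a small script run from cron against the spool directory. The sketch below is only illustrative: `spool_usage` and `check_spool` are hypothetical helpers, and the default limits (1000 files, 50 MiB) are arbitrary placeholders rather than recommended values.

```python
import os


def spool_usage(spool_dir):
    """Return (file_count, total_bytes) for regular files directly in spool_dir."""
    count = 0
    total = 0
    with os.scandir(spool_dir) as entries:
        for entry in entries:
            try:
                if not entry.is_file(follow_symlinks=False):
                    continue
                size = entry.stat(follow_symlinks=False).st_size
            except OSError:
                continue  # file was reaped between listing and stat
            count += 1
            total += size
    return count, total


def check_spool(spool_dir, max_files=1000, max_bytes=50 * 1024 * 1024):
    """Return a list of warning strings; an empty list means within limits."""
    count, total = spool_usage(spool_dir)
    warnings = []
    if count > max_files:
        warnings.append("%d files in %s (limit %d)" % (count, spool_dir, max_files))
    if total > max_bytes:
        warnings.append("%d bytes in %s (limit %d)" % (total, spool_dir, max_bytes))
    return warnings
```

Calling `check_spool("/ramdisk/spool/xidpe")` and alerting on a non-empty result would flag a backlog well before a single file grows to hundreds of megabytes.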