Performance graph is not working with check_nwc_health
Posted: Thu Jun 04, 2015 12:51 am
by phyo
Hi,
We are using Nagios XI 2014R1.0 on 64-bit RHEL 6.6. We monitor the CPU usage of an ASA 5585 with the check_nwc_health plugin. Previously we could see the performance graph of the CPU usage, but the customer has just reported that the graph is no longer working. I have attached the check_nwc_health plugin and the performance graph of the CPU usage.
Please take a look and let me know if you find a way to get the performance graph back.
check_nwc_health plugin :
https://db.tt/9xnuI0lP

Re: Performance graph is not working with check_nwc_health
Posted: Thu Jun 04, 2015 2:02 am
by Box293
Do you mean that the graph is squashed or that the usage just seems to be 0 ?
Re: Performance graph is not working with check_nwc_health
Posted: Thu Jun 04, 2015 2:19 am
by phyo
Box293 wrote:Do you mean that the graph is squashed or that the usage just seems to be 0 ?
The squashed part is the old record from 2014; the last data in the performance graph is from July 29, 2014. As you can see in the picture, the current status shows cpu_processor_0_2 is 1, but there is no performance graph after July 29, 2014.
It is very strange. I also added another test service for CPU using the same plugin, but the new test service doesn't have a performance graph either.
Re: Performance graph is not working with check_nwc_health
Posted: Thu Jun 04, 2015 10:24 am
by abrist
Can you increase the debug/log levels for npcd and perfdata, get a large tail of the logs, and then post them here in code wraps?
Increase debug/log levels:
https://support.nagios.com/wiki/index.p ... h_Problems
Then get a large tail and post them here:
Code: Select all
tail -25 /usr/local/nagios/var/npcd.log
tail -25 /usr/local/nagios/var/perfdata.log
Re: Performance graph is not working with check_nwc_health
Posted: Thu Jun 04, 2015 8:52 pm
by phyo
Code: Select all
tail -25 /usr/local/nagios/var/npcd.log
Code: Select all
[root@nagios01 ~]# tail -25 /usr/local/nagios/var/npcd.log
[06-05-2015 09:46:19] NPCD: Have to wait: Filecounter = 3 - thread_counter = 2
[06-05-2015 09:46:19] NPCD: Processing file '1433468771.perfdata.service'
[06-05-2015 09:46:21] NPCD: No more files to process... waiting for 15 seconds
[06-05-2015 09:46:36] NPCD: Found 5 files in /usr/local/nagios/var/spool/perfdata/
[06-05-2015 09:46:36] NPCD: DEBUG: load 0.890000/10.000000
[06-05-2015 09:46:36] NPCD: ThreadCounter 0/5 File is .
[06-05-2015 09:46:36] NPCD: DEBUG: load 0.890000/10.000000
[06-05-2015 09:46:36] NPCD: ThreadCounter 0/5 File is ..
[06-05-2015 09:46:36] NPCD: DEBUG: load 0.890000/10.000000
[06-05-2015 09:46:36] NPCD: ThreadCounter 0/5 File is 1423458317.perfdata.service-PID-40255
[06-05-2015 09:46:36] NPCD: File '1423458317.perfdata.service-PID-40255' is an already in process PNP file. Leaving it untouched.
[06-05-2015 09:46:36] NPCD: DEBUG: load 0.890000/10.000000
[06-05-2015 09:46:36] NPCD: ThreadCounter 0/5 File is 1433468786.perfdata.host
[06-05-2015 09:46:36] NPCD: Regular File: 1433468786.perfdata.host
[06-05-2015 09:46:36] NPCD: A thread was started on thread_counter = 0
[06-05-2015 09:46:36] NPCD: DEBUG: load 0.890000/10.000000
[06-05-2015 09:46:36] NPCD: Processing file 1433468786.perfdata.host with ID 139971462743808 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1433468786.perfdata.host
[06-05-2015 09:46:36] NPCD: ThreadCounter 1/5 File is 1433468786.perfdata.service
[06-05-2015 09:46:36] NPCD: Processing file '1433468786.perfdata.host'
[06-05-2015 09:46:36] NPCD: Regular File: 1433468786.perfdata.service
[06-05-2015 09:46:36] NPCD: A thread was started on thread_counter = 1
[06-05-2015 09:46:36] NPCD: Processing file 1433468786.perfdata.service with ID 139971452253952 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1433468786.perfdata.service
[06-05-2015 09:46:36] NPCD: Have to wait: Filecounter = 3 - thread_counter = 2
[06-05-2015 09:46:36] NPCD: Processing file '1433468786.perfdata.service'
[06-05-2015 09:46:38] NPCD: No more files to process... waiting for 15 seconds
[root@nagios01 ~]#
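One detail in the npcd.log output above is the spool file 1423458317.perfdata.service-PID-40255, which NPCD skips as "already in process". If the leading number is a Unix timestamp, as the other file names suggest, that file dates from February 2015. Whether it has anything to do with the missing graph is not established in this thread. Below is a minimal Python sketch for spotting such old files by name; the function name, the one-hour threshold, and the "now" value are illustrative assumptions.

```python
import re
from datetime import datetime, timezone

# File name shape taken from the npcd.log excerpt:
# <epoch>.perfdata.<host|service>, optionally followed by -PID-<n>
SPOOL_NAME = re.compile(r"^(\d+)\.perfdata\.(host|service)(?:-PID-(\d+))?$")


def describe_spool_file(name, now_epoch, max_age_seconds=3600):
    """Return (created_utc, age_seconds, is_stale), or None if the name does not match.

    max_age_seconds is an arbitrary illustrative threshold, not an NPCD setting.
    """
    m = SPOOL_NAME.match(name)
    if m is None:
        return None
    created = int(m.group(1))
    age = now_epoch - created
    created_utc = datetime.fromtimestamp(created, tz=timezone.utc)
    return created_utc, age, age > max_age_seconds


if __name__ == "__main__":
    # Hypothetical "now", a few seconds after the newest file in the log
    now = 1433468796
    for name in (
        "1423458317.perfdata.service-PID-40255",
        "1433468786.perfdata.service",
        ".",
    ):
        print(name, describe_spool_file(name, now))
```

The names would normally come from a listing of /usr/local/nagios/var/spool/perfdata/ rather than a hard-coded tuple.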
Code: Select all
tail -25 /usr/local/nagios/var/perfdata.log
Code: Select all
[root@nagios01 ~]# tail -25 /usr/local/nagios/var/perfdata.log
2015-06-05 09:46:53 [22163] [2] Template is check_xi_host_ping.php
2015-06-05 09:46:53 [22163] [2] No Custom Template found for check_xi_host_ping (/usr/local/nagios/etc/pnp/check_commands/check_xi_host_ping.cfg)
2015-06-05 09:46:53 [22163] [2] Template is check_xi_host_ping.php
2015-06-05 09:46:53 [22163] [2] No Custom Template found for check_xi_host_ping (/usr/local/nagios/etc/pnp/check_commands/check_xi_host_ping.cfg)
2015-06-05 09:46:53 [22163] [2] Template is check_xi_host_ping.php
2015-06-05 09:46:53 [22163] [2] data2rrd called
2015-06-05 09:46:53 [22163] [2] RRDs::update /usr/local/nagios/share/perfdata/ProStream_9K-Live2/_HOST_.rrd 1433468801:0.205:0:0.324:0.165
2015-06-05 09:46:53 [22163] [2] /usr/local/nagios/share/perfdata/ProStream_9K-Live2/_HOST_.rrd updated
2015-06-05 09:46:53 [22163] [2] Processing Line 23
2015-06-05 09:46:53 [22163] [2] Datatype set to 'HOSTPERFDATA'
2015-06-05 09:46:53 [22163] [1] Found Performance Data for Nexus3K-02 / _HOST_ (rta=0.374ms;3000.000;5000.000;0; pl=0%;80;100;; rtmax=0.459ms;;;; rtmin=0.306ms;;;;)
2015-06-05 09:46:53 [22163] [2] No Custom Template found for check_xi_host_ping (/usr/local/nagios/etc/pnp/check_commands/check_xi_host_ping.cfg)
2015-06-05 09:46:53 [22163] [2] Template is check_xi_host_ping.php
2015-06-05 09:46:53 [22163] [2] No Custom Template found for check_xi_host_ping (/usr/local/nagios/etc/pnp/check_commands/check_xi_host_ping.cfg)
2015-06-05 09:46:53 [22163] [2] Template is check_xi_host_ping.php
2015-06-05 09:46:53 [22163] [2] No Custom Template found for check_xi_host_ping (/usr/local/nagios/etc/pnp/check_commands/check_xi_host_ping.cfg)
2015-06-05 09:46:53 [22163] [2] Template is check_xi_host_ping.php
2015-06-05 09:46:53 [22163] [2] No Custom Template found for check_xi_host_ping (/usr/local/nagios/etc/pnp/check_commands/check_xi_host_ping.cfg)
2015-06-05 09:46:53 [22163] [2] Template is check_xi_host_ping.php
2015-06-05 09:46:53 [22163] [2] data2rrd called
2015-06-05 09:46:53 [22163] [2] RRDs::update /usr/local/nagios/share/perfdata/Nexus3K-02/_HOST_.rrd 1433468801:0.374:0:0.459:0.306
2015-06-05 09:46:53 [22163] [2] /usr/local/nagios/share/perfdata/Nexus3K-02/_HOST_.rrd updated
2015-06-05 09:46:53 [22163] [1] 23 lines processed
2015-06-05 09:46:53 [22163] [1] /usr/local/nagios/var/spool/perfdata//1433468801.perfdata.host-PID-22163 deleted
2015-06-05 09:46:53 [22163] [1] PNP exiting (runtime 0.037512s) ...
[root@nagios01 ~]#
Re: Performance graph is not working with check_nwc_health
Posted: Fri Jun 05, 2015 1:48 am
by Box293
Perhaps try using 250 instead of 25; it should give us some more logs.
tail -250
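With a longer tail, it may be easier to filter the log for the affected host than to read it by eye. Below is a rough Python sketch, assuming the "Found Performance Data for ..." line format shown in the perfdata.log excerpt earlier in the thread. The host name ASA5585 is a placeholder, since the real Nagios host name is not given here.

```python
import re

# Line shape taken from the perfdata.log excerpt, e.g.:
# 2015-06-05 09:46:53 [22163] [1] Found Performance Data for Nexus3K-02 / _HOST_ (rta=...)
FOUND = re.compile(
    r"Found Performance Data for (?P<host>.+?) / (?P<service>.+?) \((?P<perfdata>.*)\)\s*$"
)


def perfdata_entries(lines, host=None):
    """Yield (host, service, perfdata) for each matching log line.

    If host is given, only entries for that host are yielded.
    """
    for line in lines:
        m = FOUND.search(line)
        if m and (host is None or m.group("host") == host):
            yield m.group("host"), m.group("service"), m.group("perfdata")


if __name__ == "__main__":
    sample = [
        "2015-06-05 09:46:53 [22163] [2] data2rrd called",
        "2015-06-05 09:46:53 [22163] [1] Found Performance Data for Nexus3K-02 / _HOST_ "
        "(rta=0.374ms;3000.000;5000.000;0; pl=0%;80;100;; rtmax=0.459ms;;;; rtmin=0.306ms;;;;)",
    ]
    # "ASA5585" is a hypothetical host name; replace it with the real one
    for entry in perfdata_entries(sample, host="ASA5585"):
        print(entry)
    for entry in perfdata_entries(sample):
        print(entry)
```

The function accepts any iterable of lines, so an open file object for /usr/local/nagios/var/perfdata.log can be passed in place of the sample list. If no entries appear for the CPU service at all, that would suggest its perfdata is not reaching process_perfdata.pl, though that is only one possible reading.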
Re: Performance graph is not working with check_nwc_health
Posted: Fri Jun 05, 2015 3:06 am
by phyo
Box293 wrote:Perhaps try using 250 instead of 25; it should give us some more logs.
tail -250
I have attached the file.
Re: Performance graph is not working with check_nwc_health
Posted: Fri Jun 05, 2015 9:02 am
by lmiltchev
Please, show us the problem service's config and the actual command run from the command line along with the output of it.
Also run the following commands and show us the output:
Code: Select all
rrdtool --version
cat /usr/local/nagios/etc/pnp/pnp4nagios_release
ll /usr/local/nagios
Re: Performance graph is not working with check_nwc_health
Posted: Sun Jun 07, 2015 8:47 pm
by phyo
lmiltchev wrote:Please, show us the problem service's config and the actual command run from the command line along with the output of it.
Code: Select all
[root@nagios01 libexec]# ./check_nwc_health --hostname 10.110.50.17 --protocol 2c --community nagios --mode cpu-load --warning 70% --critical 80%
OK - cpu Chassis usage (5 min avg.) is 0.00%, cpu Chassis usage (5 min avg.) is 0.00%, cpu Processor 0/0 usage (5 min avg.) is 0.00%, cpu Processor 0/1 usage (5 min avg.) is 0.00%, cpu Processor 0/2 usage (5 min avg.) is 0.00%, cpu Processor 0/3 usage (5 min avg.) is 0.00%, cpu Processor 0/4 usage (5 min avg.) is 0.00%, cpu Processor 0/5 usage (5 min avg.) is 1.00%, cpu Processor 0/6 usage (5 min avg.) is 0.00%, cpu Processor 0/7 usage (5 min avg.) is 0.00% | 'cpu_Chassis_usage'=0%;70%;80% 'cpu_Chassis_usage'=0%;70%;80% 'cpu_Processor 0/0_usage'=0%;70%;80% 'cpu_Processor 0/1_usage'=0%;70%;80% 'cpu_Processor 0/2_usage'=0%;70%;80% 'cpu_Processor 0/3_usage'=0%;70%;80% 'cpu_Processor 0/4_usage'=0%;70%;80% 'cpu_Processor 0/5_usage'=1%;70%;80% 'cpu_Processor 0/6_usage'=0%;70%;80% 'cpu_Processor 0/7_usage'=0%;70%;80%
[root@nagios01 libexec]#
lmiltchev wrote:Also run the following commands and show us the output:
Code: Select all
rrdtool --version
cat /usr/local/nagios/etc/pnp/pnp4nagios_release
ll /usr/local/nagios
Code: Select all
[root@nagios01 libexec]# rrdtool --version
RRDtool 1.3.8 Copyright 1997-2009 by Tobias Oetiker <tobi@oetiker.ch>
Compiled Feb 20 2014 11:59:48
Usage: rrdtool [options] command command_options
Valid commands: create, update, updatev, graph, graphv, dump, restore,
last, lastupdate, first, info, fetch, tune,
resize, xport
RRDtool is distributed under the Terms of the GNU General
Public License Version 2. (www.gnu.org/copyleft/gpl.html)
For more information read the RRD manpages
[root@nagios01 libexec]#
Code: Select all
[root@nagios01 libexec]# cat /usr/local/nagios/etc/pnp/pnp4nagios_release
PKG_REL_DATE="05-02-2009"
PKG_VERSION="0.4.14"
PKG_NAME="pnp"
[root@nagios01 libexec]#
Code: Select all
[root@nagios01 libexec]# ll /usr/local/nagios/
total 76
drwxrwxr-x. 2 nagios nagios 4096 May 21 2014 bin
drwsrwsr-x. 7 apache nagios 4096 May 16 21:08 etc
drwxr-xr-x 2 root root 4096 May 21 2014 include
drwxrwxr-x. 3 nagios nagios 12288 Jun 5 14:14 libexec
drwxrwxr-x. 2 nagios nagios 4096 May 21 2014 sbin
drwxrwxr-x. 14 nagios nagios 4096 May 21 2014 share
drwxr-xr-x 2 nagios root 36864 Dec 5 2014 tmp
drwxrwxr-x. 6 nagios nagios 4096 Jun 8 09:42 var
[root@nagios01 libexec]#
Re: Performance graph is not working with check_nwc_health
Posted: Mon Jun 08, 2015 9:43 am
by tgriep
It looks like the check is returning strange data for all of the cpu load checks. Try running the following with verbose turned on and post the results back here.
Code: Select all
./check_nwc_health --hostname 10.110.50.17 --protocol 2c --community nagios --mode cpu-load --warning 70% --critical 80% -vvv
./check_nwc_health --version
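For reference while waiting for the verbose output: two things stand out in the perfdata portion of the plugin output posted above. The label 'cpu_Chassis_usage' appears twice, and the warn/crit fields carry a percent sign (70%, 80%), whereas the Nagios plugin development guidelines describe those fields as ranges without a unit. Whether either of these is what stops PNP from graphing here is an assumption, not something confirmed in this thread. Below is a small Python sketch that flags both; the function name and regular expressions are illustrative and not part of check_nwc_health or PNP.

```python
import re
from collections import Counter

# One perfdata item: 'label with spaces'=value;warn;crit;min;max or label=value;...
ITEM = re.compile(r"('(?:[^']|'')+'|[^\s=']+)=(\S+)")
# Threshold range syntax per the plugin guidelines: optional @, start, optional :end
RANGE = re.compile(r"^@?(?:~|-?\d+(?:\.\d+)?)?(?::(?:-?\d+(?:\.\d+)?)?)?$")


def inspect_perfdata(perfdata):
    """Return a list of human-readable findings for a perfdata string."""
    findings = []
    labels = []
    for m in ITEM.finditer(perfdata):
        label = m.group(1).strip("'")
        fields = m.group(2).split(";")
        labels.append(label)
        # fields[1] is warn and fields[2] is crit, when present
        for name, field in zip(("warn", "crit"), fields[1:3]):
            if not RANGE.match(field):
                findings.append(f"{label}: {name} threshold {field!r} is not a plain range")
    for label, count in Counter(labels).items():
        if count > 1:
            findings.append(f"{label}: label appears {count} times")
    return findings


if __name__ == "__main__":
    # Shortened copy of the perfdata from the plugin output in this thread
    sample = (
        "'cpu_Chassis_usage'=0%;70%;80% 'cpu_Chassis_usage'=0%;70%;80% "
        "'cpu_Processor 0/0_usage'=0%;70%;80%"
    )
    for finding in inspect_perfdata(sample):
        print(finding)
```

If the percent signs turn out to matter, running the check with --warning 70 --critical 80 would be one thing to try, again as a guess rather than a confirmed fix.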