This support forum board is for support questions relating to
Nagios XI, our flagship commercial network monitoring solution.
AWS
Posts: 63 Joined: Fri May 13, 2011 4:33 pm
Location: Vancouver, WA
by AWS » Tue Sep 10, 2013 12:56 pm
CentOS release 5.9 (Final) / Linux 2.6.18-348.16.1.el5 i686
32bit
2012R2.3
Manual Install of XI
Are you using a proxy: transparent
Are you using SSL: No
*****************************************************
I've walked through the forum and found multiple issues like mine, and I've applied the suggestions therein, but I'm still having trouble. My performance graphs aren't reporting data, apparently because the data isn't being collected. Here are some screenshots:
Performance Graphs
Host Graph
I've tried editing:
/usr/local/nagios/etc/pnp/npcd.cfg and changed
load_threshold = 10.0 to
load_threshold = 30.0
/usr/local/nagios/etc/pnp/process_perfdata.cfg and changed
TIMEOUT = 5 to
TIMEOUT = 20
I restarted the npcd service, waited an hour, and saw no change in the RRD graph data.
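A quick way to confirm that edits like these are the active values, as a minimal sketch: print the uncommented line for a given key. The helper name `show_setting` is hypothetical; the keys and file paths are the ones named in this post.

```shell
# show_setting KEY FILE: print the uncommented "KEY = value" lines in FILE.
# (Hypothetical helper; lines starting with "#" do not match.)
show_setting() {
    grep -E "^[[:space:]]*$1[[:space:]]*=" "$2"
}

# Example:
#   show_setting load_threshold /usr/local/nagios/etc/pnp/npcd.cfg
#   show_setting TIMEOUT /usr/local/nagios/etc/pnp/process_perfdata.cfg
```

If a key prints more than once, the file defines it twice and it is worth checking which line the daemon actually honors.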
I've looked at the /usr/local/nagios/share/perfdata/<host to monitor> directories and noticed that nothing has been written to any of them since May 9, 2013.
What else do I need to do?
Thank you.
AWS
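To pin down when data was last written under the perfdata tree mentioned above, something like the following may help. It is only a sketch: `newest_perfdata` is a made-up helper name, and it assumes GNU find (for `-printf`).

```shell
# newest_perfdata DIR [COUNT]: print the COUNT most recently modified files
# under DIR, newest first, as "YYYY-MM-DD HH:MM path".
# (Hypothetical helper; relies on GNU find's -printf.)
newest_perfdata() {
    dir=$1
    count=${2:-5}
    find "$dir" -type f -printf '%TY-%Tm-%Td %TH:%TM %p\n' | sort -r | head -n "$count"
}

# Example: newest_perfdata /usr/local/nagios/share/perfdata 10
```

The timestamps are formatted so that a plain reverse text sort puts the newest file first.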
sreinhardt
-fno-stack-protector
Posts: 4366 Joined: Mon Nov 19, 2012 12:10 pm
by sreinhardt » Tue Sep 10, 2013 1:51 pm
Let's make sure everything is running and that there are not major issues preventing perfdata processing.
Code: Select all
ps aux | grep npcd
service npcd status
ll /usr/local/nagios/share/perfdata/
ll /usr/local/nagios/var/spool/perfdata/ | wc -l
ll /usr/local/nagios/var/spool/perfdata/
ll -d /usr/local/nagios/var/spool/perfdata/
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
abrist
Red Shirt
Posts: 8334 Joined: Thu Nov 15, 2012 1:20 pm
by abrist » Tue Sep 10, 2013 1:51 pm
First, let's check the relevant logs:
Code: Select all
tail -25 /usr/local/nagios/var/perfdata.log
tail -25 /usr/local/nagios/var/npcd.log
Post the output in code wraps.
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
AWS
Posts: 63 Joined: Fri May 13, 2011 4:33 pm
Location: Vancouver, WA
by AWS » Tue Sep 10, 2013 1:57 pm
sreinhardt wrote: Let's make sure everything is running and that there are not major issues preventing perfdata processing.
Code: Select all
ps aux | grep npcd
service npcd status
ll /usr/local/nagios/share/perfdata/
ll /usr/local/nagios/var/spool/perfdata/ | wc -l
ll /usr/local/nagios/var/spool/perfdata/
ll -d /usr/local/nagios/var/spool/perfdata/
Code: Select all
[root@nagios ~]# ps aux | grep npcd
root 7165 0.0 0.0 4032 732 pts/3 S+ 11:52 0:00 grep npcd
nagios 8026 0.0 0.0 1924 696 ? S 09:47 0:00 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
Code: Select all
[root@nagios ~]# service npcd status
NPCD running (pid 8026).
Code: Select all
[root@nagios ~]# ll /usr/local/nagios/share/perfdata/
total 24
drwxrwxr-x 2 nagios nagios 4096 May 9 17:03 Aethena
drwxrwxr-x 2 nagios nagios 4096 May 9 17:04 Comprehensive
drwxrwxr-x 2 nagios nagios 4096 May 9 17:05 localhost
drwxrwxr-x 2 nagios nagios 4096 May 9 17:05 pfSense
drwxrwxr-x 2 nagios nagios 4096 May 9 17:05 STG
drwxrwxr-x 2 nagios nagios 4096 May 9 17:03 Synergy
Code: Select all
[root@nagios ~]# ll /usr/local/nagios/var/spool/perfdata/ | wc -l
2
Code: Select all
[root@nagios ~]# ll /usr/local/nagios/var/spool/perfdata/
total 4
-rw-rw-r-- 1 nagios nagios 520 Apr 17 2012 1334648935.perfdata.host-PID-5427
Code: Select all
[root@nagios ~]# ll -d /usr/local/nagios/var/spool/perfdata/
drwxr-xr-x 2 nagios nagios 2850816 May 9 17:05 /usr/local/nagios/var/spool/perfdata/
Thank you for helping.
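Note that the `wc -l` result of 2 above includes the `total` line printed by the long listing, so the spool holds a single file, as the next listing confirms. A sketch of a count that avoids that off-by-one follows; `spool_count` is a hypothetical name, and GNU find is assumed.

```shell
# spool_count DIR: number of regular files directly inside DIR.
# Prints one dot per file and counts the dots, so the "total" line of
# ls -l and any subdirectories are not included.
spool_count() {
    find "$1" -maxdepth 1 -type f -printf '.' | wc -c | tr -d ' '
}

# Example: spool_count /usr/local/nagios/var/spool/perfdata/
```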
AWS
Posts: 63 Joined: Fri May 13, 2011 4:33 pm
Location: Vancouver, WA
by AWS » Tue Sep 10, 2013 1:59 pm
abrist wrote: First, lets check the relevant logs:
Code: Select all
tail -25 /usr/local/nagios/var/perfdata.log
Post the output in code wraps.
Code: Select all
[root@nagios ~]# tail -25 /usr/local/nagios/var/perfdata.log
2012-06-02 06:46:23 [24532] [0] *** TIMEOUT: Timeout after 5 secs. ***
2012-06-02 06:46:23 [24532] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-06-02 06:46:23 [24532] [0] *** TIMEOUT: Please check your npcd.cfg
2012-06-02 06:46:23 [24532] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1338644339.perfdata.service-PID-24532 deleted
2012-06-02 06:46:23 [24532] [0] *** Timeout while processing Host: "SieTG" Service: "Server_Work_Queues"
2012-06-02 06:46:23 [24532] [0] *** process_perfdata.pl terminated on signal ALRM
2012-10-30 22:05:59 [3629] [0] *** TIMEOUT: Timeout after 5 secs. ***
2012-10-30 22:05:59 [3629] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-30 22:05:59 [3629] [0] *** TIMEOUT: Please check your npcd.cfg
2012-10-30 22:05:59 [3629] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1351659949.perfdata.host-PID-3629 deleted
2012-10-30 22:05:59 [3629] [0] *** Timeout while processing Host: "pfsense.sietg.local" Service: ""
2012-10-30 22:05:59 [3629] [0] *** process_perfdata.pl terminated on signal ALRM
Code: Select all
[root@nagios ~]# tail -25 /usr/local/nagios/var/npcd.log
[09-10-2013 11:58:14] NPCD: ThreadCounter 0/5 File is .
[09-10-2013 11:58:14] NPCD: DEBUG: load 1.390000/30.000000
[09-10-2013 11:58:14] NPCD: ThreadCounter 0/5 File is ..
[09-10-2013 11:58:14] NPCD: DEBUG: load 1.390000/30.000000
[09-10-2013 11:58:14] NPCD: ThreadCounter 0/5 File is 1334648935.perfdata.host-PID-5427
[09-10-2013 11:58:14] NPCD: File '1334648935.perfdata.host-PID-5427' is an already in process PNP file. Leaving it untouched.
[09-10-2013 11:58:14] NPCD: No more files to process... waiting for 15 seconds
[09-10-2013 11:58:29] NPCD: Found 3 files in /usr/local/nagios/var/spool/perfdata/
[09-10-2013 11:58:29] NPCD: DEBUG: load 1.280000/30.000000
[09-10-2013 11:58:29] NPCD: ThreadCounter 0/5 File is .
[09-10-2013 11:58:29] NPCD: DEBUG: load 1.280000/30.000000
[09-10-2013 11:58:29] NPCD: ThreadCounter 0/5 File is ..
[09-10-2013 11:58:29] NPCD: DEBUG: load 1.280000/30.000000
[09-10-2013 11:58:29] NPCD: ThreadCounter 0/5 File is 1334648935.perfdata.host-PID-5427
[09-10-2013 11:58:29] NPCD: File '1334648935.perfdata.host-PID-5427' is an already in process PNP file. Leaving it untouched.
[09-10-2013 11:58:29] NPCD: No more files to process... waiting for 15 seconds
[09-10-2013 11:58:45] NPCD: Found 3 files in /usr/local/nagios/var/spool/perfdata/
[09-10-2013 11:58:45] NPCD: DEBUG: load 2.330000/30.000000
[09-10-2013 11:58:45] NPCD: ThreadCounter 0/5 File is .
[09-10-2013 11:58:45] NPCD: DEBUG: load 2.330000/30.000000
[09-10-2013 11:58:45] NPCD: ThreadCounter 0/5 File is ..
[09-10-2013 11:58:45] NPCD: DEBUG: load 2.330000/30.000000
[09-10-2013 11:58:45] NPCD: ThreadCounter 0/5 File is 1334648935.perfdata.host-PID-5427
[09-10-2013 11:58:45] NPCD: File '1334648935.perfdata.host-PID-5427' is an already in process PNP file. Leaving it untouched.
[09-10-2013 11:58:45] NPCD: No more files to process... waiting for 15 seconds
Thank you, too, for your help.
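The spool file names in these logs begin with what looks like a Unix epoch timestamp. Assuming that reading is right, the age of a file can be decoded from its name alone. This is a sketch with a hypothetical helper name, and it needs GNU date for the `-d @SECONDS` form.

```shell
# spool_file_date NAME: print the UTC time encoded in the leading epoch
# seconds of a spool file name such as 1334648935.perfdata.host-PID-5427.
# (Assumes the part before the first dot is epoch seconds.)
spool_file_date() {
    name=${1##*/}        # strip any directory part
    epoch=${name%%.*}    # everything before the first dot
    date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S UTC'
}

# Example: spool_file_date 1334648935.perfdata.host-PID-5427
```

For the file shown above this gives a date in April 2012, which agrees with the modification date in the directory listing.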
abrist
Red Shirt
Posts: 8334 Joined: Thu Nov 15, 2012 1:20 pm
by abrist » Tue Sep 10, 2013 2:12 pm
Do the graph explorer graphs still work?
AWS
Posts: 63 Joined: Fri May 13, 2011 4:33 pm
Location: Vancouver, WA
by AWS » Tue Sep 10, 2013 2:23 pm
abrist wrote: Do the graph explorer graphs still work?
The graphs load without any errors shown on screen. The first three graphs, Top Alerts Last 24hrs, Host Health, and Service Health, all show data for the current date and time. The last two graphs show data only if I set the date filter to anything before 05/09/2013 (MM/DD/YYYY). Still, no errors are shown on screen.
Thx.
abrist
Red Shirt
Posts: 8334 Joined: Thu Nov 15, 2012 1:20 pm
by abrist » Tue Sep 10, 2013 2:32 pm
Let's increase the logging level, wait 15 minutes, and then recheck the logs:
Edit the file:
Code: Select all
/usr/local/nagios/etc/pnp/process_perfdata.cfg
Change:
To:
Edit:
Code: Select all
/usr/local/nagios/etc/pnp/npcd.cfg
Change:
To:
Restart npcd:
Wait 15 minutes, and then recheck the logs:
Code: Select all
tail -25 /usr/local/nagios/var/perfdata.log
tail -25 /usr/local/nagios/var/npcd.log
AWS
Posts: 63 Joined: Fri May 13, 2011 4:33 pm
Location: Vancouver, WA
by AWS » Tue Sep 10, 2013 2:41 pm
OK, I made the changes. I'll wait 15 minutes and then edit this post with the logs.
Here is the result of restarting the npcd service:
Code: Select all
[root@nagios ~]# service npcd restart
NPCD Stopped.
DEBUG: Config File = /usr/local/nagios/etc/pnp/npcd.cfg
CONFIG_OPT_LOGTYPE = file
CONFIG_OPT_LOGFILE = /usr/local/nagios/var/npcd.log
CONFIG_OPT_LOGFILESIZE = 10485760
CONFIG_OPT_LOGLEVEL = -1
CONFIG_OPT_SCANDIR = /usr/local/nagios/var/spool/perfdata/
CONFIG_OPT_RUNCMD = /usr/local/nagios/libexec/process_perfdata.pl
CONFIG_OPT_RUNCMD_ARG = -b
CONFIG_OPT_MAXTHREADS = 5
CONFIG_OPT_LOAD = 30.0
CONFIG_OPT_USER = nagios
CONFIG_OPT_GROUP = nagios
CONFIG_OPT_PIDFILE = /usr/local/nagiosxi/var/subsys/npcd.pid
CONFIG_OPT_SLEEPTIME = 15
CONFIG_OPT_IDENTMYSELF = (null)
---------------------------
DEBUG: load_threshold is enabled - ('30.000000')
NPCD started.
UPDATE:
Logs:
Code: Select all
[root@nagios ~]# tail -25 /usr/local/nagios/var/perfdata.log
2012-06-02 06:46:23 [24532] [0] *** TIMEOUT: Timeout after 5 secs. ***
2012-06-02 06:46:23 [24532] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-06-02 06:46:23 [24532] [0] *** TIMEOUT: Please check your npcd.cfg
2012-06-02 06:46:23 [24532] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1338644339.perfdata.service-PID-24532 deleted
2012-06-02 06:46:23 [24532] [0] *** Timeout while processing Host: "SieTG" Service: "Server_Work_Queues"
2012-06-02 06:46:23 [24532] [0] *** process_perfdata.pl terminated on signal ALRM
2012-10-30 22:05:59 [3629] [0] *** TIMEOUT: Timeout after 5 secs. ***
2012-10-30 22:05:59 [3629] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-30 22:05:59 [3629] [0] *** TIMEOUT: Please check your npcd.cfg
2012-10-30 22:05:59 [3629] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1351659949.perfdata.host-PID-3629 deleted
2012-10-30 22:05:59 [3629] [0] *** Timeout while processing Host: "pfsense.sietg.local" Service: ""
2012-10-30 22:05:59 [3629] [0] *** process_perfdata.pl terminated on signal ALRM
Code: Select all
[root@nagios ~]# tail -25 /usr/local/nagios/var/npcd.log
[09-10-2013 13:02:48] NPCD: ThreadCounter 0/5 File is .
[09-10-2013 13:02:48] NPCD: DEBUG: load 2.370000/30.000000
[09-10-2013 13:02:48] NPCD: ThreadCounter 0/5 File is ..
[09-10-2013 13:02:48] NPCD: DEBUG: load 2.370000/30.000000
[09-10-2013 13:02:48] NPCD: ThreadCounter 0/5 File is 1334648935.perfdata.host-PID-5427
[09-10-2013 13:02:48] NPCD: File '1334648935.perfdata.host-PID-5427' is an already in process PNP file. Leaving it untouched.
[09-10-2013 13:02:48] NPCD: No more files to process... waiting for 15 seconds
[09-10-2013 13:03:03] NPCD: Found 3 files in /usr/local/nagios/var/spool/perfdata/
[09-10-2013 13:03:03] NPCD: DEBUG: load 2.050000/30.000000
[09-10-2013 13:03:03] NPCD: ThreadCounter 0/5 File is .
[09-10-2013 13:03:03] NPCD: DEBUG: load 2.050000/30.000000
[09-10-2013 13:03:03] NPCD: ThreadCounter 0/5 File is ..
[09-10-2013 13:03:03] NPCD: DEBUG: load 2.050000/30.000000
[09-10-2013 13:03:03] NPCD: ThreadCounter 0/5 File is 1334648935.perfdata.host-PID-5427
[09-10-2013 13:03:03] NPCD: File '1334648935.perfdata.host-PID-5427' is an already in process PNP file. Leaving it untouched.
[09-10-2013 13:03:03] NPCD: No more files to process... waiting for 15 seconds
[09-10-2013 13:03:19] NPCD: Found 3 files in /usr/local/nagios/var/spool/perfdata/
[09-10-2013 13:03:19] NPCD: DEBUG: load 2.780000/30.000000
[09-10-2013 13:03:19] NPCD: ThreadCounter 0/5 File is .
[09-10-2013 13:03:19] NPCD: DEBUG: load 2.780000/30.000000
[09-10-2013 13:03:19] NPCD: ThreadCounter 0/5 File is ..
[09-10-2013 13:03:19] NPCD: DEBUG: load 2.780000/30.000000
[09-10-2013 13:03:19] NPCD: ThreadCounter 0/5 File is 1334648935.perfdata.host-PID-5427
[09-10-2013 13:03:19] NPCD: File '1334648935.perfdata.host-PID-5427' is an already in process PNP file. Leaving it untouched.
[09-10-2013 13:03:19] NPCD: No more files to process... waiting for 15 seconds
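NPCD keeps skipping the same file as "already in process" even after the restart. In the perfdata.log entries the number after `-PID-` matches the logged process ID, so it seems reasonable to assume the suffix names the worker that claimed the file. Under that assumption, a sketch like this lists claimed files whose process is gone. The name `stale_inprocess` is hypothetical, the check is Linux-only (it looks in /proc), and it only lists files without touching them.

```shell
# stale_inprocess DIR: list spool files whose "-PID-<n>" suffix names a
# process that no longer exists. Read-only; nothing is moved or deleted.
stale_inprocess() {
    for f in "$1"/*-PID-*; do
        [ -e "$f" ] || continue          # glob matched nothing
        pid=${f##*-PID-}                 # text after the last "-PID-"
        case $pid in
            ''|*[!0-9]*) continue ;;     # skip names without a numeric PID
        esac
        [ -e "/proc/$pid" ] || printf '%s\n' "$f"
    done
}

# Example: stale_inprocess /usr/local/nagios/var/spool/perfdata
```

A PID can be reused by an unrelated process, so a file that is not listed is not guaranteed to be live; combine this with the file's age before drawing conclusions.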
Last edited by AWS on Tue Sep 10, 2013 3:04 pm, edited 1 time in total.
abrist
Red Shirt
Posts: 8334 Joined: Thu Nov 15, 2012 1:20 pm
by abrist » Tue Sep 10, 2013 3:03 pm
Great, let us know!