Not getting any perfdata
After upgrading our rarely used test system to 2014R1.2, I noticed that no graphs were being updated. Basic troubleshooting (looking at /usr/local/nagios/var/perfdata.log) showed that I actually haven't been getting any new data since Apr 28th. That was confirmed by the contents of /usr/local/nagios/share/perfdata, which were also last updated on 4/28. I only log on to this system to try upgrades, validate new plugins, and so on, so I have no idea what could have happened to it months ago or why no perfdata is being collected. The last records in perfdata.log indicated TIMEOUTS, so I followed the guide, changed /usr/local/nagios/etc/pnp/process_perfdata.cfg to TIMEOUT = 20, and changed the log level in npcd.cfg and process_perfdata.cfg to debug.
I've been looking for basic perfdata troubleshooting, but it seems to be a rabbit hole. This was originally the vSphere OVF Template (64-bit). It is currently on CentOS release 6.5 (Final); I did the upgrade to 2014R1.2 yesterday.
Any suggestions are greatly appreciated.
Re: Not getting any perfdata
Can you enable debug output as per the FAQ, wait a bit, and then post any errors from perfdata.log and npcd.log:
http://support.nagios.com/wiki/index.ph ... leshooting
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Not getting any perfdata
Increase Performance Data Logging Verbosity.
[root@ust-nagios1t var]# cat /usr/local/nagios/etc/pnp/process_perfdata.cfg |grep LOG_LEVEL
LOG_LEVEL = 2
The file /usr/local/nagios/var/perfdata.log did not exist. I restarted the server and it still was not there, so I manually touched the file, changed its permissions, and restarted again. No data is coming into it.
[root@ust-nagios1t var]# ls -l /usr/local/nagios/var/perfdata.log
-rw-r--r-- 1 nagios nagios 0 Jul 7 12:19 /usr/local/nagios/var/perfdata.log
Increase NPCD Logging Verbosity.
[root@ust-nagios1t var]# cat /usr/local/nagios/etc/pnp/npcd.cfg |grep log_level
# log_level - how much should we log?
# log_level = <integer value>
log_level = -1
(see attached npcd logs)
Perfdata Timeout section: there is no log and nothing is going into it, so there is nothing to attach.
Changed the timeout value from 20 to 30 for the fun of it and restarted; still no logs.
[root@ust-nagios1t var]# cat /usr/local/nagios/etc/pnp/process_perfdata.cfg |grep TIMEOUT
TIMEOUT = 30
NPCD Load Threshold: checked this and raised it as well.
[root@ust-nagios1t var]# cat /usr/local/nagios/etc/pnp/npcd.cfg |grep load_thre
# use_load_threshold - enables/disables load watching
# use_load_threshold = <0 / 1> (default: 0)
#use_load_threshold = 0
# load_threshold - npcd won't start new threads
# load_threshold = <float value> (default: 10.0)
load_threshold = 20.0
Restarted.
Ran the file permission reset for the fun of it:
/usr/local/nagiosxi/scripts/reset_config_perms
Restarted again.
Run the command manually:
[root@ust-nagios1t libexec]# ./check_rrdtraf -f '/var/lib/mrtg/140.209.1.1_151126071.rrd' -w 1 -c 2
OK - Current BW in: 0bps Out: 0bps|in=0b/s;1;2 out=0b/s;1;2
[root@ust-nagios1t libexec]# ls -l /var/lib/mrtg/140.209.1.1_151126071.rrd
-rw-r--r-- 1 root root 105312 Jul 7 12:40 /var/lib/mrtg/140.209.1.1_151126071.rrd
These files seem to be updating and are accurate to some extent.
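A side note on the grep checks in this post: `cat file | grep KEY` also prints the commented-out documentation lines, which makes it easy to misread which value is actually active. Below is a minimal sketch of a helper that prints only the active value. The function name `get_setting` is made up for illustration, and it assumes plain `KEY = value` lines like the ones shown in the output above.

```shell
#!/bin/sh
# Print the active (uncommented) value of KEY from a "KEY = value" config file.
# Hypothetical helper, not part of Nagios XI or PNP.
# Usage: get_setting FILE KEY
get_setting() {
    # The pattern is anchored at the start of the line, so lines beginning
    # with "#" never match, and "load_threshold" does not match
    # "use_load_threshold". If the key appears twice, the last one is printed.
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}
```

For example, `get_setting /usr/local/nagios/etc/pnp/npcd.cfg log_level` should print just the uncommented value, without the two comment lines that grep returns.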
Re: Not getting any perfdata
I'm also noticing that
[root@ust-nagios1t var]# ls -l /usr/local/nagios/share/perfdata/
doesn't have any data since Apr 28th. No files have changed. On production they work, so I'm assuming I broke something back then, though I'm not sure what would have happened. Still, it's another clue.
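One way to confirm whether anything is still being written, rather than eyeballing `ls -l` timestamps, is to count files modified recently. This is a rough sketch that assumes a find with `-mmin` support (GNU find on CentOS has it); the function name `recent_files` is hypothetical.

```shell
#!/bin/sh
# Count regular files under DIR modified within the last MINUTES minutes.
# Hypothetical helper for illustration only.
# Usage: recent_files DIR MINUTES
recent_files() {
    # -mmin -N matches files modified less than N minutes ago.
    # tr strips the padding some wc implementations add.
    find "$1" -type f -mmin "-$2" | wc -l | tr -d ' '
}
```

Running `recent_files /usr/local/nagios/share/perfdata 60` and getting 0 while checks are executing would match the symptom described here: nothing has been written for the last hour.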
Re: Not getting any perfdata
Did you restart npcd after modifying the log level?
service npcd restart
Also, run the following commands and show us the output:
ls /usr/local/nagios/var/spool/xidpe | wc -l
ls /usr/local/nagios/var/spool/perfdata | wc -l
ls /usr/local/nagios/var/spool/checkresults | wc -l
Re: Not getting any perfdata
[root@ust-nagios1t ~]# service npcd restart
NPCD Stopped.
DEBUG: Config File = /usr/local/nagios/etc/pnp/npcd.cfg
CONFIG_OPT_LOGTYPE = file
CONFIG_OPT_LOGFILE = /usr/local/nagios/var/npcd.log
CONFIG_OPT_LOGFILESIZE = 10485760
CONFIG_OPT_LOGLEVEL = -1
CONFIG_OPT_SCANDIR = /usr/local/nagios/var/spool/perfdata/
CONFIG_OPT_RUNCMD = /usr/local/nagios/libexec/process_perfdata.pl
CONFIG_OPT_RUNCMD_ARG = -b
CONFIG_OPT_MAXTHREADS = 5
CONFIG_OPT_LOAD = 20.0
CONFIG_OPT_USER = nagios
CONFIG_OPT_GROUP = nagios
CONFIG_OPT_PIDFILE = /usr/local/nagiosxi/var/subsys/npcd.pid
CONFIG_OPT_SLEEPTIME = 10
CONFIG_OPT_IDENTMYSELF = (null)
---------------------------
DEBUG: load_threshold is enabled - ('20.000000')
NPCD started.
[root@ust-nagios1t ~]# ls /usr/local/nagios/var/spool/xidpe | wc -l
795600
[root@ust-nagios1t ~]# ls /usr/local/nagios/var/spool/perfdata | wc -l
0
[root@ust-nagios1t ~]# ls /usr/local/nagios/var/spool/checkresults | wc -l
372
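With hundreds of thousands of files in a directory, `ls | wc -l` can be slow because ls sorts its output first. The following sketch counts spool files with find instead and flags a directory that exceeds a limit. The function names and the idea of a fixed limit are assumptions for illustration, not something Nagios XI provides, and `-maxdepth` assumes GNU or BSD find.

```shell
#!/bin/sh
# Count regular files directly inside DIR without sorting them first.
# A missing directory is reported by find on stderr and counts as 0.
# Usage: count_files DIR
count_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Succeed only when DIR holds at most LIMIT files.
# Usage: spool_ok DIR LIMIT
spool_ok() {
    [ "$(count_files "$1")" -le "$2" ]
}
```

As a usage example, `spool_ok /usr/local/nagios/var/spool/xidpe 1000 || echo "xidpe backlog"` would print a warning for a count like the one shown above; the 1000 is an arbitrary placeholder.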
slansing
Re: Not getting any perfdata
[root@ust-nagios1t ~]# ls /usr/local/nagios/var/spool/xidpe | wc -l
795600
This is the biggest issue. You will need to remove those files from that directory to allow any new data to be processed; it looks like the spool got clogged up. NPCD is going to sit at the highest load value you have given it until all of those files are crunched through, which will likely never happen as new data continues to come in.
The commands below should take care of the issue. Let us know whether graphs start to populate about 10-20 minutes after they have been run:
service npcd stop
cd /usr/local/nagios/var/spool/xidpe
find . -type f -delete
service npcd start
ll /usr/local/nagios/var/spool/xidpe | wc -l
tail -100 /usr/local/nagios/var/npcd.log
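One caution about the cleanup commands in this post: if the `cd` fails for any reason, `find . -type f -delete` runs in whatever directory the shell happens to be in. A slightly more defensive sketch passes the path to find directly and refuses to run on a missing directory. The function name `purge_spool` and its `--dry-run` option are made up for this example, and `-delete` assumes GNU or BSD find.

```shell
#!/bin/sh
# Delete every regular file under DIR; never depends on the current directory.
# Hypothetical helper for illustration only.
# Usage: purge_spool DIR [--dry-run]
purge_spool() {
    dir=${1:-}
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
        echo "purge_spool: not a directory: $dir" >&2
        return 1
    fi
    if [ "${2:-}" = "--dry-run" ]; then
        # Only report how many files would be removed.
        find "$dir" -type f | wc -l | tr -d ' '
    else
        find "$dir" -type f -delete
    fi
}
```

As in the steps in this post, npcd would still be stopped first and started afterwards, for example `service npcd stop`, then `purge_spool /usr/local/nagios/var/spool/xidpe`, then `service npcd start`.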
Re: Not getting any perfdata
Awesome. That seems to work. I'm going to let it run for a bit and see where it goes. THANKS!!!!!
Re: Not getting any perfdata
Looks good, thanks.