Re: perfdata is not taking new data

Posted: Fri Dec 09, 2016 7:30 pm
by pmithil7
To 'tgriep':


I have emptied the host-perfdata and service-perfdata files and restarted 'npcd' and 'nagios'. Data has started to accumulate in these two files again, but there are still no changes in 'perfdata'.

There is still no data in the graphs.

Re: perfdata is not taking new data

Posted: Fri Dec 09, 2016 7:35 pm
by pmithil7
Hi ssax,

I don't think anyone has upgraded pnp4nagios. Why do you think so? As I mentioned earlier in the thread, the issue was caused by the file system running out of disk space. Since I increased the space to resolve the nagios crash, I have not seen any data in the graphs.

Let me know your views on it.

Thank you.

-Mithil

Re: perfdata is not taking new data

Posted: Mon Dec 12, 2016 1:46 pm
by pmithil7
Current situation:

Data has again started accumulating in 'host-perfdata' and 'service-perfdata', while 'perfdata' still has not been updated. I have emptied the contents of these two files and restarted nagios and npcd. There is still no data in the graphs.

Any idea what else I should try? Are there any configuration changes to make in the files I attached previously?

Thanks,
Mithil

Re: perfdata is not taking new data

Posted: Mon Dec 12, 2016 2:00 pm
by tgriep
One thing to try is to delete the .xml and .rrd files for one of the hosts, clear out the host-perfdata and service-perfdata files and see if that host's graph starts to populate.
Also, can you run these commands as root and post the output?

Code:

ls -l /usr/local/pnp4nagios/var/service-perfdata
ls -l /usr/local/pnp4nagios/var/spool/
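
A more cautious variant of the delete step, sketched under the assumption that you would rather keep the old files: move the host's .rrd and .xml files into a backup directory instead of removing them. The helper name `set_aside_host_graphs` and the backup location are hypothetical, not part of PNP4Nagios.

```shell
# Move one host's .rrd and .xml files out of the perfdata directory.
# Arguments: perfdata directory, host name, backup directory (all caller-supplied).
set_aside_host_graphs() {
    perfdata_dir="$1"
    host="$2"
    backup_dir="$3"

    mkdir -p "$backup_dir/$host" || return 1

    for f in "$perfdata_dir/$host"/*.rrd "$perfdata_dir/$host"/*.xml; do
        # Skip patterns that matched nothing.
        [ -e "$f" ] || continue
        mv "$f" "$backup_dir/$host/" || return 1
    done
}
```

For example, `set_aside_host_graphs /usr/local/pnp4nagios/var/perfdata swb2f2-sjc-01 /root/perfdata-backup`, where `/root/perfdata-backup` is only a placeholder path.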

Re: perfdata is not taking new data

Posted: Mon Dec 12, 2016 2:27 pm
by pmithil7
Here's the output:

[root@nagios-eqx-01 ~]# ls -l /usr/local/pnp4nagios/var/spool/
total 0
[root@nagios-eqx-01 ~]# ls -l /usr/local/pnp4nagios/var/service-perfdata
-rw-r--r-- 1 nagios nagios 11063371177 Dec 12 11:27 /usr/local/pnp4nagios/var/service-perfdata
[root@nagios-eqx-01 ~]#

I see nothing is going into the spool directory, and that might be the issue.
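
An 11 GB service-perfdata file next to an empty spool directory would fit a setup where the raw file is never handed over for processing. A minimal sketch for listing the perfdata settings Nagios is actually using; it assumes nagios.cfg lives at /usr/local/nagios/etc/nagios.cfg (a guess, pass the real path as the first argument), and `show_perfdata_settings` is a hypothetical helper name.

```shell
# Print the active (uncommented) perfdata directives from nagios.cfg.
# The default path is an assumption; override it with the first argument.
show_perfdata_settings() {
    cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    grep -E '^[[:space:]]*(process_performance_data|(host|service)_perfdata_(file|command)[a-z_]*)[[:space:]]*=' "$cfg"
}
```

If `process_performance_data` is 0, or the `host_perfdata_file_processing_command` and `service_perfdata_file_processing_command` lines are missing or commented out, nothing would move the raw files into the spool.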

Re: perfdata is not taking new data

Posted: Mon Dec 12, 2016 3:49 pm
by ssax
/usr/local/pnp4nagios is not the default directory. Also, XI only comes with PNP4Nagios 0.4, and you have # pnp4nagios–0.6.25 at the top of your process_perfdata.cfg.

Re: perfdata is not taking new data

Posted: Mon Dec 12, 2016 5:27 pm
by pmithil7
by tgriep » Mon Dec 12, 2016 11:00 am

One thing to try is to delete the .xml and .rrd files for one of the hosts, clear out the host-perfdata and service-perfdata files and see if that host's graph starts to populate.

-->>>>

I tried this on a few hosts. The graphs that existed for these hosts (without data) have now disappeared, and no new .xml and .rrd files are being generated for them. That is expected, as nothing new has been written to the perfdata directory since the day 'nagios' crashed for lack of disk space. Also, I see that services are still showing up for those hosts in the 'service-perfdata' file.

Re: perfdata is not taking new data

Posted: Mon Dec 12, 2016 5:42 pm
by tgriep
Let's disable livestatus and see if that helps.
Edit your nagios.cfg file and comment out this line by changing it from

Code:

broker_module=/usr/lib/check_mk/livestatus.o /var/log/nagios/rw/live
to

Code:

#broker_module=/usr/lib/check_mk/livestatus.o /var/log/nagios/rw/live
Save the file.

Run this to clear out the host and service perfdata

Code:

> /usr/local/pnp4nagios/var/service-perfdata
> /usr/local/pnp4nagios/var/host-perfdata
Restart nagios by running

Code:

service nagios restart
If the graphs still fail, post the following files

Code:

/usr/local/pnp4nagios/var/perfdata.log
/usr/local/pnp4nagios/var/npcd.log
Run these as root and post the output.

Code:

which rrdtool
rrdtool -v

Re: perfdata is not taking new data

Posted: Mon Dec 12, 2016 8:27 pm
by pmithil7
With livestatus commented out, the Nagios/Check_MK GUI would not come up at all, so there was no way to look at the graphs. I removed the comment and restarted, and the GUI came back up, so that approach won't work.

As requested, output for :

/usr/local/pnp4nagios/var/perfdata.log is as below. It contains only entries from 2016-11-22, which is when the nagios server crashed, and nothing after that.
'''
File truncated
2016-11-22 08:46:04 [29619] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/haproxy-eqx-01/_HOST_.rrd: found extra data on update argument: 1.587:0.198
2016-11-22 08:46:04 [29619] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/swb2f3-sjc-01/_HOST_.rrd 1479833153:2.505:0:5.384:1.747
2016-11-22 08:46:04 [29619] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/swb2f3-sjc-01/_HOST_.rrd: found extra data on update argument: 5.384:1.747
2016-11-22 08:46:04 [29619] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/petora11/_HOST_.rrd 1479833153:1.192:0:1.229:1.135
2016-11-22 08:46:04 [29619] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/petora11/_HOST_.rrd: found extra data on update argument: 1.229:1.135
2016-11-22 08:46:04 [29619] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/swpoeb2f4-sjc-01/_HOST_.rrd 1479833153:2.070:0:2.134:1.989
2016-11-22 08:46:04 [29619] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/swpoeb2f4-sjc-01/_HOST_.rrd: found extra data on update argument: 2.134:1.989
2016-11-22 08:46:04 [29619] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/vi-sjc-06/_HOST_.rrd 1479833153:2.435:0:3.172:2.226
2016-11-22 08:46:04 [29619] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/vi-sjc-06/_HOST_.rrd: found extra data on update argument: 3.172:2.226
2016-11-22 08:46:04 [29619] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/dc-sjc-01/_HOST_.rrd 1479833153:2.283:0:2.313:2.237
2016-11-22 08:46:04 [29619] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/dc-sjc-01/_HOST_.rrd: found extra data on update argument: 2.313:2.237
2016-11-22 08:46:04 [29619] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/dc-sjc-02/_HOST_.rrd 1479833153:2.384:0:2.739:2.240
2016-11-22 08:46:04 [29619] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/dc-sjc-02/_HOST_.rrd: found extra data on update argument: 2.739:2.240
2016-11-22 08:46:04 [29619] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/swconsole03/_HOST_.rrd 1479833153:2.235:0:2.976:2.020
2016-11-22 08:46:04 [29619] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/swconsole03/_HOST_.rrd: found extra data on update argument: 2.976:2.020
2016-11-22 08:46:04 [29619] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/vi-sjc-03/_HOST_.rrd 1479833153:2.253:0:2.399:2.186
2016-11-22 08:46:04 [29619] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/vi-sjc-03/_HOST_.rrd: found extra data on update argument: 2.399:2.186
2016-11-22 08:46:04 [29619] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/swb2f2-sjc-01/_HOST_.rrd 1479833153:2.419:0:2.512:2.331
2016-11-22 08:46:04 [29619] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/swb2f2-sjc-01/_HOST_.rrd: found extra data on update argument: 2.512:2.331
'''
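
The pattern in this log can be summarised per RRD file. Below is a rough sketch, not an official PNP4Nagios tool: `summarise_extra_data` is a made-up helper that counts the "found extra data" errors per .rrd path and reports how many trailing values rrdtool rejected in the last such error. The default log path is the one from this thread.

```shell
# Summarise "found extra data" errors from a PNP4Nagios perfdata.log.
# The default path is the one quoted in this thread; pass another as $1.
summarise_extra_data() {
    log="${1:-/usr/local/pnp4nagios/var/perfdata.log}"
    awk '
        /RRDs::update ERROR .*found extra data on update argument:/ {
            rrd = $0
            sub(/^.*RRDs::update ERROR /, "", rrd)   # drop timestamp and prefix
            sub(/: found extra data.*$/, "", rrd)    # keep only the .rrd path
            n = split($NF, parts, ":")               # values rrdtool rejected
            count[rrd]++
            rejected[rrd] = n
        }
        END {
            for (rrd in count)
                printf "%s errors=%d rejected_values=%d\n", rrd, count[rrd], rejected[rrd]
        }
    ' "$log" | sort
}
```

For the entries above, every update carries four values after the timestamp and the error lists the last two as extra. That would be consistent with the existing RRDs having been created with fewer data sources than the check now returns, though that is only a reading of the log, not a confirmed cause.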

Also, there is no npcd.log file anywhere on the server. I do have npcd.cfg, but no log file for it.

Output for :
which rrdtool
rrdtool -v

is as below:
'''
[root@nagios-eqx-01 ~]# which rrdtool
/usr/bin/rrdtool
[root@nagios-eqx-01 ~]# rrdtool -v
RRDtool 1.4.7 Copyright 1997-2012 by Tobias Oetiker <tobi@oetiker.ch>
Compiled Apr 5 2012 17:36:08

Usage: rrdtool [options] command command_options
Valid commands: create, update, updatev, graph, graphv, dump, restore,
last, lastupdate, first, info, fetch, tune,
resize, xport, flushcached

RRDtool is distributed under the Terms of the GNU General
Public License Version 2. (www.gnu.org/copyleft/gpl.html)

For more information read the RRD manpages
'''
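
With rrdtool available, one way to look into the "extra data" errors is to compare how many data sources an RRD defines with how many values an update line carries. A small sketch; both helper names are made up, and it relies on `rrdtool info` printing one `ds[...].type` line per data source.

```shell
# Count data sources in `rrdtool info` output read from stdin.
count_data_sources() {
    grep -c '^ds\[[^]]*]\.type '
}

# Count the values in an update argument such as 1479833153:2.505:0:5.384:1.747
# (the first field is the timestamp, so it is not counted).
count_update_values() {
    echo "$1" | awk -F: '{ print NF - 1 }'
}
```

Usage would look like `rrdtool info /usr/local/pnp4nagios/var/perfdata/swb2f2-sjc-01/_HOST_.rrd | count_data_sources`, compared against `count_update_values 1479833153:2.419:0:2.512:2.331`. If the first number is smaller than the second, rrdtool has no data source to store the surplus values in.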

Re: perfdata is not taking new data

Posted: Tue Dec 13, 2016 11:41 am
by tgriep
Could you post the newest log entries from the /usr/local/pnp4nagios/var/perfdata.log file?
Could you post the following files so I can view them?

Code:

/usr/local/pnp4nagios/var/perfdata/swb2f2-sjc-01/_HOST_.rrd
/usr/local/pnp4nagios/var/perfdata/swb2f2-sjc-01/_HOST_.xml
/usr/local/nagios/etc/commands.cfg
On the day the graphs stopped, were any updates run on the server?