Issues after Nagios upgrade to NagiosXi 2011

griffithusg
Posts: 64
Joined: Sun Nov 07, 2010 7:16 pm

Re: Issues after Nagios upgrade to NagiosXi 2011

Post by griffithusg »

Hello,
I have applied the latest patch, and although it is nowhere near as slow as it was, some panes still stop showing the correct data over time. The initial refresh is much faster, but after some hours not all the data is visible in XI, even though it is visible in Nagios 3.2.

Any ideas?
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Issues after Nagios upgrade to NagiosXi 2011

Post by mguthrie »

Can you run the following tool and post the output from it?


/usr/local/nagios/bin/nagiostats
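
If the full report is too long to read at a glance, a small filter can pull out the headline figures. This is only a sketch: `summarize_nagiostats` is a hypothetical helper name, and it assumes the plain-text report format where each label starts at the beginning of a line.

```shell
# Hypothetical helper: keep only a few headline lines from the
# nagiostats plain-text report, which is read on stdin.
summarize_nagiostats() {
    grep -E '^(Total Services|Total Hosts|Active Service Latency|Active Host Latency|Services Ok/Warn/Unk/Crit|Hosts Up/Down/Unreach):'
}

# Usage (adjust the path if Nagios is installed elsewhere):
#   /usr/local/nagios/bin/nagiostats | summarize_nagiostats
```

The latency lines are usually the first thing to look at when checking whether the scheduler is keeping up.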
rdedon
Posts: 578
Joined: Sat Nov 20, 2010 4:51 pm

Re: Issues after Nagios upgrade to NagiosXi 2011

Post by rdedon »

Hmm, what types of services are the ones that seem problematic? I am trying to find a common thread here between those that are not working. Could you post any information (minus any private company info of course) to help us try and tie some things together?
Rene deDon
Technical Team
___
Nagios Enterprises, LLC
Web: http://www.nagios.com
griffithusg
Posts: 64
Joined: Sun Nov 07, 2010 7:16 pm

Re: Issues after Nagios upgrade to NagiosXi 2011

Post by griffithusg »

Hi all,

Here is the output from the nagiostats command:
root@phobos (na)(v)# /opt/nagios/bin/nagiostats

Nagios Stats 3.2.3
Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
Last Modified: 10-03-2010
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File: /usr/local/nagios/var/status.dat
Status File Age: 0d 0h 0m 2s
Status File Version: 3.2.3

Program Running Time: 0d 6h 30m 24s
Nagios PID: 20045
Used/High/Total Command Buffers: 0 / 0 / 4096

Total Services: 2780
Services Checked: 2780
Services Scheduled: 2780
Services Actively Checked: 2780
Services Passively Checked: 0
Total Service State Change: 0.000 / 59.470 / 0.789 %
Active Service Latency: 0.000 / 2.397 / 0.193 sec
Active Service Execution Time: 0.000 / 46.241 / 0.167 sec
Active Service State Change: 0.000 / 59.470 / 0.789 %
Active Services Last 1/5/15/60 min: 146 / 933 / 1827 / 2742
Passive Service Latency: 0.000 / 0.000 / 0.000 sec
Passive Service State Change: 0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit: 2667 / 33 / 44 / 36
Services Flapping: 33
Services In Downtime: 0

Total Hosts: 533
Hosts Checked: 532
Hosts Scheduled: 533
Hosts Actively Checked: 533
Host Passively Checked: 0
Total Host State Change: 0.000 / 10.660 / 0.028 %
Active Host Latency: 0.000 / 1.926 / 0.400 sec
Active Host Execution Time: 0.000 / 18.022 / 2.070 sec
Active Host State Change: 0.000 / 10.660 / 0.028 %
Active Hosts Last 1/5/15/60 min: 1 / 74 / 351 / 530
Passive Host Latency: 0.000 / 0.000 / 0.000 sec
Passive Host State Change: 0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0
Hosts Up/Down/Unreach: 529 / 4 / 0
Hosts Flapping: 0
Hosts In Downtime: 0

Active Host Checks Last 1/5/15 min: 15 / 132 / 548
  Scheduled: 8 / 76 / 364
  On-demand: 7 / 56 / 184
  Parallel: 9 / 79 / 375
  Serial: 0 / 0 / 0
  Cached: 6 / 53 / 174
Passive Host Checks Last 1/5/15 min: 0 / 0 / 0
Active Service Checks Last 1/5/15 min: 52 / 254 / 756
  Scheduled: 52 / 254 / 756
  On-demand: 0 / 0 / 0
  Cached: 0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0

External Commands Last 1/5/15 min: 0 / 0 / 0

I am gathering more information about the specific checks. Early investigation suggests that the affected checks are fairly random.
griffithusg
Posts: 64
Joined: Sun Nov 07, 2010 7:16 pm

Re: Issues after Nagios upgrade to NagiosXi 2011

Post by griffithusg »

Another interesting thing I found: when using the config wizards, I was not able to see any data relating to hostgroup configurations.

I have also noticed this in my nagios.log file:
[1300171457] ndomod: Error writing to data sink! Some output may get lost...
[1300171457] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1300171473] ndomod: Successfully reconnected to data sink! 0 items lost, 584 queued items to flush.
[1300171473] ndomod: Error writing to data sink! Some output may get lost. 38 queued items to flush.
[1300171489] ndomod: Successfully reconnected to data sink! 0 items lost, 937 queued items to flush.
[1300171489] ndomod: Error writing to data sink! Some output may get lost. 380 queued items to flush.
[1300171505] ndomod: Successfully reconnected to data sink! 0 items lost, 836 queued items to flush.
[1300171505] ndomod: Error writing to data sink! Some output may get lost. 314 queued items to flush.
[1300171521] ndomod: Successfully reconnected to data sink! 0 items lost, 1479 queued items to flush.
[1300171521] ndomod: Error writing to data sink! Some output may get lost. 969 queued items to flush.
[1300171546] ndomod: Successfully reconnected to data sink! 0 items lost, 1239 queued items to flush.
[1300171577] ndomod: Error writing to data sink! Some output may get lost. 683 queued items to flush.
[1300171577] ndomod: Successfully reconnected to data sink! 0 items lost, 683 queued items to flush.
[1300171578] ndomod: Error writing to data sink! Some output may get lost. 187 queued items to flush.
[1300171594] ndomod: Successfully reconnected to data sink! 0 items lost, 930 queued items to flush.
[1300171594] ndomod: Successfully flushed 930 queued items to data sink.
[1300171596] ndomod: Error writing to data sink! Some output may get lost...
[1300171596] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1300171612] ndomod: Successfully reconnected to data sink! 0 items lost, 640 queued items to flush.
[1300171612] ndomod: Successfully flushed 640 queued items to data sink.
[1300171655] ndomod: Error writing to data sink! Some output may get lost...
[1300171655] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1300171671] ndomod: Successfully reconnected to data sink! 0 items lost, 217 queued items to flush.
[1300171671] ndomod: Successfully flushed 217 queued items to data sink.
[1300171671] ndomod: Error writing to data sink! Some output may get lost...
[1300171671] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1300171687] ndomod: Successfully reconnected to data sink! 0 items lost, 181 queued items to flush.
[1300171687] ndomod: Successfully flushed 181 queued items to data sink.
[1300171687] ndomod: Error writing to data sink! Some output may get lost...
[1300171687] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1300171703] ndomod: Successfully reconnected to data sink! 0 items lost, 195 queued items to flush.
[1300171703] ndomod: Successfully flushed 195 queued items to data sink.
[1300171705] ndomod: Error writing to data sink! Some output may get lost...
[1300171705] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1300171721] ndomod: Successfully reconnected to data sink! 0 items lost, 232 queued items to flush.
[1300171721] ndomod: Successfully flushed 232 queued items to data sink.
[1300171721] ndomod: Error writing to data sink! Some output may get lost...
[1300171721] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1300171737] ndomod: Successfully reconnected to data sink! 0 items lost, 291 queued items to flush.
[1300171737] ndomod: Successfully flushed 291 queued items to data sink.
[1300171737] ndomod: Error writing to data sink! Some output may get lost...
[1300171737] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1300171753] ndomod: Successfully reconnected to data sink! 0 items lost, 198 queued items to flush.
[1300171753] ndomod: Successfully flushed 198 queued items to data sink.
[1300171753] ndomod: Error writing to data sink! Some output may get lost...
[1300171753] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1300171769] ndomod: Successfully reconnected to data sink! 0 items lost, 851 queued items to flush.
[1300171769] ndomod: Error writing to data sink! Some output may get lost. 299 queued items to flush.

That said, I believe I was already seeing these messages before I did the upgrade. Have you seen this before?
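
To compare how often this happens before and after a change, the ndomod messages can be tallied rather than eyeballed. The following is a minimal sketch: `summarize_ndomod` is a hypothetical helper name, and the patterns are taken from the log lines quoted above.

```shell
# Hypothetical helper: count ndomod data-sink events in a nagios.log
# stream read on stdin. Patterns match the messages quoted in this post.
summarize_ndomod() {
    awk '
        /ndomod: Error writing to data sink/ { errors++ }
        /ndomod: Successfully reconnected/   { reconnects++ }
        /ndomod: Successfully flushed/       { flushes++ }
        END {
            # Unset counters print as 0 with %d.
            printf "errors=%d reconnects=%d flushes=%d\n", errors, reconnects, flushes
        }
    '
}

# Usage (the log path is an assumption; use wherever your nagios.log lives):
#   summarize_ndomod < /usr/local/nagios/var/nagios.log
```

A run where errors and reconnects keep pace with each other, as in the excerpt above, suggests the connection to the data sink is dropping repeatedly rather than failing once.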
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Issues after Nagios upgrade to NagiosXi 2011

Post by mguthrie »

You might have a broken or corrupted MySQL table. Can you try the following script and see if the situation improves?

/usr/local/nagiosxi/scripts/repairmysql.sh nagios *


I think those errors in your log are definitely the key; I'm just not sure yet where the issue is coming from. Your nagiostats output looks good, with nothing indicating that your system is overloaded, so the problem must be something related to NDOUtils.
griffithusg
Posts: 64
Joined: Sun Nov 07, 2010 7:16 pm

Re: Issues after Nagios upgrade to NagiosXi 2011

Post by griffithusg »

Hi, I ran the command and it did seem to make a difference, but I am still not seeing the hostgroup information when I am using the config wizards. Also, when I did a new import of configs from my old system, I had the same problem again, with it taking a long time to refresh.

I have rolled back my system to XI 2009 1.4.

I know I should have put this in my first post, but my system specs are:
RHEL 5, 32-bit VMware guest with 8 GB of memory
DNX with two worker nodes, also 32-bit

Thanks :)
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Issues after Nagios upgrade to NagiosXi 2011

Post by mguthrie »

griffithusg wrote:
Hi, I ran the command and it did seem to make a difference, but I am still not seeing the hostgroup information when I am using the config wizards.

Is this with any particular wizard, or all of them?
When you run the wizard, can you also run
tail -f /var/log/httpd/error_log
and watch for errors when that page loads up? (I know you'll probably get several notices and warnings, but send us anything that looks like it might be relevant).
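
If the notices make the log hard to follow, they can be filtered out while tailing. This is just a sketch: `drop_php_notices` is a hypothetical helper name, and it assumes notices are logged with the usual `PHP Notice:` prefix.

```shell
# Hypothetical filter: hide PHP notice lines and pass everything else
# through unchanged. Reads the log stream on stdin.
drop_php_notices() {
    grep -v 'PHP Notice:'
}

# Usage:
#   tail -f /var/log/httpd/error_log | drop_php_notices
```

Warnings are deliberately left in, since some of them may turn out to be relevant.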
griffithusg wrote:
Also, when I did a new import of configs from my old system, I had the same problem again, with it taking a long time to refresh.

Are you referring to the data in the XI interface? Could you elaborate on this a bit more? I'm not quite following what this is referring to, and I want to make sure I understand the issue.

How is your nagios.log looking after running the repair script on the MySQL tables? Have the mass NDOUtils error messages gone away?
griffithusg
Posts: 64
Joined: Sun Nov 07, 2010 7:16 pm

Re: Issues after Nagios upgrade to NagiosXi 2011

Post by griffithusg »

Hello,
Sorry for not replying for some time, but I have been busy solving problems. I have resolved this one.
We have several older Oracle databases in our organisation, and in order to check them with Nagios an Oracle 9i client had to be installed. This replaced the existing gcc with an older version (3.4 or thereabouts). Although no compilation errors occurred during the upgrade, this seems to have had a large impact, with NDOUtils not working properly. Of course, because the older gcc was not installed through yum, yum still thought the latest version was in place.

I know this would not happen to most people, but could I suggest that some version checks on compilers be performed as part of the pre-reqs?
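
Something along these lines is what I have in mind. It is only a rough sketch, not how the XI installer works: `version_ge` and `check_compiler` are hypothetical names, and the minimum version in the usage line is purely illustrative, since I do not know what NDOUtils actually requires.

```shell
# version_ge A B: succeed when dotted version A is >= B.
# Uses awk rather than "sort -V", which older coreutils lack.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = x[i] + 0; yi = y[i] + 0   # missing fields count as 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# check_compiler MIN [CC]: hypothetical pre-flight check; CC defaults to gcc.
check_compiler() {
    min=$1
    cc=${2:-gcc}
    found=$("$cc" -dumpversion 2>/dev/null) || { echo "$cc not found"; return 2; }
    if version_ge "$found" "$min"; then
        echo "$cc $found OK (>= $min)"
    else
        echo "$cc $found is older than $min"
        return 1
    fi
}

# Usage (4.1 is an illustrative threshold only):
#   check_compiler 4.1
# On RPM systems the result could also be compared with what the package
# database believes is installed:
#   rpm -q --qf '%{VERSION}\n' gcc
```

The point of the second comparison is that the package database and the compiler actually on the PATH can disagree, which is exactly what happened here.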

Thanks again for your assistance in the matter :)

Rob
rdedon
Posts: 578
Joined: Sat Nov 20, 2010 4:51 pm

Re: Issues after Nagios upgrade to NagiosXi 2011

Post by rdedon »

Hello Rob,
Thanks for the update; we are glad you figured that out. That is an excellent suggestion, and information like this helps us develop our products. :-)