In the Nagios XI host and service detail pages, the last check times progressively fall behind as the system runs. If we go to Nagios Core directly (https://xxxx/nagios), the last check times appear to be current. If left running for a day or more, the XI times can fall hours behind and no longer show current status.
Running ipcs -q in a while loop shows that the message count on the nagios queue (msqid) keeps growing. It decreases from time to time, but overall it trends upward. Left running for a long period, it exceeds 100K messages.
Any help troubleshooting this, or pointers on where we should begin looking, would be appreciated. The system currently monitors only about 50 hosts with 700 service checks.
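For anyone wanting to reproduce that measurement, here is a minimal sketch of such a loop. It assumes the util-linux column layout of ipcs -q (key, msqid, owner, perms, used-bytes, messages) and that the queue is owned by a user named nagios; the function names queue_depth and watch_queue are made up for illustration.

```shell
#!/bin/sh
# Sum the "messages" column of `ipcs -q` output (read from stdin) for all
# queues owned by the given user. Assumes the util-linux column order:
# key msqid owner perms used-bytes messages
queue_depth() {
    owner="${1:-nagios}"   # assumption: queue owner is the "nagios" user
    awk -v owner="$owner" '$3 == owner { total += $6 } END { print total + 0 }'
}

# Hypothetical helper: print a timestamped queue depth every N seconds
# (default 10) until interrupted with Ctrl-C.
watch_queue() {
    interval="${1:-10}"
    while true; do
        printf '%s %s\n' "$(date +%T)" "$(ipcs -q | queue_depth nagios)"
        sleep "$interval"
    done
}
```

Calling watch_queue 10 prints one line every ten seconds. A count that keeps climbing over several minutes suggests ndo2db is draining the queue more slowly than Core is filling it.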
XI Last Check Times Falling Behind
Re: XI Last Check Times Falling Behind
What Nagios XI and Core versions are you running? The IPCS queue issue was resolved for the most part in XI 5.
Former Nagios employee
-
chipngc_nagios
- Posts: 14
- Joined: Fri Mar 20, 2015 2:46 pm
Re: XI Last Check Times Falling Behind
Using Nagios XI 2014R2.6, Core version is 4.0.8
Would there be anything we could do in the current version to help the situation without having to upgrade?
Re: XI Last Check Times Falling Behind
You could take the attached zip file, put it into your XI /tmp directory, and run these commands:
Code:
cd /tmp
unzip ndoutils_kmqpatch.zip
cd ndoutils_kmqpatch
./upgrade
That will patch NDOUtils, recompile it, and install it for you.
-
chipngc_nagios
- Posts: 14
- Joined: Fri Mar 20, 2015 2:46 pm
Re: XI Last Check Times Falling Behind
While running the upgrade, the ./configure step reports that the MySQL library could not be located, but the script continues anyway. When it completes, ndo2db is not running, and /etc/init.d/ndo2db status reports that ndo2db is not running but the subsystem is locked.
When trying to start ndo2db it states: Starting ndo2db:Support for the specified database server is either not yet supported, or was not found on your system.
-
chipngc_nagios
- Posts: 14
- Joined: Fri Mar 20, 2015 2:46 pm
Re: XI Last Check Times Falling Behind
In the meantime, I discovered that the host is a VM whose storage had write latencies of about 50 ms, which kept ndo2db from keeping up and caused all this havoc. Once the storage was tuned for latency rather than throughput, the iowait times went down and the application and databases worked as expected. You can close this issue based on that finding.
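In case it helps someone hitting the same symptom, this is a rough sketch of one way to check iowait on a Linux host without extra tools. It assumes the standard /proc/stat layout, where the aggregate cpu line lists user, nice, system, idle, iowait, irq, softirq, and steal in that order; the function name iowait_pct is made up for illustration.

```shell
#!/bin/sh
# Read two aggregate "cpu" lines from /proc/stat on stdin (taken a few
# seconds apart) and print the share of CPU time spent in iowait between them.
iowait_pct() {
    awk '$1 == "cpu" {
        total = 0
        # Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal.
        for (i = 2; i <= 9 && i <= NF; i++) total += $i
        if (n++ == 0) { t0 = total; w0 = $6 } else { t1 = total; w1 = $6 }
    }
    END {
        if (n < 2 || t1 == t0) { print "n/a"; exit }
        printf "%.1f\n", 100 * (w1 - w0) / (t1 - t0)
    }'
}

# Example (Linux only), sampling 5 seconds apart:
#   { grep '^cpu ' /proc/stat; sleep 5; grep '^cpu ' /proc/stat; } | iowait_pct
```

A persistently high value while the message queue is backing up would point at storage rather than at Nagios itself, which is what turned out to be the case here.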
Re: XI Last Check Times Falling Behind
Glad to see you were able to resolve this latency issue! I will now close this thread, but feel free to open another if you ever need more assistance.
Former Nagios Employee