Our Nagios XI server has been showing really strange behavior for the past 10 days.
The message queues are constantly high and do not decrease, so check results appear long after the checks run. Results show up about 7 minutes after the check command executes, which is far too late for active monitoring.
We have run several investigations on the server and on the database.
We are not seeing any high CPU usage, and the database is running properly.
We are running Nagios XI 5.5.2 with Nagios Core 4.2.4 along with 6 Mod Gearman 2 servers. The Nagios server runs 15% of all checks and the Mod Gearman servers run the rest. We are monitoring 7,600 active hosts and 35,206 active services. Below are the packages installed on the Nagios server:
nagiosxi-nrds-5.5.2-1.el6.x86_64
nagiosxi-5.5.2-1.el6.x86_64
nagiosxi-wkhtmltox-5.5.2-1.el6.x86_64
nagiosxi-nsca-5.5.2-1.el6.x86_64
nagiosxi-pnp-5.5.2-1.el6.x86_64
nagiosxi-shellinabox-5.5.2-1.el6.x86_64
nagiosxi-nxti-5.5.2-1.el6.x86_64
nagiosxi-nagioscore-5-4.13.el6.x86_64
nagiosxi-nrpe-5.5.2-1.el6.x86_64
nagiosxi-nagiosmobile-5.5.2-1.el6.x86_64
nagiosxi-mrtg-5.5.2-1.el6.x86_64
nagiosxi-nagvis-5.5.2-1.el6.x86_64
nagiosxi-wmic-5.5.2-1.el6.x86_64
nagiosxi-ndoutils-5.5.2-1.el6.x86_64
nagiosxi-nagiosplugins-5.5.2-1.el6.x86_64
gearmand-server-0.33-2.x86_64
gearmand-0.33-2.x86_64
gearmand-devel-0.33-2.x86_64
mod_gearman2-2.1.1-1.el6.x86_64
Our MySQL DB is handling about 20,000 queries per second.
The IPC message queues (as reported by ipcs) stay stuck above 500,000 messages.
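For anyone who wants to compare numbers, here is a minimal sketch of how the queued-message total could be read from `ipcs -q`. It assumes the usual Linux (util-linux) column layout of key, msqid, owner, perms, used-bytes, messages, and Python 3.7 or later. The function name and the sample text are hypothetical, made up for illustration only, and are not output from our server.

```python
import subprocess


def total_queued_messages(ipcs_output: str) -> int:
    """Sum the 'messages' column across all queues in `ipcs -q` output."""
    total = 0
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Assumption: data rows have six columns and start with a hex key
        # such as 0x00000000; banner and header lines do not match.
        if len(fields) == 6 and fields[0].startswith("0x"):
            total += int(fields[5])
    return total


# Illustrative sample only (invented values, assumed layout).
SAMPLE = """\
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0x00000000 0          nagios     600        1024         8
0x00000001 32769      nagios     600        2048         16
"""

if __name__ == "__main__":
    # On a live server, read the real queue state instead of the sample.
    out = subprocess.run(
        ["ipcs", "-q"], capture_output=True, text=True, check=True
    ).stdout
    print(total_queued_messages(out))
```

Running this every few seconds and comparing successive totals would show whether the backlog is draining, flat, or still growing.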
We are running MySQL version 5.1.73-8 on the same server as Nagios.
'mysqld.log' is not showing any errors.
We have applied all the performance settings recommended in the Nagios documentation.
I've searched the forum but cannot find any similar issue. If anyone has a clue, please let us know.
Thanks in advance for your help.