Nagios XI becomes unusable after an Apply Configuration
Posted: Wed Oct 10, 2018 10:37 am
Hello Support,
Since last week we have been experiencing serious slowness problems on our Nagios XI platform.
CPU usage is often near 100%, but the major problems start when we run an 'Apply Configuration'. During this operation the number of services in UNKNOWN state increases quickly, which had never happened before.
Passive checks are no longer received (I suppose that during an 'Apply Configuration' they are put in a queue and processed once the nagios process is up again), and this causes the freshness threshold to be exceeded. Some active checks also go into UNKNOWN state.
Sometimes, after the 'Apply Configuration', the nagios process stays up for a while and then stops.
The 'Apply Configuration' itself takes a long time (about 2 minutes).
The large number of UNKNOWN services triggers a series of operations that make Nagios XI unusable.
We also noticed that the 'mysqld' process uses a lot of CPU.
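To put numbers on the rise in UNKNOWN services around an 'Apply Configuration', something like the following could be run before and after the operation. It is only a minimal sketch: it assumes the standard Nagios Core status file layout (`servicestatus { ... }` blocks containing a `current_state=` line), and the function names `count_service_states` and `count_from_file` are made up for illustration.

```python
from collections import Counter

# Standard Nagios service state codes.
STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def count_service_states(status_text):
    """Count services per state in status.dat-style text.

    Only 'servicestatus' blocks are considered; host blocks are ignored.
    """
    counts = Counter()
    in_service = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("servicestatus") and line.endswith("{"):
            in_service = True
        elif line == "}":
            in_service = False
        elif in_service and line.startswith("current_state="):
            state = int(line.split("=", 1)[1])
            counts[STATE_NAMES.get(state, str(state))] += 1
    return counts


def count_from_file(path):
    """Read a status file from disk and count its service states."""
    with open(path) as fh:
        return count_service_states(fh.read())
```

Calling `count_from_file("/usr/local/nagios/var/status.dat")` once before and once after the 'Apply Configuration' would show how many services moved to UNKNOWN; that path is an assumption about the default install location and may differ on a given system.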
Some information about our setup:
* Nagios XI 2014R2.7 on CentOS 6.6
* 4 workers with gearmand 0.33
* ~15000 services on ~1500 hosts
What we have done so far:
* checked the most recently added configurations: no issues detected
* services: no service checks are taking a particularly long time to run
* checked the MySQL tables: no errors or corrupted indexes found
* nagios log: increased the verbosity, no errors found
* restarted the Nagios XI server and all workers, but nothing changed
What would you suggest?
Regards
Francesco