
NagiosXI 5.5.5 intermittent issues

Posted: Tue Oct 16, 2018 11:56 pm
by vishfx
Hi Team,

NagiosXI 5.5.5 (installed from the nagiosxi repo)
OS: RHEL 7.5

We have one NagiosXI web node running gearmand and two mod_gearman workers.
However, NagiosXI is intermittently unable to retrieve results.

We need immediate assistance in resolving this problem, as it is impacting delivery.

(NOTE: I am unable to attach a screenshot of the dashboard, as I don't seem to have attachment rights. I have sent a mail to the NagiosXI sales/support team asking them to enable attachment rights.)


Some additional questions:

Why does the check_results queue have only 1 worker? Is this a problem?
How do I check job distribution among workers in real time?


2018-10-17 00:47:37 - localhost:4730 - v0.33

Queue Name | Worker Available | Jobs Waiting | Jobs Running
----------------------------------------------------------------------
check_results | 1 | 0 | 0
eventhandler | 200 | 0 | 0
host | 200 | 0 | 0
service | 200 | 0 | 0
worker_cpsnxiwtst01 | 1 | 0 | 0
worker_cpsnxiwtst02 | 1 | 0 | 0
----------------------------------------------------------------------


2018-10-17 00:46:19 - localhost:4730 - v0.33

Queue Name | Worker Available | Jobs Waiting | Jobs Running
----------------------------------------------------------------------
eventhandler | 200 | 0 | 0
host | 200 | 0 | 0
service | 200 | 0 | 0
worker_cpsnxiwprd01 | 1 | 0 | 0
worker_cpsnxiwprd02 | 1 | 0 | 0
----------------------------------------------------------------------

Re: NagiosXI 5.5.5 intermittent issues

Posted: Wed Oct 17, 2018 12:51 pm
by cdienger
Run "/usr/local/nagios/bin/nagios --version" to verify the core version. If it's 4.4.x, you'll need to downgrade it to 4.2.4:

https://support.nagios.com/kb/article/n ... e-823.html
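
If you'd rather script the version check, something like the following might work as a rough sketch. It assumes Python 3 is available on the server and that the `--version` banner contains a line of the form "Nagios Core X.Y.Z"; the regex and both helper names (`extract_core_version`, `needs_downgrade`) are made up for illustration and are not part of Nagios.

```python
import re
import subprocess

NAGIOS_BIN = "/usr/local/nagios/bin/nagios"  # path quoted in the post above


def extract_core_version(version_output):
    """Return (major, minor, patch) from `nagios --version` text, or None."""
    # Assumption: the banner contains a line such as "Nagios Core 4.4.2".
    match = re.search(r"Nagios Core (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def needs_downgrade(version):
    """True when the Core version is in the 4.4.x series mentioned above."""
    return version is not None and version[:2] == (4, 4)


if __name__ == "__main__":
    # Run the binary and capture its banner as text.
    result = subprocess.run(
        [NAGIOS_BIN, "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    version = extract_core_version(result.stdout)
    print("Core version:", version, "| downgrade needed:", needs_downgrade(version))
```

If the banner format on your install differs, `extract_core_version` returns None and the script reports that no downgrade is needed, so check the raw output by hand in that case.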

If the problem persists once the core version is correct:

Can you clarify what you mean by "issues with NagiosXI unable to retrieve results"?

A single result worker usually isn't a problem, but you can raise the result_workers option in /etc/mod_gearman2/module.conf on the server.

gearman_top2 displays real-time job distribution. If you need more detail about exactly which jobs are running, the debug option in the worker.conf and module.conf files will log additional data to text files.
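
If you want the same per-queue numbers in a script rather than on screen, here is a minimal sketch that talks to gearmand directly. It assumes gearmand's plain-text admin command `status` is reachable on localhost:4730 (the address shown in the output above) and that each row is tab-separated as function name, total jobs, running jobs, available workers, ending with a line containing a single dot. Treating "Jobs Waiting" as total minus running is my reading of that format; `parse_status` and `fetch_status` are illustrative names, not part of mod_gearman.

```python
import socket


def parse_status(raw):
    """Parse gearmand 'status' admin output into a list of per-queue dicts."""
    queues = []
    for line in raw.splitlines():
        if line == ".":  # a lone dot terminates the response
            break
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip anything that is not a status row
        name, total, running, workers = fields
        queues.append({
            "queue": name,
            "workers": int(workers),
            "waiting": int(total) - int(running),  # assumed: total includes running
            "running": int(running),
        })
    return queues


def fetch_status(host="localhost", port=4730, timeout=5.0):
    """Send 'status' to gearmand and return the raw text response."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"status\n")
        data = b""
        # Read until the terminating ".\n" line arrives or the peer closes.
        while not (data == b".\n" or data.endswith(b"\n.\n")):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii", errors="replace")


if __name__ == "__main__":
    for q in parse_status(fetch_status()):
        print(q["queue"], q["workers"], q["waiting"], q["running"])
```

Running it in a loop (or from cron) and logging the result would give a history of worker availability per queue, which may help narrow down when the intermittent problem occurs.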

The order in which the workers and the server are started is important. Follow the stop and start directions in https://assets.nagios.com/downloads/nag ... ios_XI.pdf to restart them properly.

Re: NagiosXI 5.5.5 intermittent issues

Posted: Mon Oct 29, 2018 10:41 pm
by vishfx
Will be observing worker load and will re-open if I run into this again.

This ticket can be closed.

Thanks for the support.

Regards,
Vish.