Gearmand problems running on NagiosXI 5.11.3

ScottMc
Posts: 28
Joined: Mon Aug 06, 2018 9:35 am

Gearmand problems running on NagiosXI 5.11.3

Post by ScottMc »

We're running gearmand v3 with Nagios XI 5.11.3, and over the weekend it hung on us. Restarting the server, and even restoring Nagios XI backups from previous weeks, still produces the same issue. Since gearmand is deprecated and we need to rebuild our Nagios server (it's currently on CentOS 7), we were planning on moving away from it anyway, but we can't do so this suddenly (2000 hosts, 20k services, 5 different countries, etc.). While I know support for gearmand is beyond the scope of the forum, what I'd like is the documentation that Nagios used to have for configuring gearmand (the link now just points to a PDF that basically says 'no'). I need to get this working in the short term so I can migrate properly; otherwise the suggestion will be (since we're starting over) to move to another platform that plays better with our distributed environment. Thanks!
ScottMc
Posts: 28
Joined: Mon Aug 06, 2018 9:35 am

Re: Gearmand problems running on NagiosXI 5.11.3

Post by ScottMc »

Never mind. I was able to find the instructions online using the Wayback Machine, but Nagios still invariably crashes gearman. It's a shame, because gearman was really the only viable way I could see of running Nagios globally without a massive amount of additional management and redundancy configuration. Nagios embracing the remote worker model would've made it a game changer. :cry:
sgardil
Posts: 144
Joined: Wed Aug 09, 2023 9:58 am

Re: Gearmand problems running on NagiosXI 5.11.3

Post by sgardil »

Hey @ScottMc

Thanks for your input on this. I will make a note of it as a request. I can't confirm whether this is planned, as things are constantly changing; however, in the meantime, if you find any workarounds for your issue, feel free to share them.
harrisj5
Posts: 4
Joined: Thu Apr 11, 2024 5:10 pm

Re: Gearmand problems running on NagiosXI 5.11.3

Post by harrisj5 »

It could be that your master server does not have enough compute power. Nagios XI runs a MySQL database, so it is going to require more compute power than Nagios Core. You will see a lot of orphaned checks because they are in a waiting state in the gearman check queue. We have about 3300 servers and 26K checks running on ours. When I had the servers built, I had them built to the specs of our Nagios Core master server. However, with the database backend, the resources were not enough. The load average went up to 58 yesterday, and I had about 7500 checks in the queue that showed as unknown; when you looked at the details, they were all orphaned and showed as waiting in the queue in gearman_top. I doubled my CPU cores and memory to give extra resources. Thus far today I have seen no issues with the checks hanging and getting orphaned.
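
If you want to watch for that backlog without sitting in gearman_top, here is a minimal Python sketch that reads the same per-queue counters through gearmand's text admin `status` command. It assumes gearmand is listening on the default port 4730 on localhost. The function names, the 500-job threshold, and the sample reply are mine and purely illustrative, not anything from Nagios or Mod-Gearman.

```python
import socket


def parse_status(text):
    """Parse the reply to gearmand's admin 'status' command.

    Each line is: function<TAB>total<TAB>running<TAB>available_workers.
    The reply ends with a line containing a single '.'.
    """
    queues = {}
    for line in text.splitlines():
        line = line.strip()
        if line == ".":
            break
        parts = line.split("\t")
        if len(parts) != 4:
            continue  # skip anything that is not a well-formed status line
        name = parts[0]
        total, running, workers = (int(p) for p in parts[1:])
        queues[name] = {
            "total": total,
            "running": running,
            "waiting": total - running,  # queued but not yet picked up
            "workers": workers,
        }
    return queues


def backlogged(queues, max_waiting=500):
    """Return names of queues with more than max_waiting jobs waiting.

    The default threshold is an arbitrary illustration; tune it to your site.
    """
    return sorted(n for n, q in queues.items() if q["waiting"] > max_waiting)


def fetch_status(host="127.0.0.1", port=4730, timeout=5.0):
    """Send 'status' to gearmand and return the raw text reply.

    Host and port are assumptions (4730 is gearmand's default port).
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"status\n")
        data = b""
        while not data.endswith(b".\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii", errors="replace")


# Made-up sample reply for illustration only, not captured from a real server.
SAMPLE = "service\t7500\t40\t40\nhost\t12\t12\t40\ncheck_results\t0\t0\t1\n.\n"

if __name__ == "__main__":
    queues = parse_status(SAMPLE)  # swap in fetch_status() on a live server
    for name in backlogged(queues):
        print(name, "waiting:", queues[name]["waiting"])
```

Run from cron alongside a load-average check, something like this can flag a growing queue before the checks start showing up as orphaned.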