scaling Nagios with mod_gearman

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
davis3792_2
Posts: 2
Joined: Mon Jun 08, 2020 9:29 am

scaling Nagios with mod_gearman

Post by davis3792_2 »

Hypothetical scenario that may become a reality as our footprint expands:

8000 end points/hosts
10 services per host to be monitored
NO passive monitors - ONLY active.
Assume very large XI server hardware is available and servers available for workers.

Questions:
1. Is mod_gearman designed to allow a single Nagios XI server to reliably scale to this level?
2. Has anyone successfully achieved this?
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: scaling Nagios with mod_gearman

Post by ssax »

I would choose the fastest storage you can for the XI server/DB (SSD; the more IOPS the better).

I think mod_gearman can probably handle it if you have enough external workers, but it really depends on your hardware and a lot of other factors (storage speed, plugin speed/size, which plugins you're using, etc.). It's almost impossible for us to know because the programs you're going to execute on it (the plugins) are an unknown.

XI prior to 5.7 would likely have trouble with that many but it really depends on what you do for mitigation.

Here's what I send to people who are approaching the need for mod_gearman:

Generally, at 10K total combined host/service checks we recommend that you set up a RAMDisk, and at around 20K we recommend you start looking at adding an additional XI server (or gearman), because a single server can only process so much. This may come sooner or later than 20K depending on what type of checks you are running, how many resources they use, your hardware speed, and what you're doing to mitigate the impact.
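As a rough sanity check, the scenario from the original post can be compared against those two thresholds. This is only a sketch: the 10K and 20K figures come from the paragraph above, the host and service counts come from the question, and the 5-minute check interval is an assumption for illustration (the thread does not state one). The function names are hypothetical.

```python
# Thresholds quoted in the post above (total combined host + service checks).
RAMDISK_THRESHOLD = 10_000
SCALE_OUT_THRESHOLD = 20_000


def total_checks(hosts: int, services_per_host: int) -> int:
    """One host check per host plus all of its service checks."""
    return hosts + hosts * services_per_host


def checks_per_second(total: int, interval_minutes: float) -> float:
    """Average rate needed to run every check once per interval."""
    return total / (interval_minutes * 60)


def recommendation(total: int) -> str:
    """Map a check count onto the rough guidance from the post."""
    if total >= SCALE_OUT_THRESHOLD:
        return "add workers (mod_gearman) or another XI server"
    if total >= RAMDISK_THRESHOLD:
        return "set up a RAMDisk"
    return "single server, no special mitigation"


if __name__ == "__main__":
    total = total_checks(hosts=8000, services_per_host=10)
    # ASSUMPTION: every check runs on a 5-minute interval.
    rate = checks_per_second(total, interval_minutes=5)
    print(f"{total} checks, about {rate:.0f} per second: {recommendation(total)}")
```

With 8000 hosts and 10 services each, the total is 88,000 checks, well past the 20K mark where extra workers or servers are suggested.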

You can read more about setting up a RAMDisk here:

https://assets.nagios.com/downloads/nag ... giosXI.pdf

You should run this check profiler script to find your long-running checks. They consume resources the whole time they are running, so reducing them helps a lot:

https://exchange.nagios.org/directory/P ... me/details
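For a quick look without the linked script, the same idea can be sketched by reading `check_execution_time` out of the Nagios status file. This is not the profiler from the link above, just a minimal illustration: the sample data is made up, `slowest_checks` is a hypothetical helper, and the location of `status.dat` on your system is an assumption to verify (it moves if you use a RAMDisk).

```python
import re

# Made-up sample in the status.dat block format, for illustration only.
SAMPLE_STATUS = """\
servicestatus {
	host_name=web01
	service_description=HTTP
	check_execution_time=0.212
	}
servicestatus {
	host_name=db01
	service_description=Backup Age
	check_execution_time=14.730
	}
hoststatus {
	host_name=web01
	check_execution_time=4.004
	}
"""

# Matches one hoststatus/servicestatus block and captures its body.
BLOCK_RE = re.compile(r"^(hoststatus|servicestatus) \{\n(.*?)^\s*\}", re.M | re.S)


def slowest_checks(status_text, top=10):
    """Return (seconds, name) pairs for the slowest checks, longest first."""
    results = []
    for kind, body in BLOCK_RE.findall(status_text):
        # Each body line is key=value; values may themselves contain "=".
        fields = dict(
            line.strip().split("=", 1)
            for line in body.splitlines()
            if "=" in line
        )
        if "check_execution_time" not in fields:
            continue
        name = fields.get("host_name", "?")
        if kind == "servicestatus":
            name += " / " + fields.get("service_description", "?")
        results.append((float(fields["check_execution_time"]), name))
    return sorted(results, reverse=True)[:top]


if __name__ == "__main__":
    for seconds, name in slowest_checks(SAMPLE_STATUS, top=5):
        print(f"{seconds:8.3f}s  {name}")
```

To try it on a real system, replace `SAMPLE_STATUS` with the contents of your own status file.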

See here for gearman:

https://assets.nagios.com/downloads/nag ... ios_XI.pdf
https://support.nagios.com/kb/article.php?id=484

NOTE: Make sure that you follow the "Remote Worker Considerations" and the "Host groups and Service groups" sections from the second link above, and then follow the "Disable Worker" section from the first link once you've set up your exclude groups.

Please read through this doc as well:

https://assets.nagios.com/downloads/nag ... ios-XI.pdf

You can only do so much on a single server; eventually you'll hit a limit, but many things affect that number, so it's almost impossible for us to guess. This is the best you can do:

https://support.nagios.com/kb/article/n ... g-523.html

Let me know if you have any questions or if I can clarify anything.
davis3792_2
Posts: 2
Joined: Mon Jun 08, 2020 9:29 am

Re: scaling Nagios with mod_gearman

Post by davis3792_2 »

Thanks for the reply.

But you didn't answer my question.

I'm trying to do a reality check to see if anyone has actually achieved large-scale horizontal scalability with mod_gearman. Words like "probably, possibly, should, likely, maybe, I think" are an immediate turn-off. I'm looking for an absolute answer.

Do you know of any real world cases where customers have done this? If not, we aren't likely to take on the risk of trying to be the first.

Thanks
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: scaling Nagios with mod_gearman

Post by benjaminsmith »

Hi @davis3792_2,

That looks like an exciting project, and I would highly recommend reaching out to our sales team ([email protected]). They would be more than happy to offer recommendations on architecting these large installations as there are several different approaches you can implement depending on all of the requirements.

Regards,
Benjamin Smith
Technical Support Manager