Physical Database Sizing Recommendations

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
[email protected]
Posts: 18
Joined: Mon Jul 06, 2020 10:21 pm
Location: Portland, ME

Physical Database Sizing Recommendations

Post by [email protected] »

Hello,
We are going to pursue a physical database for our Nagios XI installation. We anticipate we will have over 1,500 hosts and 6K to 8K services by 2Q2021. I have not been able to find any documentation that provides physical server specifications for memory (RAM) and disk space at this upper threshold of objects. Please point me to some documentation or provide calculations to help us size the physical server we will be purchasing.

Here is the best that I have been able to find.
More than 500 hosts, more than 2,500 services:
At least 120 GB disk space, more than 4 cores, and more than 8 GB RAM.

Thank you,
Deirdre
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Physical Database Sizing Recommendations

Post by ssax »

We don't have any sizing guidelines for the offloaded DB, but a good start would be at least 120 GB disk space, 4-8 cores, and around 8-16 GB of RAM. The faster the storage speed the better!
[email protected]
Posts: 18
Joined: Mon Jul 06, 2020 10:21 pm
Location: Portland, ME

Re: Physical Database Sizing Recommendations

Post by [email protected] »

Thanks so much -- confirms why we couldn't find any docs. Is offloading the database to a VM vs. a physical server acceptable?
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Physical Database Sizing Recommendations

Post by ssax »

It can be a VM. A lot of customers run the DB in VMs; just make sure it has fast disks, as that's one of the main factors in SQL database speed.

That size is still pretty small for an XI system (depending on your setup). Is there a specific issue you're running into that is prompting you to offload the DB, or are you just being proactive?
[email protected]
Posts: 18
Joined: Mon Jul 06, 2020 10:21 pm
Location: Portland, ME

Re: Physical Database Sizing Recommendations

Post by [email protected] »

Thanks for your reply. Sorry for the delay in getting back to you. We are straight out with the productionalization of Nagios XI over the next few weeks :D

Yes, being proactive. I do have a question: you mention that 1,500 hosts and 8,000 services is relatively small. What host/service check count would be considered medium, and what would be considered large? What are some of the largest Nagios XI check counts, and how are those systems set up? Mostly a separate DB, or all on the same box?

Thanks,
Deirdre
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Physical Database Sizing Recommendations

Post by ssax »

For a single XI system without Gearman, 20K checks would be a large system. With Gearman offloading the load you can fit a lot more, because the checks are actually run by external workers, which reduces the impact on the XI system.

Here's what I send people who are approaching 10K total checks:

Generally, at 10K total combined host/service checks we recommend that you set up a RAMDisk, and at around 20K we recommend you start looking at adding an additional XI server, because a single server can only process so much. This may come sooner or later than 20K depending on what type of checks you are running, how many resources they use, your hardware speed, and what you're doing to mitigate the impact.
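
As a rough illustration of those two thresholds, here is a minimal Python sketch. The function name and return strings are made up for this example, and the 10K and 20K cut-offs are only the approximate figures quoted above, not hard limits.

```python
def suggest_next_step(total_checks: int) -> str:
    """Map a combined host + service check count to the rough guidance above.

    The thresholds are the approximate 10K / 20K figures from this post;
    real limits depend on check types, hardware, and mitigation in place.
    """
    if total_checks < 0:
        raise ValueError("total_checks must be non-negative")
    if total_checks >= 20_000:
        return "look at adding an additional XI server"
    if total_checks >= 10_000:
        return "set up a RAMDisk"
    return "no change suggested by these thresholds"


if __name__ == "__main__":
    # 1,500 hosts plus 8,000 services, the projection from the original question.
    print(suggest_next_step(1_500 + 8_000))
```

Treat the output as a prompt to start planning, since the point at which a given system struggles can come earlier or later than these numbers.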

You can read more about setting up a RAMDisk here:

https://assets.nagios.com/downloads/nag ... giosXI.pdf

You should run this check profiler script to see which of your checks are long running. They consume resources the whole time they are running, so reducing them helps a lot:

https://exchange.nagios.org/directory/P ... me/details
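
If you just want a quick look before running the linked profiler, the sketch below shows the general idea in Python. It is not the linked script. It assumes the Nagios status file (commonly /usr/local/nagios/var/status.dat on XI, though your path may differ) contains hoststatus and servicestatus blocks with a check_execution_time field in seconds; the function names and sample data are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple


def parse_status_blocks(text: str) -> List[Dict[str, str]]:
    """Collect key=value pairs from hoststatus/servicestatus blocks."""
    blocks: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line in ("hoststatus {", "servicestatus {"):
            current = {"_type": line.split()[0]}
        elif line == "}":
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return blocks


def slowest_checks(text: str, top: int = 10) -> List[Tuple[float, str]]:
    """Return (execution_seconds, name) pairs, slowest first."""
    rows: List[Tuple[float, str]] = []
    for block in parse_status_blocks(text):
        try:
            seconds = float(block.get("check_execution_time", ""))
        except ValueError:
            continue  # block has no usable execution time
        name = block.get("host_name", "?")
        if block["_type"] == "servicestatus":
            name += " / " + block.get("service_description", "?")
        rows.append((seconds, name))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:top]


# Made-up sample in the assumed status file layout.
SAMPLE = """\
hoststatus {
	host_name=web01
	check_execution_time=0.012
	}
servicestatus {
	host_name=web01
	service_description=HTTP
	check_execution_time=4.250
	}
servicestatus {
	host_name=db01
	service_description=Disk
	check_execution_time=0.300
	}
"""

if __name__ == "__main__":
    for seconds, name in slowest_checks(SAMPLE, top=2):
        print(f"{seconds:8.3f}s  {name}")
```

To try it against a real system you would read the status file into a string and pass it to slowest_checks, after confirming the field names match what your Nagios version writes.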

The next step would be to look at offloading the checks using mod_gearman to reduce the impact on the XI server. This is what I would recommend for adding more services and alleviating system issues. There's just so much going on at around 20K checks that you will need to do what you can to mitigate the impact, such as using mod_gearman. Please see here for more information:

https://assets.nagios.com/downloads/nag ... ios_XI.pdf
https://support.nagios.com/kb/article.php?id=484

NOTE: Make sure that you follow the "Remote Worker Considerations" and the "Host groups and Service groups" sections from the second link above, and then follow the "Disable Worker" section from the first link once you've set up your exclude groups.

Please read through this doc as well. That said, given the large number of total checks you have, I would leave the DB local at this point in time, because it requires a lot of throughput to the DB:

https://assets.nagios.com/downloads/nag ... ios-XI.pdf

You can only do so much on a single server, so you'll need to do what you can to mitigate the impact.
[email protected]
Posts: 18
Joined: Mon Jul 06, 2020 10:21 pm
Location: Portland, ME

Re: Physical Database Sizing Recommendations

Post by [email protected] »

Hello,

I was reviewing your last response and noticed you mention setting up a separate XI server and Mod-gearman as solutions. I want to confirm these are two separate solutions: Mod-gearman as a way to manage check/latency as the environment nears 20K total checks, and then, once the environment exceeds 20K checks, a separate XI instance stood up to handle growth beyond 20K checks (obviously give or take on these numbers, depending on environment, types of checks, etc.). Is that what you are implying? If so, do the two XI instances talk to each other?

We've been asked to add our EU locations into the scope, and this would put us over 20K total checks by the end of 2021. I want to be sure I understand the options for handling this growth.

Thanks again,
Deirdre
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Physical Database Sizing Recommendations

Post by ssax »

They are separate solutions. Either way, at some point you'll likely hit the limit of what your single XI server can handle, and your next step then would be to spin up another XI server. Setting up mod_gearman doesn't give you an unlimited number of services. It lets you add more by taking the load of performing checks off the XI server, but eventually you'll hit the limit of what your system can handle. That limit varies a lot based on system specs, storage speed, network speed, the load of the systems you're checking, the plugins you use, etc.

Mod_gearman is just for offloading the checks to an external worker server so that the load of those checks does not impact the XI server, which lets you get more services on a single XI system than you would without it. You will likely need to plan for mod_gearman given you are estimating 20K+ checks. I've seen single XI servers running 40-60K checks with mod_gearman, but they had great specs, no bottlenecks, and fast storage. The only way you'll know how your system performs at that level is to baseline test it:

https://support.nagios.com/kb/article/n ... g-523.html
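
Before a baseline test, a back-of-envelope estimate can help frame what to expect. The Python sketch below applies Little's law (average checks in flight = check rate x average execution time). The 5-minute check interval and 1.5-second average execution time are assumptions picked for illustration, not figures from this thread, and the function names are made up. It is not a substitute for the baseline test.

```python
def checks_per_second(total_checks: int, interval_seconds: float) -> float:
    """Average check rate if every check runs once per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return total_checks / interval_seconds


def average_concurrent_checks(
    total_checks: int, interval_seconds: float, avg_execution_seconds: float
) -> float:
    """Little's law: checks in flight = arrival rate * time each one runs."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return total_checks * avg_execution_seconds / interval_seconds


if __name__ == "__main__":
    total = 20_000      # total checks, the figure discussed in this thread
    interval = 300.0    # assumed: every check runs every 5 minutes
    avg_runtime = 1.5   # assumed: average plugin execution time in seconds
    print(f"{checks_per_second(total, interval):.1f} checks/second")
    print(f"{average_concurrent_checks(total, interval, avg_runtime):.0f} checks running on average")
```

The estimate shows why long-running checks matter: halving the average execution time halves the number of checks in flight at any moment.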
[email protected]
Posts: 18
Joined: Mon Jul 06, 2020 10:21 pm
Location: Portland, ME

Re: Physical Database Sizing Recommendations

Post by [email protected] »

Sean, post our QuickStart session yesterday, this ticket can be closed out.
Thank you again,
Deirdre
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Physical Database Sizing Recommendations

Post by scottwilkerson »

[email protected] wrote:Sean, post our QuickStart session yesterday, this ticket can be closed out.
Thank you again,
Deirdre
Locking thread
Former Nagios employee