Recommendations Required - Improvement in GUI Speed

Locked
delboy1966
Posts: 94
Joined: Thu Oct 22, 2015 5:26 am

Post by delboy1966 »

We have a pretty big Nagios environment which works pretty well, apart from the web GUI being slow on pages that use extinfo.cgi and status.cgi.
Just wondering if anyone has found ways to reduce the time these take to run and present results.

Our setup is:
Nagios Core 4.3.4 (I know 4.4.2 is out but currently mod_gearman doesn't work with the new version)
8 x distributed boxes running mod_gearman to do all the Nagios checks. (The main Nagios box does very little in running checks itself.)

We have 3280 hosts and 29732 services being checked and this works pretty well; all checks run more or less when they should, with hardly any latency.

But when we call status.cgi to get reports, it can take up to 8 seconds before it returns the data.
The main Nagios box is a VM with 64 GB of RAM and 6 CPUs, and we use a RAM disk for status.dat and the spool directory to try to speed up reads and writes.
We are planning to increase the RAM to 128 GB and the CPUs to 10 in the next week or so, but just wondered if there were any other tips we could implement.

Thanks in advance.

Tony
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Recommendations Required - Improvement in GUI Speed

Post by npolovenko »

Hello, @delboy1966. I don't think there is anything wrong with your system. With 30 thousand checks, an 8-second load time doesn't seem very unusual. The problem with status.cgi is that it loads all 30+ thousand check results each time you load the page. The time Nagios spends loading the CGI pages depends heavily on I/O wait, and hard drive speed plays a major role in I/O wait.
On the other hand, I looked through the Nagios Core issues on GitHub and saw at least a couple where people complained that status.cgi uses all the available CPU, and it seems that our developers are planning to improve this component in Core 5.x.
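
To get a rough feel for the cost of reading every check result on each page load, here is a minimal Python sketch that scans a status.dat-style file from start to finish and counts the blocks it finds. It is only an illustration, not how status.cgi is actually implemented (the CGIs are written in C). It assumes the usual block layout, where a block opens with a line such as `servicestatus {` and closes with a line containing only `}`. The function names `count_status_blocks` and `timed_count` are made up for this example.

```python
import time
from collections import Counter


def count_status_blocks(lines):
    """Count blocks (e.g. 'hoststatus', 'servicestatus') in status.dat-style lines.

    Assumes a block opens with '<name> {' and closes with a line that is just '}'.
    """
    counts = Counter()
    in_block = False
    for line in lines:
        stripped = line.strip()
        if not in_block and stripped.endswith("{"):
            # Block header: everything before the brace is the block type.
            counts[stripped[:-1].strip()] += 1
            in_block = True
        elif in_block and stripped == "}":
            in_block = False
    return counts


def timed_count(path):
    """Scan the whole file once and return (counts, elapsed_seconds)."""
    start = time.perf_counter()
    with open(path, encoding="utf-8", errors="replace") as handle:
        counts = count_status_blocks(handle)
    return counts, time.perf_counter() - start
```

Calling `timed_count()` with the path given by `status_file` in your nagios.cfg shows how long one full pass over the file takes on your hardware. This measures file reading and very light parsing only, so treat it as a lower bound and not as a prediction of the CGI's run time.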