Hi,
We are implementing a new Nagios instance for a client and have currently configured close to 2500 service checks in addition to 420+ hosts.
Our server configuration:
Red Hat Enterprise Linux 5 -Virtual Server
Processor Qty: 4,
Processor Speed: 2.66 GHz,
Total Memory: 8GB,
HD Capacity: 128GB, HD Config: "/" 46GB + "/apps" 64GB
We are monitoring about 110 servers using NSClient for local parameters such as CPU, memory, services, and page file, and about 300 websites for DNS resolution, HTTP and HTTPS ports, and ping. Polling intervals range from 5 to 15 minutes.
My CPU load is hovering around 3-4 for the 1, 5, and 15 minute averages. However, when I look at the monitoring info, service check latency is more than 2 seconds at the moment. Please advise whether this is within acceptable limits for a small environment like this.
Also, I can't see the Monitoring Engine Event Queue graph on the admin page at the moment. Is there any location where we can download this graph and also get the historical data for "Monitoring Engine Check Statistics" and "Monitoring Engine Performance"?
Nagios Server Performance
Re: Nagios Server Performance
MSPk wrote: My cpu Load is hovering around 3-4 for 1,5,15 mins. however when I look at the monitoring info, service check latency is more than 2 sec at the moment. Please suggest if this is in acceptable limits for a small environment like this.
With 4 CPUs, a load average of 3-4 is *acceptable*, as it will not cause CPU wait, but it is definitely higher than expected for the number of checks you have set up. What are the intervals on these checks? How many checks run every 5 minutes?
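To make the "acceptable with 4 CPUs" reasoning concrete, here is a minimal sketch that divides a load average by the number of cores. The helper name `load_per_core` and the 1.0-per-core threshold are illustrative assumptions, not an official Nagios rule; the sample loads are the figures reported in this thread.

```python
def load_per_core(load_average: float, cores: int) -> float:
    """Return the load average divided by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return load_average / cores


CORES = 4  # from the server configuration in the first post

# Load figures mentioned in this thread (normal range, then the spikes).
for load in (3.0, 4.0, 8.0, 17.0):
    ratio = load_per_core(load, CORES)
    # Assumption: roughly 1.0 per core is where processes start queuing for CPU.
    verdict = "within capacity" if ratio <= 1.0 else "over capacity"
    print(f"load {load:>4.1f} on {CORES} cores = {ratio:.2f} per core ({verdict})")
```

On this simple measure a load of 3-4 keeps the box at or below one runnable process per core, which matches the reply above, while the spikes reported later in the thread would not.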
MSPk wrote: Also, I cant see the monitoring Engine Event Queue graph currently on the admin page. Is there any location where we can download this graph and also get the historical data for the "Monitoring Engine Check Statistics" and "Monitoring Engine Performance".
The event queue graph requires internet access, as it relies on Google's chart API. If you have internet access but the graph is still not working, the cause may be proxy related, or security policies in your environment may not trust chart.googleapis.com.
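One rough way to test the connectivity part is to try opening a TCP connection to the chart host. This is only a sketch: the helper name `can_reach` is hypothetical, port 443 is an assumption, and a direct socket test does not go through a web proxy, so a failure here does not rule out a working proxied connection.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a direct TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and connects; any DNS,
        # routing, refusal, or timeout problem surfaces as OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_reach("chart.googleapis.com")` run on the machine that needs to fetch the chart returns False when DNS, routing, or a firewall blocks the direct connection.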
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Nagios Server Performance
There are 1650 service checks and 300 host checks with the polling interval set to 5 minutes, and about 900 service checks with a 15 minute polling interval.
Re: Nagios Server Performance
From the monitoring engine statistics page, the average number of checks in 5 minutes ranges from 1800 to 2100.
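As a quick sanity check on these figures, the expected number of checks per 5 minute window can be worked out from the counts and intervals given in the previous post. This is a back-of-the-envelope sketch that assumes every check runs exactly once per interval and ignores retries; the function name `checks_per_window` is made up for illustration.

```python
def checks_per_window(groups, window_minutes=5):
    """Expected checks per window for a list of (count, interval_minutes)."""
    return sum(count * window_minutes / interval for count, interval in groups)


# (number of checks, polling interval in minutes), as reported above
groups = [
    (1650, 5),   # service checks every 5 minutes
    (300, 5),    # host checks every 5 minutes
    (900, 15),   # service checks every 15 minutes
]

per_window = checks_per_window(groups)   # 1650 + 300 + 900/3
per_second = per_window / (5 * 60)
print(f"{per_window:.0f} checks per 5 minutes, {per_second:.1f} per second")
# prints "2250 checks per 5 minutes, 7.5 per second"
```

The reported average of 1800 to 2100 sits slightly below this estimate of 2250.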
Re: Nagios Server Performance
With those numbers, your 4 core system is just about on par with our suggested hardware requirements doc:
http://assets.nagios.com/downloads/nagi ... ements.pdf
Former Nagios employee
Re: Nagios Server Performance
I agree that the hardware recommendation fits 2.5k services + 250 host checks, but something looks out of shape. We recently implemented monitoring for around 300+ servers and 100 network elements, and even now the load on that system is always around 1-1.5. For the current setup I see the CPU load going up to 7-8 at times, and it reaches 14-17 whenever config changes are made on the system. Are there any checks we can run to confirm that everything is indeed fine?
Re: Nagios Server Performance
Check the performance page under Home --> Performance. How many service and host checks run every 5 minutes? You will see the load spike when you apply a config, as that uses maximum resources.
Former Nagios employee
- Box293
Re: Nagios Server Performance
Here is a suggestion that you may consider (if you haven't already).
Sometimes it's very easy to run through all the wizards and leave the default check interval at 5 minutes. What can occur is that a lot of checks are scheduled to run at the same time, and then at other times the XI host goes quiet.
Perhaps look at setting some checks to run at different intervals. For example, some checks you may still want to run frequently, so instead of 5 minutes you could use 4 minutes or 6 minutes. This shifts the scheduling so that these checks no longer all fall on the same minute.
Also, there may be some checks that don't really need to be performed every five minutes. Perhaps DNS resolution is something that might only get checked every 19 minutes, for example.
I also use Nagios XI for auditing, to ensure standards we have enforced remain in place, or for data gathering purposes. I schedule a lot of these checks to run every 1200 minutes (20 hours).
I hope this is of some help to you.
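The effect of mixing intervals can be illustrated with a toy model. It assumes all checks start aligned at minute 0, which is deliberately pessimistic, since the real scheduler may already spread checks out on its own. The function names and the 300-check example are made up for illustration.

```python
from collections import Counter


def checks_due_per_minute(groups, horizon_minutes):
    """Count checks due at each minute for a list of (count, interval_minutes).

    Assumption: every check starts at minute 0 and then repeats exactly on
    its interval. Minute 0 itself is skipped, since everything is aligned there.
    """
    due = Counter()
    for count, interval in groups:
        for minute in range(interval, horizon_minutes, interval):
            due[minute] += count
    return due


def peak_checks(groups, horizon_minutes=60):
    """Largest number of checks falling on a single minute within the horizon."""
    return max(checks_due_per_minute(groups, horizon_minutes).values())


uniform = [(300, 5)]                          # everything on 5 minutes
staggered = [(100, 4), (100, 5), (100, 6)]    # same checks split over 4/5/6

print(peak_checks(uniform))     # 300
print(peak_checks(staggered))   # 200
```

In this model the 4, 5, and 6 minute groups only all coincide once an hour (their least common multiple is 60 minutes), so the busiest minute within the hour carries fewer checks than when everything shares one interval.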
Re: Nagios Server Performance
Box293 wrote: Perhaps look at setting some checks to run at different intervals.
This. Some checks, like disk space checks, among others, do not need to be run every 5 minutes. Choose your intervals wisely if you are on a large install.
Former Nagios employee
Re: Nagios Server Performance
Yes, I have already fine-tuned the polling intervals and the thresholds for the different zones (dev being checked every 15 minutes and prod every 5 minutes). I believe there is some issue with the PHP files (dashboards) and the MySQL instance. Every time we log in, the load on the initial screen is around 3-4; once we start using the dashboards and try to extract reports, the CPU load spikes erratically. mysqld takes 90-100% CPU whenever we click on the notifications or try to extract the notifications report, and it just crashes without showing any info.