Spike in host down showing in Host Status Summary

Posted: Tue Jan 24, 2012 7:37 pm
by uhiadmin
Linux Distribution and version?
1. CentOS 5.5, 64-bit
2. Manual Install
3. Gnome installed

To Nagios XI Tech Support Specialist:
Our Host Status Summary spikes to 500 hosts down when only three are actually down. In CPU Stats, I/O Wait turns red when it reaches a high percentage. Do I have to adjust the php.ini file now that we have added more hosts and services?

Total Hosts: 3367

Re: Spike in host down showing in Host Status Summary

Posted: Wed Jan 25, 2012 8:44 am
by scottwilkerson
This sounds like your Nagios XI machine may be overloaded. Can you post a screenshot of the server statistics on the XI homepage? Also, how many CPUs and how much RAM does this system have, and how frequently do your checks run?

Also, you may want to take a look at the following page on boosting XI performance:
http://assets.nagios.com/downloads/nagi ... p#boosting

Re: Spike in host down showing in Host Status Summary

Posted: Wed Jan 25, 2012 2:54 pm
by uhiadmin
Sure,
I will send one to you when it happens again. Last time this happened we adjusted something in the php.ini file. Do we have to do that again now that we have added 2000 more hosts to be monitored?

Current settings in php.ini:

max_execution_time = 30 ; Maximum execution time of each script, in seconds
max_input_time = 60 ; Maximum amount of time each script may spend parsing request data
memory_limit = 128M ; Maximum amount of memory a script may consume

What they need to be changed to:

max_execution_time = 60 ; Maximum execution time of each script, in seconds
max_input_time = 60 ; Maximum amount of time each script may spend parsing request data
memory_limit = 256M ; Maximum amount of memory a script may consume

4. Restart the Apache web server:

service httpd restart
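
To confirm which values are actually in the file before and after the edit, the three directives can be read back with a small script. This is only a minimal sketch: `ini_value` is a hypothetical helper name, and `/etc/php.ini` is an assumed location (the usual one for packaged PHP on CentOS; a manual install may keep the file elsewhere).

```shell
#!/bin/sh
# ini_value FILE DIRECTIVE
# Hypothetical helper: print the value of one directive from a php.ini-style file.
ini_value() {
    awk -F= -v key="$2" '
        /^[ \t]*;/ { next }                     # skip full-line comments
        {
            k = $1
            gsub(/[ \t]/, "", k)                # strip whitespace from the directive name
            if (k == key) {
                v = $2
                sub(/;.*/, "", v)               # drop the trailing comment
                gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim whitespace around the value
                val = v                         # remember the last occurrence in the file
            }
        }
        END { if (val != "") print val }
    ' "$1"
}

# Assumed path; override with PHP_INI=/path/to/php.ini if yours differs.
PHP_INI="${PHP_INI:-/etc/php.ini}"
if [ -r "$PHP_INI" ]; then
    for d in max_execution_time max_input_time memory_limit; do
        echo "$d = $(ini_value "$PHP_INI" "$d")"
    done
fi
```

This only reports what is written in the file. Apache still has to be restarted before the web interface picks up a change.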

Re: Spike in host down showing in Host Status Summary

Posted: Wed Jan 25, 2012 3:04 pm
by scottwilkerson
Before making a recommendation I'd like to know what is getting loaded.

It sounds like you are getting a high load on your server from DISK I/O.
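
One way to put a number on the I/O wait outside the XI interface is to sample the aggregate `cpu` line in `/proc/stat` twice and compare the counters. The following is a rough sketch, not part of Nagios XI: `iowait_pct` is a hypothetical helper name and the 2-second default interval is an arbitrary choice.

```shell
#!/bin/sh
# iowait_pct: hypothetical helper.
# Reads two or more "cpu ..." lines from /proc/stat on stdin (oldest first) and
# prints, for each consecutive pair, the percentage of CPU time spent in iowait.
iowait_pct() {
    awk '$1 == "cpu" {
            total = 0
            # fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            if (seen) {
                dt = total - prev_total
                if (dt > 0) printf "%.1f\n", 100 * ($6 - prev_wait) / dt
                else print "0.0"
            }
            prev_total = total
            prev_wait = $6          # field 6 is the iowait counter
            seen = 1
        }'
}

INTERVAL="${INTERVAL:-2}"   # seconds between the two samples (arbitrary default)
if [ -r /proc/stat ]; then
    { grep '^cpu ' /proc/stat; sleep "$INTERVAL"; grep '^cpu ' /proc/stat; } | iowait_pct
fi
```

A value that stays high while the host-down count spikes would be consistent with the disk being the bottleneck.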

Take a look at
http://assets.nagios.com/downloads/nagi ... p#boosting

which lists a number of items for improving performance on your XI server.

Re: Spike in host down showing in Host Status Summary

Posted: Wed Jan 25, 2012 3:41 pm
by uhiadmin
Attached is what I am seeing on the system. Let me know if you can see the image.

Re: Spike in host down showing in Host Status Summary

Posted: Wed Jan 25, 2012 3:56 pm
by scottwilkerson
That's what I thought.

I would recommend looking at this post and the one below it for some suggestions.

Re: Spike in host down showing in Host Status Summary

Posted: Wed Jan 25, 2012 6:32 pm
by uhiadmin
It's showing a spike again.

Re: Spike in host down showing in Host Status Summary

Posted: Thu Jan 26, 2012 11:02 am
by mguthrie
Can you run:

Code:

service nagios stop
killall -9 nagios
service nagios start
I just want to verify that there aren't multiple instances of nagios running.

Also, I'm noticing that the check results for those down hosts do appear to be valid, so I'm wondering if the actual monitoring server is losing connectivity for a few minutes or seconds. If so, Nagios would have a HUGE number of event handlers, retries, and notifications to deal with, and I'm guessing this would cause your CPU spike.
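
To test the connectivity theory, something like the following could be run from cron on the monitoring server and its output compared against the times of the spikes. It is a sketch under assumptions: `packet_loss` is a hypothetical helper name, `TARGET` has to be set by you to a host or gateway the XI server should always be able to reach, and the parsing assumes the summary line printed by the standard Linux `ping`.

```shell
#!/bin/sh
# packet_loss: hypothetical helper.
# Reads ping output on stdin and prints the packet loss percentage
# taken from the summary line ("... N% packet loss ...").
packet_loss() {
    sed -n 's/.* \([0-9.][0-9.]*\)% packet loss.*/\1/p'
}

# TARGET is an assumption: set it to an address that should always be reachable.
TARGET="${TARGET:-}"
if [ -n "$TARGET" ]; then
    loss=$(ping -c 5 "$TARGET" 2>/dev/null | packet_loss)
    echo "$(date '+%Y-%m-%d %H:%M:%S') $TARGET loss=${loss:-unknown}%"
fi
```

If the logged loss jumps at the same moments the Host Status Summary shows hundreds of hosts down, the problem is more likely on the network side than in Nagios itself.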