This is again related to the same issue discussed in
http://support.nagios.com/forum/viewtop ... 16&t=32043
It was OK for a while after we offloaded everything to a RAM disk.
However, the issue came back.
It is extremely frustrating to babysit Nagios all day because the scheduling just drops to about 10.
Disabled mod_gearmand. Still same issue.
CPU, memory, and I/O are all OK.
FYI, we are running all active checks only. About 2050 hosts, 17000 services with checks every 5 minutes.
Is this a Nagios limitation?
Would appreciate a fast resolution to this issue.
Nagios Scheduling Issue
5 x Nagios 5.6.9 Enterprise Edition
RHEL 6 & 7
rrdcached & ramdisk optimisation
Re: Nagios Scheduling Issue
Quote: "FYI, we are running all active checks only. About 2050 hosts, 17000 services with checks every 5 minutes. Is this a Nagios limitation?"
We don't have an "official" document on this, but Nagios XI can monitor up to 20000 services (I mean checks in total -> hosts + services), provided the general guidelines on the hardware requirements needed to run XI have been followed.
Having said that, I would like to point out that this is all relative. What we are talking about here is a clean, "vanilla" setup with a mixture of active and passive checks, fast hard drives, etc. Each environment is different, though. If you are running mostly or only active checks, and you have lots of CPU-intensive checks (vmware, snmp, perl scripts, etc.), performance will suffer.
If you have done everything that you could to tweak your configs, and boost the performance but you are still having issues, I would recommend adding another XI instance (splitting your existing XI instance).
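As a rough sanity check on the figures quoted in this thread, the arithmetic can be sketched as below. It assumes every host and service check runs on the same 5-minute interval (the original post does not say whether host checks use that interval), and it treats the 20000 figure as the informal guideline mentioned above, not a hard limit. The function name check_load is made up for illustration.

```python
def check_load(hosts, services, interval_minutes, guideline=20000):
    """Return (total checks, average checks per second, fraction of guideline).

    Assumption: all checks share one interval; 'guideline' is the informal
    20000-check estimate from this thread, not an enforced Nagios limit.
    """
    total = hosts + services
    per_second = total / (interval_minutes * 60)
    return total, per_second, total / guideline


total, rate, ratio = check_load(2050, 17000, 5)
print(f"{total} checks, {rate:.1f} checks/s, {ratio:.0%} of guideline")
# prints: 19050 checks, 63.5 checks/s, 95% of guideline
```

So the setup described in the first post is already close to the estimate, averaging roughly 64 active checks per second.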
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Nagios Scheduling Issue
lmiltchev wrote: We don't have an "official" document on this, but Nagios XI can monitor up to 20000 services (I mean checks in total -> hosts + services), provided the general guidelines on the hardware requirements needed to run XI have been followed.
The CPU, memory, and I/O are all OK.
Only a 20000 limit? Is this a typo?
Re: Nagios Scheduling Issue
Quote: "Only 20000 limit? Is this a typo?"
20,000 is an estimate of the number of checks a single Nagios 4.x based server can handle without performance-enhancing modifications. I see that you have a RAM disk in place, which would speed things up a bit - and of course there's mod_gearman, which further increases that threshold. I understand that you also have mod_gearman in place.
Quote: "Is this a Nagios limitation?"
It's not a 'hard' limitation, but there are boundaries to what a single Nagios server can process - which is why lmiltchev suggested splitting your XI server in two.
From your old thread:
Before upgrade
Nagios 2014R1.2 with check_gearman: version 1.4_nagios4 running on libgearman 0.25
Everything was scheduling fine.
After upgrade
Nagios 2014R2.6 with check_gearman: version 1.5.0b1 running on libgearman 1.1.8
Serious scheduling issues, hovering most of the time around 60 with rare bursts around 800.
CPU seems OK, memory seems OK, so it must be some other bottleneck.
NDOUtils OK, no crashed tables in MySQL.
Just to clarify, you didn't increase the number of checks being done between these upgrades?
Re: Nagios Scheduling Issue
Jolson:
We add hosts and services almost every day, some days a lot more than others. Sorry, I do not have the details.
We currently have 2600 hosts and 21742 services in our first instance of Nagios XI.
Some updates. Did a remote session with Andy a few weeks ago.
After the following was set in nagios.cfg all the scheduling problems went away.
I think it also solved the Apply Configuration problem as adding back sudo for nagios is still ok now.
auto_reschedule_checks=0
use_retained_scheduling_info=1
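For anyone who wants to confirm that their own nagios.cfg carries the same two values, here is a minimal sketch. The helper names parse_nagios_cfg and mismatches are hypothetical, and the sample text stands in for a real file; it only reads simple key=value lines and is not a full parser for the Nagios config format.

```python
# Values reported in this thread as resolving the scheduling problem.
WANTED = {
    "auto_reschedule_checks": "0",
    "use_retained_scheduling_info": "1",
}


def parse_nagios_cfg(text):
    """Collect key=value directives, skipping blank lines and '#' comments.

    Repeated keys (e.g. cfg_file) keep only the last value, which is
    acceptable here because the two directives of interest are scalars.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def mismatches(settings, wanted=WANTED):
    """Return directives whose value differs from the wanted one (None if absent)."""
    return {k: settings.get(k) for k, v in wanted.items() if settings.get(k) != v}


sample = """
# excerpt standing in for a real nagios.cfg
auto_reschedule_checks=0
use_retained_scheduling_info=1
"""
print(mismatches(parse_nagios_cfg(sample)))
# prints: {}
```

An empty result means both directives match; to check a real installation, read the file contents into a string and pass that in place of sample.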
Re: Nagios Scheduling Issue
I'm glad to hear that you and Andy got this worked out. Is there anything further I can help you with here?
Re: Nagios Scheduling Issue
jolson wrote: I'm glad to hear that you and Andy got this worked out. Is there anything further I can help you with here?
No. Please close this ticket. Thanks.