NagiosXI Performance issues

Posted: Fri Sep 27, 2013 8:59 am
by anoop
Hi Team,

We have currently added 1,200 hosts and 5,000 active service checks, and we are going to grow to 4,000 hosts and 40,000 services in total.
Our problem occurs while adding new devices and applying the settings:
1. It takes too much time to complete the configuration for a single host.
2. We frequently get errors such as "apply configuration failed" and "write configuration failed".
3. When I review the snapshots, everything is fine, without any errors.
4. The reports page takes too much time to open.
5. We maintain a 5-minute interval for all 5,000 checks.
6. For network devices we use SNMP; there are nearly 700 network devices monitored via SNMP.
The Nagios XI server specs are as follows:
32GB RAM
16 vCPUX2
Offloaded DB on a remote server
Sometimes we see a blue mark for Active Host Checks/Service Checks and Notifications, which are often getting disabled.
I also tried checking with:
#/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
The output shows "no errors and no warnings".
Please let us know the performance tuning parameters for better performance of the Nagios XI server with 4,000 devices and 40,000 checks, of which 15,000 will be SNMP checks for network devices and the remainder will be active/passive checks.
We maintain only a single Nagios XI server with an offloaded DB.
Please help us resolve the problem.
Thanks in advance for your help.

Re: NagiosXI Performance issues

Posted: Fri Sep 27, 2013 9:17 am
by slansing
Is your offloaded DB using networked drives? This can cause high latency and a slow, if not crippled, apply configuration process. Have you investigated your remote MySQL server's hardware? Do you get a specific error when trying to apply configuration, such as "cannot connect to the DB"? How many items are you adding per apply configuration?

Re: NagiosXI Performance issues

Posted: Fri Sep 27, 2013 9:19 am
by tmcdonald
Off the top of my head:

1.) Apply Config does not work on a single host/service. When you update, it checks them all, so more checks = more time. Also, as slansing said, your DB options come into play here.
2 & 3.) Care to post your profile.zip? It is under Admin -> System Config -> System Profile.
4.) This is also related to the issue of having many checks and/or a DB not optimally configured.
5.) Some checks can usually afford to be a little later. Upping even a handful of non-critical checks to 10 minutes could help speed things up.
6.) Are you using active checks or passive checks/traps? Passive will help take some of the load off.
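
To put rough numbers on point 5, here is a back-of-the-envelope sketch of the average active check rate. It assumes checks are spread evenly across the interval, which is an idealization and not a measurement of this server; `checks_per_second` is a hypothetical helper name, and the 40,000 figure is the planned total from the first post, treated here as if every check were active.

```shell
#!/bin/sh
# Average check rate = number of checks / interval in seconds.
# Assumption: checks are evenly spread over the interval.
checks_per_second() {
    # $1 = number of active checks, $2 = check interval in minutes
    awk -v n="$1" -v m="$2" 'BEGIN { printf "%.1f\n", n / (m * 60) }'
}

checks_per_second 5000 5     # current load at 5 minutes   -> 16.7
checks_per_second 40000 5    # planned total at 5 minutes  -> 133.3
checks_per_second 40000 10   # planned total at 10 minutes -> 66.7
```

Doubling the interval halves the average rate, which is why moving even some non-critical checks to 10 minutes, or converting them to passive checks, reduces the load.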

Re: NagiosXI Performance issues

Posted: Fri Sep 27, 2013 10:13 am
by anoop
Hi Team,

Thanks for the quick reply. Here are my answers to your queries:

1. We are not using any network drives; all of these servers are hosted on VMware ESXi.
2. The offloaded DB server hardware is the same as the Nagios XI server: 32GB RAM, 16 vCPU.
3. We have not faced any errors like "Cannot Connect to DB".
4. We are using active checks for network devices; nearly 3,000 checks are SNMP, and the remaining checks are also active checks, auto-discovered for public services.
5. In the future, we will grow to 40,000 checks, of which 15,000 will be SNMP/active and the remainder will be passive.

I am attaching my profile.zip file for your reference.

Please help us improve the performance of our environment.

Thanks

Re: NagiosXI Performance issues

Posted: Fri Sep 27, 2013 10:36 am
by slansing
If you apply configuration without adding any new configs or changing any current ones, does it still take a while? If so, how much time does it take? How long is "way too long"? How frequently do you get the failed apply configuration, and is it always when adding new objects? How many objects are you trying to add at once when this fails?

Can you run the following and share the contents of the output text file?

cd /usr/local/nagiosxi/scripts
./reconfigure_nagios.sh &> reconfig.txt
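
If you also want a number for "how much time does it take", a small wrapper along these lines could capture the output and the elapsed time in one go. This is only a sketch: `run_timed` is a hypothetical helper, not part of Nagios XI, and it only has one-second resolution.

```shell
#!/bin/sh
# Hypothetical helper: run a command, send stdout+stderr to a log file,
# and print the exit status and elapsed wall-clock seconds.
run_timed() {
    rt_log="$1"; shift              # first argument: log file path
    rt_start=$(date +%s)
    "$@" > "$rt_log" 2>&1           # remaining arguments: the command to run
    rt_status=$?
    rt_end=$(date +%s)
    echo "exit=$rt_status elapsed=$((rt_end - rt_start))s log=$rt_log"
    return "$rt_status"
}

# Usage with the script above (left commented out so nothing runs on paste):
# cd /usr/local/nagiosxi/scripts && run_timed reconfig.txt ./reconfigure_nagios.sh
```

Running it once with no pending changes and once after adding objects would give two timings to compare.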