NagiosXI performance issue
anoop
Posts: 95
Joined: Tue Jun 25, 2013 1:22 am

Re: NagiosXI performance issue

Post by anoop »

Hi Team,
We have currently added 1,385 hosts with 7,800 services, of which 3,900 services are SNMP-related checks for network devices.

Our expected totals are 4,000 hosts and 40,000 services, of which 15,000 will be active checks and 25,000 will be passive checks.

Please advise on how to meet this requirement.

How would a RAM disk and rrdcached be useful in this scenario?
System:
Nagios XI Version : 2012R2.2 | PHP Version: 5.3.3
Offloaded MySQL DB on another virtual machine
16 CPU with 2 cores each | 32 GB RAM | 1 TB HDD
CentOS-6.3 |Total = 4,000 hosts| 40,000 services.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: NagiosXI performance issue

Post by slansing »

I would recommend starting with a local installation of mod_gearman and then moving to remote workers when the need arises. Even a local installation can improve performance considerably. For an installation of that size you may need to look at 3 or more remote worker systems, which can be used to divvy up the processing of checks.
anoop
Posts: 95
Joined: Tue Jun 25, 2013 1:22 am

Re: NagiosXI performance issue

Post by anoop »

Hi Team,

Thank you very much for replying. We will try a local mod_gearman installation.

We have the following issues:

1. The "Monitoring Engine Process" on our Nagios XI server stops automatically.
2. Because of this process issue, graphs are sometimes not generated. How do we regenerate the missing graphs?
3. I/O wait sometimes goes above 25%.

Thanks in advance.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: NagiosXI performance issue

Post by slansing »

It looks like your system may have some deeper issues that should be resolved before adding something less crucial at the moment, such as mod_gearman. Is the nagios process running?


service nagios status

service ndo2db status

service crond status
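
The three status checks above can be wrapped in a small loop. This is only a sketch: it assumes that "service <name> status" returns a non-zero exit code when a service is stopped or stuck in a lock state, which is the usual init script behaviour on CentOS 6. The CHECK_CMD variable is a hypothetical override added for illustration so the loop can be tried without touching real services.

```shell
#!/bin/sh
# Print the names of the services whose status check fails.
# CHECK_CMD is a hypothetical override (not part of Nagios XI);
# by default the function runs "service <name> status".
check_services() {
    failed=""
    for svc in "$@"; do
        if ! ${CHECK_CMD:-service} "$svc" status >/dev/null 2>&1; then
            failed="$failed $svc"
        fi
    done
    # Prints an empty line when every service passed its check.
    echo "${failed# }"
    # Return success only if nothing failed.
    [ -z "$failed" ]
}

# Usage on the XI server:
#   check_services nagios ndo2db crond || echo "some services are not running"
```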
anoop
Posts: 95
Joined: Tue Jun 25, 2013 1:22 am

Re: NagiosXI performance issue

Post by anoop »

Hi Team,

Yes, the nagios, ndo2db, and crond services are running. However, when I check the status or restart the services, it sometimes shows "ndo2db lock" and "nagios lock", and after some time it settles properly. Sometimes graphs are not generated for a long time, 3 to 4 hours, and they come back only after I restart the monitoring engine.

Please let us know the resolution. Thanks.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: NagiosXI performance issue

Post by sreinhardt »

As we discussed earlier, the largest issue you are facing presently is MRTG in relation to your network SNMP monitoring. There is a relatively easy route to take that slansing has mentioned on page two, which is to split the configuration so that it can run these checks separately, and not in a single go. Until you make these changes, there is nothing else we can do to help you, as the massive increase in load and the performance issues are very much due to this.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
anoop
Posts: 95
Joined: Tue Jun 25, 2013 1:22 am

Re: NagiosXI performance issue

Post by anoop »

Hi Team,

As per your suggestion, I split the mrtg.cfg file into 4 files and performance is somewhat better. But yesterday I configured VMware devices using the VMware monitoring wizard for 33 base machines and 430 guest machines, with 2,500 service checks. Today the load on my XI server has increased and performance has dropped. When I apply the configuration, it takes hours.

My graphs are still not generating. I dug into some of my log files and found that "npcd.log" shows errors such as:

NPCD: ERROR: Executed command exits with return code '7'
[10-12-2013 19:11:17] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1381557654.perfdata.service.

I also tried fixing the Nagios XI permissions, but the error still exists.

Please suggest a better solution.
Thanks in advance.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: NagiosXI performance issue

Post by slansing »

You will want to open:


/usr/local/nagios/etc/pnp/npcd.cfg

And edit "load_threshold =" to a value higher than your current system load, which is astronomical right now.

Then restart npcd:


service npcd restart
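
Before editing the file, it may help to compare the configured threshold with the current load. The following is a minimal sketch, assuming the setting appears in npcd.cfg as a line of the form "load_threshold = <number>". The function names get_threshold and load_exceeds are hypothetical helpers, not part of PNP or Nagios XI.

```shell
#!/bin/sh
# Print the numeric value of the load_threshold line in the given npcd.cfg.
# Commented-out lines (starting with '#') are ignored.
get_threshold() {
    sed -n 's/^[[:space:]]*load_threshold[[:space:]]*=[[:space:]]*\([0-9.]*\).*/\1/p' "$1" | tail -n 1
}

# Succeed when the load ($1) is greater than the threshold ($2).
# awk is used because sh arithmetic cannot compare decimal numbers.
load_exceeds() {
    awk -v l="$1" -v t="$2" 'BEGIN { exit !(l + 0 > t + 0) }'
}

# Usage on the XI server (reads the 1-minute load average on Linux):
#   cfg=/usr/local/nagios/etc/pnp/npcd.cfg
#   load=$(cut -d' ' -f1 /proc/loadavg)
#   load_exceeds "$load" "$(get_threshold "$cfg")" && echo "load $load is above load_threshold"
```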

anoop wrote: "its taking hours and hours.."
This is not really possible due to timeout and memory limits in your php.ini file.

Have you checked into mod_gearman yet? You need to get these performance issues resolved before you continue to add more objects and slow your system down again.
anoop
Posts: 95
Joined: Tue Jun 25, 2013 1:22 am

Re: NagiosXI performance issue

Post by anoop »

Hi Team,

We started installing mod_gearman locally on our Nagios XI server, as we don't have other resources at present. Once we get resources, I'll install another worker on a remote server.

I installed it using the script and configured neb.conf and worker.conf, providing the Nagios XI IP address in the worker installation.

I left hosts=yes and services=yes as in the default file.

Is there anything else we need to configure in mod_gearman apart from these steps?

Please let us know if anything else is required.

Thanks


2:

Hi Team,

We are planning to configure a RAM disk.

We thought of using a separate 1 GB of SAS storage for the RAM disk, as our status.dat and object.cache files currently take up 20 MB and will grow in future. We are also planning to use an ext4 file system instead of "tmpfs".

How much impact will this configuration have?

Please suggest a better solution.


Thanks in advance.
Last edited by slansing on Tue Oct 15, 2013 10:28 am, edited 1 time in total.
Reason: Please edit your previous post to add information/questions if you are the last poster.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: NagiosXI performance issue

Post by slansing »

How much RAM do you have available on the XI system? It looks like you looked at your .dat files already, but also keep in mind that you will need extra room for things like performance data. Depending on how much comes through, you may need a larger RAM disk configuration. As far as mod_gearman goes, that looks good. Just be sure you followed the steps in the documentation and made the necessary changes that are posted there.
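
As a rough way to size the RAM disk, the current files can be added up and multiplied by a headroom factor to leave room for performance data. This is only a sketch: the factor of 4 is an arbitrary assumption rather than a Nagios recommendation, the function names are hypothetical, and the paths in the usage comment assume a default install layout.

```shell
#!/bin/sh
# Sum the sizes in bytes of the given files; missing files count as 0.
total_bytes() {
    total=0
    for f in "$@"; do
        if [ -f "$f" ]; then
            size=$(wc -c < "$f")
            total=$((total + size))
        fi
    done
    echo "$total"
}

# Suggest a RAM disk size in MB: bytes ($1) times a headroom factor
# ($2, default 4, an assumed value), rounded up to a whole MB.
suggest_ramdisk_mb() {
    bytes=$1
    factor=${2:-4}
    echo $(( (bytes * factor + 1048575) / 1048576 ))
}

# Usage on the XI server (paths assume a default install):
#   used=$(total_bytes /usr/local/nagios/var/status.dat /usr/local/nagios/var/objects.cache)
#   echo "suggested RAM disk size: $(suggest_ramdisk_mb "$used") MB"
#   free -m    # compare against the RAM actually available
```

With the 20 MB mentioned earlier in the thread and the assumed factor of 4, this gives 80 MB, well under the 1 GB that was proposed.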