mrtg.cfg is huge
Posted: Fri Oct 11, 2013 2:18 pm
by snapon_admin
I'm in the process of deploying our Nagios XI config and we've gotten to a point where our CPU load has gotten high enough to start causing noticeable performance issues. Currently we're sitting at 960 hosts and 4750 services. Doing some research I saw that a large mrtg.cfg file can be one of the culprits for high CPU usage (our active checks are every 5 minutes). Our mrtg.cfg file is pretty borked as it is, since there are a ton of entries in there for interfaces we don't need bandwidth monitoring on. The file is about 417,000 lines long...
I was wondering if there's anywhere I can find a base mrtg.cfg file so that I can start from scratch and add bandwidth monitoring on only the interfaces we want. Or would simply deleting the mrtg.cfg file take care of that (i.e. does deleting it completely mean that it will be re-created when I try to add bandwidth monitoring on a switch or router interface)?
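Before trimming or rebuilding the file, it can help to see how many interfaces MRTG is actually polling. The following is a minimal sketch, assuming each monitored interface has exactly one `Target[...]` line in the config. The helper name `count_mrtg_targets` is hypothetical, and the path in the example comment is an assumption about where the file lives on your install:

```shell
# Count the interfaces MRTG polls: one "Target[...]" line per interface.
# count_mrtg_targets is a hypothetical helper, not part of MRTG or Nagios XI.
count_mrtg_targets() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    grep -c '^Target\[' "$1" || true
}

# Example (the path is an assumption; adjust for your system):
#   count_mrtg_targets /etc/mrtg/mrtg.cfg
```

Comparing that number with the count of bandwidth services you actually want gives a rough idea of how much of the file is dead weight.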
Re: mrtg.cfg is huge
Posted: Mon Oct 14, 2013 2:41 am
by lmiltchev
You can remove the entries that you don't need from the mrtg.cfg file by following the steps outlined in this document:
http://assets.nagios.com/downloads/nagi ... Router.pdf
After trimming down the mrtg.cfg, you can use "bandwidth" as a keyword to search in the CCM, then select and remove all of the services that you don't need in one go.
You can also implement Mod-Gearman to improve performance and reduce load.
http://assets.nagios.com/downloads/nagi ... ios_XI.pdf
Re: mrtg.cfg is huge
Posted: Mon Oct 14, 2013 12:37 pm
by snapon_admin
There were so many entries that needed to be removed it was just easier to delete mrtg.cfg and re-create it, which I was able to do by following instructions in another post on these forums. However, I am still noticing a problem. After deleting mrtg.cfg and deleting all of the bandwidth services from CCM, I tried re-adding one interface on one of our routers and it reported perfectly and displayed graph data correctly (hurray!). Seeing that this worked I tried adding another interface on another router, and I'm getting "OK - Current BW in: 0Kbps Out: 0Kbps" for the service check and no graph data.
I know for a fact this interface is not passing 0 data, and it in fact has generated an alert for high utilization in SolarWinds, so I'm not sure why this one doesn't work. These are the only 2 interfaces that have been added to mrtg.cfg. If it matters, the Nagios server and the first interface (the one that works) are both located in the U.S., and even in the same state. The interface that does not work is on a router located in Germany.
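One way to narrow down a flat 0 reading is to check whether MRTG is writing to that interface's RRD file at all. The sketch below assumes MRTG runs on the usual 5-minute cycle, so a file untouched for more than about 10 minutes suggests polling is failing rather than the link being idle. `rrd_is_stale` is a hypothetical helper name, and the file name in the example is a placeholder:

```shell
# Succeeds (exit 0) if the file exists and was last modified more than
# $2 minutes ago. rrd_is_stale is a hypothetical helper name.
# Note: -mmin is supported by GNU and BSD find but is not strictly POSIX.
rrd_is_stale() {
    [ -f "$1" ] && [ -n "$(find "$1" -mmin +"$2")" ]
}

# Example (file name is a placeholder):
#   rrd_is_stale "/var/lib/mrtg/<ip address>_<port>.rrd" 10 && echo "not being updated"
```

If the file is fresh but still reads 0, `rrdtool lastupdate` on the same file shows the most recent values that were actually stored.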
Re: mrtg.cfg is huge
Posted: Mon Oct 14, 2013 1:54 pm
by abrist
Were there any changes to the broken interface (ip,snmp community,etc)?
Re: mrtg.cfg is huge
Posted: Mon Oct 14, 2013 2:22 pm
by snapon_admin
Nope. It's actually the same SNMP community string as the working router, since these are all managed routers from the same provider. I just re-added both of these today; the one that doesn't work hasn't worked at all since it was re-added.
Re: mrtg.cfg is huge
Posted: Mon Oct 14, 2013 2:25 pm
by slansing
Did you split your mrtg.cfg into separate config files? The purpose of this would be to alleviate the system stress, as each one would be called individually based on the checks being run. One huge mrtg.cfg can take a long time to read through and could time out, in which case checks on interfaces after that point would never be run.
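As a rough illustration of what splitting could look like, here is a sketch that breaks one mrtg.cfg into one file per host. It is not a Nagios XI tool: `split_mrtg_cfg` and `global.cfg` are hypothetical names, and it assumes target names follow an `<ip>_<port>` pattern. Global options and `[_]`, `[^]`, `[$]` defaults are collected separately, and how MRTG is then pointed at the resulting files is left out:

```shell
# Split one large mrtg.cfg into one file per host (a sketch only).
# split_mrtg_cfg and global.cfg are hypothetical names.
# Assumes target names look like <ip>_<port>.
split_mrtg_cfg() {
    src=$1
    outdir=$2      # should be empty: output files are appended to
    mkdir -p "$outdir" || return 1
    awk -v outdir="$outdir" '
        # Per-target line: Keyword[name]: value
        match($0, /^[A-Za-z0-9]+\[[^]]+\]:/) {
            name = substr($0, RSTART, RLENGTH)
            sub(/^[A-Za-z0-9]+\[/, "", name)    # strip leading Keyword[
            sub(/\]:$/, "", name)               # strip trailing ]:
            if (name != "_" && name != "^" && name != "$") {
                host = name
                sub(/_[0-9]+$/, "", host)       # strip the _<port> suffix
                current = outdir "/" host ".cfg"
                print >> current
                close(current)                  # avoid running out of open files
                next
            }
        }
        # Indented continuation lines follow the previous per-target line
        /^[ \t]/ && current != "" { print >> current; close(current); next }
        # Global options, defaults, comments and blank lines
        { current = ""; print >> (outdir "/global.cfg") }
    ' "$src"
}
```

Run it against a copy of the file first, for example `split_mrtg_cfg ./mrtg.cfg.copy ./split`, and review the output before changing anything MRTG reads.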
Re: mrtg.cfg is huge
Posted: Mon Oct 14, 2013 3:20 pm
by snapon_admin
I have not done that as of yet, but there are only 2 entries in the new mrtg.cfg file: one interface on a router in the U.S. and one interface on a router in Germany.
Re: mrtg.cfg is huge
Posted: Tue Oct 15, 2013 1:48 am
by lmiltchev
Run the following command and show the output:
snapon_admin wrote: I tried adding another interface on another router, and I'm getting "OK - Current BW in: 0Kbps Out: 0Kbps" for the service check and no graph data.
Can you show the actual command that you are running from the command line? What is the "-l" argument that you are using? Does it match the amount of traffic going through the port? Have you tried using "-l B"?
Code: Select all
/usr/local/nagios/libexec/check_rrdtraf -f /var/lib/mrtg/<ip address>_<port>.rrd -w 20,20 -c 50,50 -l B
Re: mrtg.cfg is huge
Posted: Tue Oct 15, 2013 10:00 am
by snapon_admin
Code: Select all
[root@localhost libexec]# cat /etc/sysconfig/i18n
LANG="en_US.UTF-8"
SYSFONT="latarcyrheb-sun16"
NON-working router CLI using "-l K" (what it's currently set at), and "-l B". The NON-working interface is an older serial interface (Se0/1/0.100). This router is located in Germany:
Code: Select all
[root@localhost libexec]# ./check_rrdtraf -f /var/lib/mrtg/<IP Address>_7.rrd -w 370,370 -c 422,422 -l K
OK - Current BW in: 0Kbps Out: 0Kbps|in=0Kb/s;370;422 out=0Kb/s;370;422
[root@localhost libexec]# ./check_rrdtraf -f /var/lib/mrtg/<IP Address>_7.rrd -w 3070,3070 -c 4022,4022 -l B
OK - Current BW in: 0 bps Out: 0 bps|in=0 b/s;3070;4022 out=0 b/s;3070;4022
[root@localhost libexec]#
This isn't a very large circuit, hence the K argument. It's only a 512K CDR in this location. Some more information on the interface that works, and on another one I added that also works and has a similar interface to the NON-working router:
Working router is an Ethernet interface (Gi0/1.50). This router is located in Illinois.
CDR on working interface is 40M so the threshold is set accordingly. Here is that CLI output:
Code: Select all
[root@localhost libexec]# ./check_rrdtraf -f /var/lib/mrtg/<IP Address>_8.rrd -w 30,30 -c 35,35 -l M
OK - Current BW in: 9.61Mbps Out: 4.83Mbps|in=9.615410Mb/s;30;35 out=4.833507Mb/s;30;35
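Judging by the perfdata in these outputs, the -w and -c values appear to be absolute rates in the unit chosen with -l rather than percentages. A small sketch for sanity-checking them against the circuit rate, assuming both numbers are given in the same unit; `pct_of_rate` is a hypothetical helper name:

```shell
# Express a threshold as a percentage of the circuit rate (same unit for both).
# pct_of_rate is a hypothetical helper name.
pct_of_rate() {
    awk -v t="$1" -v r="$2" 'BEGIN { printf "%.1f\n", 100 * t / r }'
}

# Examples using figures from this thread:
#   pct_of_rate 30 40     # warning threshold on the 40M circuit
#   pct_of_rate 370 512   # warning threshold on the 512K circuit
```

With the figures above, the 40M circuit warns at 75.0% and goes critical at 87.5%, while the 512K circuit warns at 72.3% and goes critical at 82.4%.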
For testing I added another router with a similar interface to the NON-working router (also Se0/1/0.100) with a 128K CDR. This router is located in Texas, and DOES work. CLI:
Code: Select all
[root@localhost libexec]# ./check_rrdtraf -f /var/lib/mrtg/<IP Address>_8.rrd -w 105,105 -c 115,115 -l K
OK - Current BW in: .29Kbps Out: .42Kbps|in=.299719Kb/s;105;115 out=.421409Kb/s;105;115
All IP addresses redacted for obvious reasons.
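When comparing several interfaces like this, it can be convenient to pull the numbers out of the perfdata (the part after the `|`) instead of reading each line by eye. The sketch below assumes the output format shown in this thread, with `in=` and `out=` labels; `perf_value` and `both_zero` are hypothetical helper names:

```shell
# Extract the numeric value of one perfdata label ("in" or "out") from a
# plugin output line. perf_value and both_zero are hypothetical helper names.
perf_value() {
    perf=${1#*|}                      # keep only the text after the first "|"
    printf '%s\n' "$perf" | tr ' ' '\n' | sed -n "s/^$2=\([0-9.]*\).*/\1/p"
}

# Succeeds (exit 0) when both directions report exactly zero.
both_zero() {
    in_v=$(perf_value "$1" in)
    out_v=$(perf_value "$1" out)
    [ -n "$in_v" ] && [ -n "$out_v" ] &&
        awk -v a="$in_v" -v b="$out_v" 'BEGIN { exit !(a + 0 == 0 && b + 0 == 0) }'
}

# Example:
#   out=$(./check_rrdtraf -f "$rrd" -w 370,370 -c 422,422 -l K)
#   both_zero "$out" && echo "suspicious: zero traffic in both directions"
```

A link that is up and carrying traffic should rarely read exactly zero in both directions, so a match here points at the polling or the RRD file rather than at the thresholds.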
Re: mrtg.cfg is huge
Posted: Tue Oct 15, 2013 10:38 am
by snapon_admin
Update:
Added another router interface located in Italy. This interface is part of an IMA bundle so the interface added is ATM/IMA0.40 and this interface works. So the only one not working is the router in Germany. I'm going to remove this interface from mrtg.cfg and CCM and re-add it to see if that has any effect and will let you know. Maybe something just got borked when adding that interface...