Offloading mrtg and switch checks
Posted: Thu Nov 06, 2014 5:30 am
by WillemDH
Our network team wants to start monitoring all our switches with Nagios XI, see post
http://support.nagios.com/forum/viewtop ... 16&t=29970
As I'm a bit worried about the load on our Nagios server, I was wondering if mrtg could be offloaded to another server, maybe with mod_gearman?
If I follow this procedure
http://assets.nagios.com/downloads/nagi ... ios_XI.pdf could I offload mrtg too? Or is this only for the service checks?
Grtz
Willem
Re: Offloading mrtg and switch checks
Posted: Thu Nov 06, 2014 2:03 pm
by sreinhardt
Hey Willem! This is something I have been looking into while containerizing our products and attempting to separate XI into specific components. It is absolutely possible; it's just a matter of how you wish to do it. With containers, if I keep them on the same system, I can logically separate mrtg and nagios but still share files as though they were on the same disk. Completely separate systems raise the bigger problem of getting the data between them.
The two main parts you have to worry about are:
/etc/mrtg/conf.d/ - Apache needs to be able to place configs here for mrtg to use; this happens when you run the wizard.
/var/lib/mrtg/ - This stores the rrd files that MRTG creates, and needs to be readable by check_mrtgtraf to get the metrics into XI.
The nice thing in both cases is that, unless you have thousands upon thousands of interfaces, reading these files over shared storage or a network mount from the MRTG server's local disk would not have a terrible impact on the XI side, even though the storage is not local to XI. It's a very small read at the front of the rrd, totaling a few hundred bytes of transfer per check. The disk should be speedy and local for MRTG though, as it is constantly writing to all of those rrd files, one per interface. I'm not sure how well that will fit in your environment, but maybe it gives you some ideas to start with?
Re: Offloading mrtg and switch checks
Posted: Thu Nov 06, 2014 3:45 pm
by WillemDH
Hey Spenser,
Too bad, I have almost no knowledge of containers. Is this something you or Nagios support can help me with? Are you talking about LXC containers or Docker containers? As I already have a running Nagios XI server, I suppose I cannot make use of a Nagios container? Is using containers still an option then?
Is mod_gearman even involved in the setups you are describing? You are speaking of a local disk. I could provide an extra local SSD based disk to the Nagios XI server, but I'm more worried about CPU load. I was thinking mod_gearman was the only solution to make the checks execute remotely and limit the impact on cpu.
I'm not 100 % sure about how many interfaces we are speaking here, but let's say 360 switches x 25 ports = 9000 interfaces?
Grtz
Willem
Re: Offloading mrtg and switch checks
Posted: Thu Nov 06, 2014 4:27 pm
by sreinhardt
In your case, as in most, containerizing won't help with the actual load issues presented by running both applications. You are right to be worried about mrtg's additional load. It's a great addition to XI, but it does have a fair amount of overhead. For 9k interfaces, I would definitely suggest a secondary system. The route I would take is something like:
MRTG server:
Solely reaps information from your network devices and stores it in rrd files.
Shares /etc/mrtg/conf.d/ with read/write access.
Shares /var/lib/mrtg/ with read-only access.
XI server:
Mounts both shares from MRTG into the same directory on the local system
This is probably the simplest and possibly one of the most effective routes to handling this particular issue. The only two concerns you really have here are whether the mrtg server has enough power for all those checks, and whether the latency is low enough for nagios to be happy (it should be; even for 9k checks this is maybe a few MB every 5 minutes). You could of course instead have both systems mount a single shared SAN or something along those lines if you wish; this is just one example.
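As a rough sanity check on the "few MB every 5 minutes" figure, here is a small Python sketch. The per-check read sizes are assumptions standing in for "a few hundred bytes", and the interface count is the 360 x 25 estimate from earlier in the thread; real numbers will depend on your setup.

```python
def transfer_per_cycle_mb(interfaces: int, bytes_per_check: int) -> float:
    """Approximate data read over the share in one polling cycle, in MB (10^6 bytes)."""
    return interfaces * bytes_per_check / 1_000_000


# Estimate from the thread: 360 switches x 25 ports.
interfaces = 360 * 25

# Assumed per-check read sizes in bytes ("a few hundred bytes" per check).
for size in (200, 300, 500):
    mb = transfer_per_cycle_mb(interfaces, size)
    print(f"{size} B/check -> {mb:.1f} MB per cycle")
```

With these assumed sizes the total stays in the low single-digit MB range per 5-minute cycle, which is in line with the estimate above.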

Re: Offloading mrtg and switch checks
Posted: Thu Nov 06, 2014 4:52 pm
by WillemDH
Spenser,
Thanks for the update. I'll discuss the options with my colleagues next week, as I have a holiday until Wednesday. Providing an mrtg system that can process 9k interfaces should not be a problem, I think. We could start slow and build up exponentially.
A single shared san is also an option, but I will also have to talk this over with my boss. I'm just not really sure how to mount this to a CentOS VM, as we only use this for Windows SQL and SAP clusters atm.
So, as you didn't answer my question regarding mod_gearman, does this mean mod_gearman would not be involved in this scenario? Would there be no way to run this mod_gearman on the mrtg server too? I also have no experience with mod_gearman, so forgive me if I'm asking stupid questions...

(I should read the manual asap and do some tests with it I guess)
Grtz
Willem
Re: Offloading mrtg and switch checks
Posted: Thu Nov 06, 2014 5:52 pm
by sreinhardt
Not a dumb question at all. If you choose to integrate gearman, you will inevitably run the nagios part that checks rrd files from mrtg... which, thinking about it, might provide an absolutely fantastic way of NOT sharing the mrtg rrds, and just keeping them and check_rrdtraf on the same server. You would then set the bandwidth checks to a specific service or host group, and assign the gearman worker on the mrtg server to that hostgroup only. This way all of the mrtg processing happens on the remote server, away from XI. The only thing is you will still need to share or rsync /etc/mrtg/conf.d, but that's pretty minimal.
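For the rsync option, a minimal Python sketch of pushing the wizard-generated configs from the XI server to the MRTG server could look like the following. The hostname `mrtg-server` is a placeholder, and the choice of `-a --delete` is an assumption about how you want the copy to behave (`--delete` removes configs on the MRTG side that no longer exist on XI); it also assumes SSH access between the two hosts is already set up.

```python
import shlex
from typing import List


def build_rsync_command(src_dir: str, remote_host: str, dest_dir: str) -> List[str]:
    """Build an rsync command that mirrors src_dir to dest_dir on remote_host."""
    # Trailing slash on the source copies the directory contents, not the directory itself.
    src = src_dir.rstrip("/") + "/"
    dest = f"{remote_host}:{dest_dir.rstrip('/')}/"
    # -a preserves permissions and timestamps; --delete removes stale files on the target.
    return ["rsync", "-a", "--delete", src, dest]


# "mrtg-server" is a hypothetical hostname for the offloaded MRTG box.
cmd = build_rsync_command("/etc/mrtg/conf.d", "mrtg-server", "/etc/mrtg/conf.d")
print(shlex.join(cmd))
```

The sketch only prints the command. To actually run it you could pass `cmd` to `subprocess.run(cmd, check=True)`, for example from a cron job on the XI server.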
Other than the previously mentioned idea, gearman and mrtg really do not interact. I don't know if you could leverage gearman's queue and workers with mrtg directly or not. It's definitely an interesting idea! In case you were not aware, although it sounds like you may be, mrtg and core are not directly linked in any way other than rrd files. MRTG is entirely run from a cron.
Hope that helped, let me know if you come up with anything else!
Re: Offloading mrtg and switch checks
Posted: Fri Nov 07, 2014 3:05 am
by WillemDH
Well, I was hoping someone had done this before me... I'll have to discuss this with my colleagues if it is worth a try. So one more question that was left unanswered, can Nagios support help me through the process (in the forum) of getting to an offloaded mrtg or would this setup be 'unsupported'?
If I would have a major issue with the offloaded mrtg server in a few months, will Nagios support help me solve it or will they say "This is not within your support agreement"....?
One more question. If mrtg would be offloaded in combination with mod_gearman to another server, would this mean I can no longer use mrtg on the Nagios XI server?
Grtz
Willem
Re: Offloading mrtg and switch checks
Posted: Fri Nov 07, 2014 1:34 pm
by sreinhardt
will they say "This is not within your support agreement"....?
Oh come on now, how many times have we actually said this? In all seriousness, while it's a little different, the idea behind it is no different from what we do already. I see very little, if any, reason you would get that line out of us for this. If you were having to modify the underlying components to do this, then we would have a different situation, but since neither XI nor mrtg is actually being changed, there shouldn't be a conflict.
2) Most likely yes, without some alterations to either how the wizard places mrtg configs, or to how your mrtg install on the xi server works. Can it be done? Yes, absolutely, you could use mrtg on both systems. I can't think of a reason you would want to though. This did spark an awesome idea of distributed mrtg checkers through gearman, and management through a component or addition to the wizard.
Re: Offloading mrtg and switch checks
Posted: Fri Nov 07, 2014 6:58 pm
by WillemDH
It's reassuring to know that support on mod_gearman is part of the package. I wasn't expecting a negative response, but had to be sure before I would consider starting such a project. I'll install a CentOS 6.5 server asap, which will hopefully be within 2 weeks or so.
I heard my colleague Michiel had some issues monitoring 10Gb interfaces with the wizard. Is this a known issue? Something with 32bit and 64bit counters.
Re: Offloading mrtg and switch checks
Posted: Mon Nov 10, 2014 1:23 pm
by sreinhardt
Interfaces of 1 Gb and above need the high counters (64-bit), unless the interface is using far less bandwidth than it potentially could, because they can easily overflow a 32-bit counter within a single 5-minute interval. I know some devices don't play nice with cfgmaker, but by and large 10 Gb ports should be working.
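To put numbers on the overflow, here is a short Python sketch of how long a 32-bit octet counter lasts at sustained line rate. It assumes the counter counts octets (bytes) and that the link runs at full rate for the whole interval, which is the worst case rather than typical traffic.

```python
def seconds_to_wrap(counter_bits: int, link_bps: float) -> float:
    """Seconds until an octet counter of the given width wraps at sustained line rate."""
    max_octets = 2 ** counter_bits
    octets_per_second = link_bps / 8  # 8 bits per octet
    return max_octets / octets_per_second


POLL_INTERVAL_S = 300  # 5-minute polling interval mentioned in the thread

for label, bps in (("100 Mb/s", 100e6), ("1 Gb/s", 1e9), ("10 Gb/s", 10e9)):
    wrap = seconds_to_wrap(32, bps)
    verdict = "wraps within one interval" if wrap < POLL_INTERVAL_S else "survives one interval"
    print(f"{label}: 32-bit counter wraps after {wrap:.1f} s ({verdict})")
```

At full rate a 1 Gb/s link wraps a 32-bit counter in roughly 34 seconds and a 10 Gb/s link in under 4 seconds, so the counter can wrap several times between two polls and the poller can no longer tell how much traffic passed. A 64-bit counter at the same rates lasts far longer than any polling interval.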