Distributed Monitoring on remote Site
Posted: Wed Sep 10, 2014 5:24 am
by litsupport.box
I have a vSphere 5.5 box at each of a few remote locations where the bandwidth isn't always sufficient to return check results to our Nagios XI box, so we see several timeouts a day. Most check results do come through, but the timeouts give us false positives.
So I'm thinking of a distributed monitoring solution: a Nagios box installed on each remote vSphere host that monitors that host and the local network from its own point of view.
This would be repeated at several remote locations. All these Nagios boxes would report to our central Nagios XI box, with graphing and configuration done centrally (preferably).
On the FAQ I read that there are several options to achieve distributed monitoring.
- Using MNTOS
- Using Mod Gearman
- Using DNX
- Using Fusion (not an option, because of the pricing)
What is the best solution for our environment? Are there any other options to achieve the above?
Re: Distributed Monitoring on remote Site
Posted: Wed Sep 10, 2014 1:28 pm
by tmcdonald
mod_gearman is not a bad solution, but if bandwidth is an issue it might not help, since you are sending the checks out anyway and I don't think the MG packets are so much smaller that it would make a difference.
From a Core server, you can use an OCSP command (ocsp_command) to forward the check results after each service check is run. See the following tutorial:
http://www.openlogic.com/wazi/bid/18813 ... ith-Nagios
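As a rough sketch of the forwarding approach described above: Nagios Core can run an ocsp_command after every service check (with obsess_over_services enabled), and that command can hand the result to send_nsca, which submits it to the central server as a passive check. The Python below is one hypothetical way to write such a handler. The function names, the file paths and the host name central.example.com are assumptions for illustration, not taken from the tutorial. This sketch does no queueing or retrying, so a result that fails to send while the link is down is simply lost.

```python
import subprocess


def format_passive_result(host, service, return_code, output):
    """Build one tab-delimited record in the form send_nsca reads on stdin:
    host <TAB> service description <TAB> return code <TAB> plugin output."""
    # Tabs or newlines inside the plugin output would corrupt the record,
    # so collapse all whitespace runs into single spaces.
    clean = " ".join(output.split())
    return f"{host}\t{service}\t{return_code}\t{clean}\n"


def forward_result(line,
                   central_host="central.example.com",  # assumed host name
                   config="/usr/local/nagios/etc/send_nsca.cfg",  # assumed path
                   send_nsca="/usr/local/nagios/bin/send_nsca"):  # assumed path
    """Pipe one record to send_nsca. Returns True if send_nsca exits with 0."""
    proc = subprocess.run(
        [send_nsca, "-H", central_host, "-c", config],
        input=line,
        text=True,
        capture_output=True,
        timeout=30,
    )
    return proc.returncode == 0


def main(argv):
    """Entry point for an ocsp_command that passes, in this order,
    $HOSTNAME$ $SERVICEDESC$ $SERVICESTATEID$ $SERVICEOUTPUT$."""
    host, service, state_id, output = argv[1:5]
    line = format_passive_result(host, service, int(state_id), output)
    return 0 if forward_result(line) else 1
```

On the remote Core box, the ocsp_command definition would call a script wrapping main() with those four macros as arguments. The central server needs a matching service definition that accepts passive checks.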
Re: Distributed Monitoring on remote Site
Posted: Wed Sep 10, 2014 4:56 pm
by Box293
I would think that the box293_check_vmware plugin would be able to do what you are after.
Basically the plugin runs on a vMA appliance, so you can place that appliance at the remote location, which should overcome the bandwidth issues.
Check out the plugin here:
http://exchange.nagios.org/directory/Pl ... re/details
Re: Distributed Monitoring on remote Site
Posted: Thu Sep 11, 2014 8:49 am
by litsupport.box
Thanks for your replies,
To be clear, not all sites have a bandwidth issue; on some, the connection just drops for a few minutes and then comes back. Sometimes there is ping loss; other times it replies but takes too long.
Unfortunately, there's nothing we can do about those issues. It's a problem of connecting from one country in the world to another.
For the remote vSphere hosts, I'm (currently) only interested in the host usage stats (CPU, mem, net, datastore).
And the vMA solution is already in use in our datacenter, where the central Nagios box lives. So this will be easier to install at the remote location. (Been there, done that.)
But in the future we will also want to monitor other hardware (e.g. switches). Is this possible with the vMA?
With the Core solution on the remote site, if the check results can't be sent because the connection is lost or for some other reason, are they sent afterwards, once the connection is good again?
Is there a way to check when the slave last reported, and to send warn/crit notifications when that was too long ago?
Re: Distributed Monitoring on remote Site
Posted: Thu Sep 11, 2014 4:45 pm
by Box293
With the vMA solution ... you could install plugins on the vMA appliance and have those checks issued via check_by_ssh from the central location. That way the checks are executed at the remote site. However this isn't much different to the mod_gearman solution tmcdonald recommended.
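To illustrate the check_by_ssh idea above, here is a small Python sketch that assembles the command line the central server would run so that a plugin executes on the remote appliance. The host name, user, plugin paths and the check_ping example are assumptions for illustration only; in practice this would normally be a Nagios command definition rather than a script.

```python
import shlex
import subprocess


def build_check_by_ssh(remote_host, remote_command, user="nagios", timeout=30,
                       plugin="/usr/local/nagios/libexec/check_by_ssh"):
    """Return the argv list for check_by_ssh.

    -H: host to SSH into (the remote appliance)
    -l: SSH login name
    -t: plugin timeout in seconds
    -C: command line to execute on the remote host
    """
    return [plugin, "-H", remote_host, "-l", user,
            "-t", str(timeout), "-C", remote_command]


def run_check(argv):
    """Run the check and return (exit_code, first_line_of_output).
    Nagios plugin exit codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    lines = proc.stdout.splitlines()
    return proc.returncode, (lines[0] if lines else "")


# Hypothetical example: ping a switch on the remote LAN from the appliance.
example = build_check_by_ssh(
    "vma-remote.example.com",
    "/usr/lib/nagios/plugins/check_ping -H 10.0.0.2 -w 100,20% -c 500,60%",
)
print(shlex.join(example))
```

Only the SSH session and the one-line result cross the WAN link; the ping itself stays on the remote network.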