Distributed Nagios Architecture

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Smark
Posts: 32
Joined: Tue Jan 08, 2013 6:12 pm

Distributed Nagios Architecture

Post by Smark »

We have multiple DataCenters across the globe. Does Nagios provide a solution for distributing checks across the globe to local Nagios servers, with a central management server coordinating the work?

I could imagine it working like this:
* Central Nagios server that includes the web interface, CCM, and some sort of a coordination agent
* Local Nagios server at each DataCenter that checks just the hosts "near" it and reports back to the central server.
* The Local Nagios server would continue to check the local hosts and collect data even if it could not connect to the central server.
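The third bullet describes a store-and-forward pattern. Here is a minimal Python sketch of that idea, assuming a hypothetical `send` callable that raises `ConnectionError` when the central server is unreachable. None of these names come from Nagios itself; they only illustrate the behavior being asked for.

```python
from collections import deque
from typing import Callable, Deque, Dict

CheckResult = Dict[str, object]


class StoreAndForward:
    """Buffers check results locally while the central server is unreachable."""

    def __init__(self, send: Callable[[CheckResult], None],
                 max_buffered: int = 10000) -> None:
        # `send` is a hypothetical transport function supplied by the caller.
        self._send = send
        # Once the buffer is full, the oldest results are dropped first.
        self._pending: Deque[CheckResult] = deque(maxlen=max_buffered)

    def submit(self, result: CheckResult) -> None:
        """Queue a result, then try to drain the queue in order."""
        self._pending.append(result)
        self.flush()

    def flush(self) -> int:
        """Send buffered results oldest-first; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # WAN link is down; keep the results for a later retry
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def backlog(self) -> int:
        """Number of results still waiting to be forwarded."""
        return len(self._pending)
```

The local server keeps checking and calling `submit()` regardless of link state, and a periodic `flush()` catches up once the WAN connection returns.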

I looked at DNX, but it prefers all of its nodes to be in the same DataCenter and is only meant for distributing the load of the checks, not for reducing WAN traffic or reliance on a WAN connection for checks.

Any help you could provide here would definitely be beneficial!

Thanks!
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Distributed Nagios Architecture

Post by scottwilkerson »

In Nagios XI, many clients use Inbound / Outbound data transfer to send check results between distributed Nagios servers.
Smark
Posts: 32
Joined: Tue Jan 08, 2013 6:12 pm

Re: Distributed Nagios Architecture

Post by Smark »

scottwilkerson wrote:In Nagios XI, many clients use Inbound / Outbound data transfer to send check results between distributed Nagios servers.
Hi Scott,

Can you clarify this at all? Maybe a doc link or a screenshot?

Thanks for your prompt response.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Distributed Nagios Architecture

Post by abrist »

XI servers can send check results to, and receive them from, other XI servers. Using the mechanisms outlined below, you can essentially use local Nagios servers for regional offices and then have them all push their check results to a central server (or multiple central servers).
Inbound
Outbound
BanditBBS
Posts: 2474
Joined: Tue May 31, 2011 12:57 pm
Location: Scio, OH

Re: Distributed Nagios Architecture

Post by BanditBBS »

I of course have to throw this out there: I use mod_gearman for this exact reason. That way I only have to configure one XI server, and the host/service checks go to specific workers located where the machines are.
Smark
Posts: 32
Joined: Tue Jan 08, 2013 6:12 pm

Re: Distributed Nagios Architecture

Post by Smark »

abrist wrote:XI servers can send check results to, and receive them from, other XI servers. Using the mechanisms outlined below, you can essentially use local Nagios servers for regional offices and then have them all push their check results to a central server (or multiple central servers).
Inbound
Outbound
This is perfect! For some reason I had it in my head that Passive Checks had to do with the Nagios Agent sending results to the server. I wasn't aware it was also used for server-to-server communication.
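For reference, a passive service result reaches Nagios Core as a `PROCESS_SERVICE_CHECK_RESULT` external command, whichever agent or server submits it. The Python sketch below only formats that command line; the function name is made up for illustration, and the transport that XI's transfer settings use to carry the result between servers is not shown.

```python
import time
from typing import Optional


def format_passive_service_result(host: str, service: str, return_code: int,
                                  output: str,
                                  timestamp: Optional[int] = None) -> str:
    """Build a Nagios Core external command line for a passive service result.

    Hypothetical helper for illustration; it is not part of Nagios XI.
    """
    # 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    ts = int(time.time()) if timestamp is None else int(timestamp)
    # The command must fit on one line, so flatten any newlines in the output.
    one_line = " ".join(output.splitlines())
    return (f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{return_code};{one_line}")
```

The receiving server must already have a matching host and service definition, which is why each server carries its own configuration in this model.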

I've read through those documents and a few of the ones they reference. I'm still a little confused about how the configuration is managed. Based on those documents, each Nagios server has its own independent host/service/etc. config and is set to forward results to another server. I assume there is no way to have central management of all of the configuration? How are configs generally managed in these types of environments?

Typically, how is a solution like this architected? One master server and N Nagios nodes that all forward their check results to the master server?
BanditBBS wrote:I of course have to throw this out there: I use mod_gearman for this exact reason. That way I only have to configure one XI server, and the host/service checks go to specific workers located where the machines are.
So I'm looking at their documentation and as I understand it, you only need one Nagios server and then a bunch of Gearman Job Servers, one in each location to actually execute the checks, right? What have you seen are the drawbacks? Right now this looks like a good direction to pursue.
BanditBBS
Posts: 2474
Joined: Tue May 31, 2011 12:57 pm
Location: Scio, OH

Re: Distributed Nagios Architecture

Post by BanditBBS »

Smark wrote:So I'm looking at their documentation and as I understand it, you only need one Nagios server and then a bunch of Gearman Job Servers, one in each location to actually execute the checks, right? What have you seen are the drawbacks? Right now this looks like a good direction to pursue.
Compatibility with Nagios Core 4.x (XI 2014) requires a special version, and I haven't fully tested it myself yet as I am still running 2012 here. Other than that, I can't think of any drawbacks; it works great in my environment.
Smark
Posts: 32
Joined: Tue Jan 08, 2013 6:12 pm

Re: Distributed Nagios Architecture

Post by Smark »

BanditBBS wrote:
Smark wrote:So I'm looking at their documentation and as I understand it, you only need one Nagios server and then a bunch of Gearman Job Servers, one in each location to actually execute the checks, right? What have you seen are the drawbacks? Right now this looks like a good direction to pursue.
Compatibility with Nagios Core 4.x (XI 2014) requires a special version, and I haven't fully tested it myself yet as I am still running 2012 here. Other than that, I can't think of any drawbacks; it works great in my environment.
Awesome! I'm spinning up some servers now to play with it. I'll report back soon.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Distributed Nagios Architecture

Post by slansing »

You can run local Gearman workers as well, but if you want a truly distributed checking environment then yes, you will want to have a handful of job servers. If you are using XI 2014/Core 4, you will want to follow my post at the bottom of the second page here:

http://support.nagios.com/forum/viewtop ... n&start=10
Smark
Posts: 32
Joined: Tue Jan 08, 2013 6:12 pm

Re: Distributed Nagios Architecture

Post by Smark »

So I have everything working, sort of. The hostgroups and servicegroups options in mod_gearman_worker.conf let you specify which hostgroups and which servicegroups should be executed by which workers.

In our environment we have servers dispersed around the world, so it makes more sense to say "any services on hosts in this hostgroup should be checked by this worker". Does that functionality exist?
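To make the question concrete, here is a Python sketch of the routing rule being asked for: a service check goes to a per-hostgroup queue whenever its host belongs to a routed hostgroup. This is not mod_gearman code. The function, the `hostgroup_<name>` queue naming, and the fallback queue are assumptions for illustration, and whether mod_gearman behaves this way should be confirmed against its documentation.

```python
from typing import Dict, List, Set


def pick_queue(host: str,
               hostgroup_members: Dict[str, Set[str]],
               routed_hostgroups: List[str],
               default_queue: str = "service") -> str:
    """Choose a queue for a service check from its host's hostgroup membership.

    Illustrative only: the first routed hostgroup containing the host wins,
    and checks for hosts in no routed hostgroup go to the default queue.
    """
    for group in routed_hostgroups:
        if host in hostgroup_members.get(group, set()):
            return f"hostgroup_{group}"
    return default_queue


# Hypothetical hostgroups, one per DataCenter.
members = {
    "dc-london": {"lon-web01", "lon-db01"},
    "dc-tokyo": {"tyo-web01"},
}
routed = ["dc-london", "dc-tokyo"]
```

With this rule, every service on `lon-web01` would be picked up by whichever worker listens on the London queue, without listing each service in a servicegroup.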