redundancy

Locked
vvz
Posts: 187
Joined: Wed Oct 30, 2013 5:15 pm

redundancy

Post by vvz »

Hi!

I need to provide redundancy for my Nagios server.

Let's imagine that I have two Nagios servers installed, a master and a slave, with identical configs for the same hosts on the LAN.

All notifications on the slave are disabled.

Is it enough to enable notifications on the slave and replace its retention.dat with the latest one from the master, so that the slave starts with the latest Nagios configuration?

Thank you
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: redundancy

Post by abrist »

You are best off stopping the nagios service on the slave and checking the master with a cron job; otherwise you will essentially run every check twice, once from each server (although you may want to do this so that both servers have similar historical data).
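The cron-driven check described above could be sketched roughly like this in Python. It is only an illustration under stated assumptions: the hostname, the port, and the `service nagios ...` commands are hypothetical placeholders, and a plain TCP probe only shows that the master host answers, not that its Nagios process is healthy.

```python
import socket
import subprocess

MASTER_HOST = "nagios-master.example.lan"  # hypothetical hostname
MASTER_PORT = 22                           # hypothetical port to probe


def master_is_up(host=MASTER_HOST, port=MASTER_PORT, timeout=5.0):
    """Return True if a TCP connection to the master succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def slave_is_running():
    """Assumption: `service nagios status` exits 0 while Nagios runs."""
    result = subprocess.run(
        ["service", "nagios", "status"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def control_slave(action):
    """Assumption: the local Nagios is managed as `service nagios start|stop`."""
    subprocess.run(["service", "nagios", action], check=True)


def decide_action(master_up, slave_running):
    """Pick what the slave should do for the observed state."""
    if not master_up and not slave_running:
        return "start"  # master is gone: take over
    if master_up and slave_running:
        return "stop"   # master is back: stand down to avoid double checks
    return "none"


def run_once(probe=master_is_up, status=slave_is_running, control=control_slave):
    """One cron iteration: observe, decide, and act if needed."""
    action = decide_action(probe(), status())
    if action != "none":
        control(action)
    return action
```

A cron entry on the slave would call `run_once()` every minute or so. Keeping the decision in `decide_action` makes the takeover and stand-down rules easy to test without touching the network or the service.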

I would suggest copying the configs as well. retention.dat will just contain the information of the last checks, whereas status.dat and objects.cache will contain the current object config and status.
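As a rough sketch of the promotion step being discussed (copy the master's retention.dat, then turn notifications on), something like the following Python could be used. All file paths are supplied by the caller and are assumptions; how the master's file reaches the slave (rsync, scp, shared storage) is left out. `enable_notifications` is the nagios.cfg directive; note that if program state retention is enabled, Nagios may take the notification setting from the retention file rather than from nagios.cfg.

```python
import re
import shutil
from pathlib import Path


def set_enable_notifications(cfg_text, enabled=True):
    """Return nagios.cfg text with enable_notifications set to 1 or 0."""
    line = "enable_notifications=" + ("1" if enabled else "0")
    pattern = re.compile(r"^[ \t]*enable_notifications[ \t]*=.*$", re.MULTILINE)
    if pattern.search(cfg_text):
        return pattern.sub(line, cfg_text)
    # Directive not present: append it as a new last line.
    return cfg_text.rstrip("\n") + "\n" + line + "\n"


def promote_slave(master_retention, slave_retention, slave_cfg):
    """Copy the master's retention file over the slave's and enable notifications.

    Assumption: Nagios on the slave is stopped while this runs, since a
    running daemon may overwrite retention.dat with its own state.
    """
    shutil.copy2(master_retention, slave_retention)
    cfg = Path(slave_cfg)
    cfg.write_text(set_enable_notifications(cfg.read_text()))
```

After `promote_slave(...)` returns, the slave's Nagios service would be started by whatever mechanism the site uses.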

Crap. Nagios core post. I missed that. Ignore my responses below:

Failover has different requirements depending on your environment and how far down the rabbit hole you are willing to go. As your configuration is stored in the db, you will have to have a solution to replicate that data as well. This can be done with offloaded dbs, though you have to be careful that both servers are not writing to the same db at the same time. Also, as the postgres db contains the user and UI configuration, you will need to come up with a novel way to duplicate that as well. This is harder because it contains the IP and system-based config for the XI server. So if you were going to replicate or offload the postgres db, you would need to alter some bits of the postgres db on failover, or use a virtual "floating" IP for both.

A floating IP requires some logic to deal with correctly, or you could end up in a STONITH deathmatch, with both servers fighting for control when the master recovers.

And then enter Linux HA/DRBD/Pacemaker/Corosync.

There are a ton of options, and even more I did not mention. I am giving a presentation this fall on the very subject of high availability and failover in Nagios XI, so I am well versed in the subject at this point. This is possible, but it is not officially supported as of yet, and building an HA configuration will require a bit of time investment.
Former Nagios employee
vvz
Posts: 187
Joined: Wed Oct 30, 2013 5:15 pm

Re: redundancy

Post by vvz »

Thank you for your help.
I'll try.
I think we can close the thread