High availability Nagios XI

maxwellmiranda · Post by **maxwellmiranda** » Tue Jul 03, 2012 10:53 am

Can you give the details for setting up high availability for a Nagios XI server?
We are using a 64-bit Linux server.

lmilkovic · Post by **lmilkovic** » Thu Jul 05, 2012 4:32 am

The official document about the HA options for Nagios is available here: http://assets.nagios.com/downloads/nagi ... ptions.pdf
Unfortunately, I didn't find it very informative :(

If you're implementing an HA solution from the beginning (i.e. you haven't deployed Nagios yet) and you have the needed resources (shared or replicated storage), it is relatively straightforward with Linux-HA and DRBD. These are true "heavy-weight" HA implementations and are well documented, both on the official pages and in additional tutorials.

However, if you already have one Nagios XI server in your environment and you want to add another without shared storage, disk drivers, etc., you're on your own :(

When implementing a custom DR/HA solution without shared storage, you have to keep at least the following in sync between the nodes:

NDO database
Nagios Core configuration files, state files and plugins
PNP4Nagios perfdata files
Nagios XI database

For the NDO database (a MySQL database), it's best to use MySQL's built-in replication. The replication must be dual-master (master-master), since the standby node can become the active one at any time. Unfortunately, this procedure is not so straightforward and takes some time to get right.
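As a rough illustration of the dual-master idea, the sketch below renders the `[mysqld]` settings that usually have to differ between the two nodes. `server-id`, `log-bin`, `auto-increment-increment` and `auto-increment-offset` are standard MySQL options; the helper name and the 1-based node numbering are assumptions made for this example. Each node would still need a replication user and a replication link pointing at its peer, which is not shown here.

```python
def dual_master_mycnf(node_index: int, node_count: int = 2) -> str:
    """Render a [mysqld] fragment for one node of a master-master pair.

    Hypothetical helper for illustration; node_index is 1-based.
    """
    if not 1 <= node_index <= node_count:
        raise ValueError("node_index must be between 1 and node_count")
    lines = [
        "[mysqld]",
        # Every server in a replication topology needs a unique ID.
        f"server-id = {node_index}",
        # Binary logging must be on so the peer can replicate from this node.
        "log-bin = mysql-bin",
        # Interleave AUTO_INCREMENT values so that both masters can insert
        # rows without generating the same keys.
        f"auto-increment-increment = {node_count}",
        f"auto-increment-offset = {node_index}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    for index in (1, 2):
        print(dual_master_mycnf(index))
        print()
```

The auto-increment settings are the part that makes writes on either node reasonably safe: node 1 hands out odd keys and node 2 even keys.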
For plain files, you can set up a cron job that rsyncs the files between the nodes.
The Nagios XI database is a PostgreSQL database. I found the PostgreSQL replication mechanism rather cumbersome and, since the database is relatively "low-activity", it is easier to dump the database, rsync the dump to the other node and import it there.
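The file sync and the dump-and-import approach could be sketched roughly as follows. This is a dry-run sketch that only builds and prints the commands. The paths, the database name `nagiosxi`, the `postgres` user, the dump location and the peer host name are all assumptions for illustration and need to be checked against the actual installation.

```python
import shlex

# Assumed default locations (config, plugins, perfdata, retained state).
SYNC_PATHS = [
    "/usr/local/nagios/etc/",
    "/usr/local/nagios/libexec/",
    "/usr/local/nagios/share/perfdata/",
    "/usr/local/nagios/var/retention.dat",
]


def rsync_cmd(path, peer):
    """Copy a file or directory to the same location on the peer node."""
    return ["rsync", "-az", path, f"{peer}:{path}"]


def dump_cmd(db, dump_file, user="postgres"):
    """Dump the database to a plain SQL file, with DROP statements included."""
    return ["pg_dump", "-U", user, "--clean", "-f", dump_file, db]


def import_cmd(db, dump_file, peer, user="postgres"):
    """Load the copied dump on the peer node over ssh."""
    remote = f"psql -U {user} -d {db} -f {shlex.quote(dump_file)}"
    return ["ssh", peer, remote]


def sync_plan(peer, db="nagiosxi", dump_file="/tmp/nagiosxi.sql"):
    """Return the ordered list of commands for one sync run."""
    plan = [rsync_cmd(path, peer) for path in SYNC_PATHS]
    plan.append(dump_cmd(db, dump_file))
    plan.append(rsync_cmd(dump_file, peer))
    plan.append(import_cmd(db, dump_file, peer))
    return plan


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in sync_plan("nagios2.example.com"):
        print(shlex.join(cmd))
```

A script like this would typically be run from cron on the active node only, so that the standby never overwrites newer data.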

Once all relevant files are synced, you have to implement a watchdog.
We implemented a robust watchdog/heartbeat service that runs on both nodes, with each node checking the other. Based on different conditions (for example, host unreachable or Nagios process down) and different logic (for example, Condition1 AND Condition2 OR Condition3), the service can automatically start a failover or perform additional actions.
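To make the "Condition1 AND Condition2 OR Condition3" idea concrete, here is a minimal sketch of such a decision rule. It is not the actual watchdog described above; the three conditions, the function name and the threshold of three missed heartbeats are assumptions chosen for illustration.

```python
def should_fail_over(host_reachable: bool,
                     nagios_running: bool,
                     missed_heartbeats: int,
                     max_missed: int = 3) -> bool:
    """Decide whether the local node should take over from its peer.

    Rule: (host reachable AND Nagios process down) OR (too many missed
    heartbeats). All conditions and the threshold are hypothetical.
    """
    # Condition1 AND Condition2: the peer answers, but Nagios is not running.
    process_down = host_reachable and not nagios_running
    # Condition3: the peer has been silent for several consecutive checks,
    # which filters out a single dropped packet.
    host_dead = missed_heartbeats >= max_missed
    return process_down or host_dead


if __name__ == "__main__":
    print(should_fail_over(True, True, 0))    # healthy peer
    print(should_fail_over(True, False, 0))   # Nagios process down
    print(should_fail_over(False, False, 1))  # one missed heartbeat
    print(should_fail_over(False, False, 3))  # peer considered dead
```

Requiring several missed heartbeats before acting is one way to reduce the risk of both nodes becoming active at once after a brief network glitch.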

Hope this helps.

Luka

scottwilkerson · Post by **scottwilkerson** » Thu Jul 05, 2012 11:41 am

Thanks for chiming in, Luka.
