Comments and Downtime failover synchronization

Posted: Mon Aug 05, 2013 2:44 pm
by mrollins13
We are attempting to configure a second Nagios server as a live failover in case the primary Nagios server becomes unresponsive. We are able to maintain live status information for hosts and services on the failover server by passing results via NSCA. We also have both servers configured to use the same NagiosQL database, so they are able to write out the same configuration files.

*** We are unable to find a way to replicate the downtime and comments for hosts and services to the failover server. ***

We have tried rsyncing the retention file and turning on the retention settings in nagios.cfg; however, the comments and downtimes still do not carry over.
The NagiosQL database does not appear to contain any tables for storing these.

QUESTIONS:

1) How are the comments and downtime stored in Nagios?
2) How can these be replicated to another server in real time (or close to it) to support a failover server configuration?

Re: Comments and Downtime failover synchronization

Posted: Mon Aug 05, 2013 3:28 pm
by abrist
1. The comments are stored in the file:

/usr/local/nagios/var/status.dat
2. I don't have a great suggestion here, but you could network-mount the file on redundant WAN storage.

Re: Comments and Downtime failover synchronization

Posted: Mon Aug 05, 2013 4:23 pm
by mrollins13
We are aware of the status file; however, the documentation states "This file is deleted every time Nagios stops and recreated when it starts," which leads me to believe this is a volatile location and not the storage mechanism used to record downtime and comments.
If "the status file" is the only answer, then I suppose the next question would be why the retention file is not being read in and restoring the comments/downtime, as the documentation indicates should happen.

Re: Comments and Downtime failover synchronization

Posted: Mon Aug 05, 2013 4:41 pm
by scottwilkerson
They should be in both status.dat AND retention.dat.
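
One way to confirm this on a given server is to count the comment and downtime blocks in each file. The following is a minimal sketch, assuming the files use block headers named hostcomment, servicecomment, hostdowntime and servicedowntime (check a sample of your own retention.dat, since the names may differ between Nagios versions). The function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed block names for comments and downtimes; verify against your own files.
BLOCK_TYPES = ("hostcomment", "servicecomment", "hostdowntime", "servicedowntime")

# Matches a block header line such as "hostcomment {".
BLOCK_HEADER = re.compile(r"^\s*(\w+)\s*\{\s*$")


def count_blocks(text):
    """Count comment and downtime blocks in the text of a .dat file."""
    counts = Counter()
    for line in text.splitlines():
        match = BLOCK_HEADER.match(line)
        if match and match.group(1) in BLOCK_TYPES:
            counts[match.group(1)] += 1
    return counts


def count_blocks_in_file(path):
    """Read a status.dat or retention.dat file and count its blocks."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_blocks(handle.read())
```

Running count_blocks_in_file on the retention file of both servers and comparing the results should show whether the comments and downtimes actually made it across.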

One thing to note is that Nagios needs to be stopped when you copy over retention.dat, and started again afterwards. Otherwise the running process would overwrite your copied file with its own state when it stops.

This all assumes that you have the correct retention options set in nagios.cfg.
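
As a rough illustration of what "correct retention options" could be checked for, here is a small sketch that parses nagios.cfg and flags the most obvious problems. It only looks at retain_state_information and state_retention_file; which other retention directives matter for your setup is not covered here, and the helper names are hypothetical.

```python
def parse_nagios_cfg(text):
    """Parse key=value lines from nagios.cfg text, skipping comments and blanks."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Later occurrences of a key overwrite earlier ones in this sketch.
        settings[key.strip()] = value.strip()
    return settings


# Minimal set of directives this sketch checks; not an exhaustive list.
REQUIRED = {"retain_state_information": "1"}


def retention_problems(settings):
    """Return a list of human-readable problems with the retention settings."""
    problems = []
    for key, expected in REQUIRED.items():
        actual = settings.get(key)
        if actual != expected:
            problems.append(f"{key} is {actual!r}, expected {expected!r}")
    if "state_retention_file" not in settings:
        problems.append("state_retention_file is not set")
    return problems
```

An empty list from retention_problems means the two directives checked here look sane; it does not prove the rest of the configuration is right.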

Re: Comments and Downtime failover synchronization

Posted: Mon Aug 05, 2013 4:59 pm
by mrollins13
@scottwilkerson - TY! You nailed it! The only change I had to make was stopping the failover nagios, then sync'ing the retention file. After starting the nagios process again all of the downtime and comments showed up. Our automated startup script does a verify and restart anyway, so no problem adding this into the process.
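
For anyone finding this later, the order of operations described above (stop, sync, verify, start) could be sketched roughly as follows. This is not the actual startup script; the host name, the paths and the use of `service nagios stop/start` are assumptions for illustration and will differ per installation. It defaults to a dry run that only prints the commands.

```python
import subprocess


def build_sync_commands(primary_host, retention_path, nagios_bin, nagios_cfg):
    """Build the command sequence: stop, copy retention file, verify, start."""
    return [
        # Stop the failover Nagios first so it cannot overwrite the copied file.
        ["service", "nagios", "stop"],
        # Pull the retention file from the primary server (hypothetical host).
        ["rsync", "-a", f"{primary_host}:{retention_path}", retention_path],
        # Verify the configuration before starting again.
        [nagios_bin, "-v", nagios_cfg],
        ["service", "nagios", "start"],
    ]


def run_sync(commands, dry_run=True):
    """Print the commands, or run them in order and abort on the first failure."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # All values below are placeholders, not taken from a real setup.
    plan = build_sync_commands(
        "nagios-primary",
        "/usr/local/nagios/var/retention.dat",
        "/usr/local/nagios/bin/nagios",
        "/usr/local/nagios/etc/nagios.cfg",
    )
    run_sync(plan, dry_run=True)
```

Because check=True aborts on the first failing command, a failed rsync or config verify would leave Nagios stopped, so a real script would probably want to handle that case explicitly.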