We are attempting to configure a second Nagios server as a live failover in case the primary Nagios server becomes unresponsive. We are able to maintain live status information for hosts and services on the failover server by passing results via NSCA. We also have both servers configured to use the same NagiosQL database, so they are able to write out the same configuration files.
*** We are unable to find a way to replicate the downtime and comments for hosts and services to the failover server. ***
We have attempted to rsync the retention file and to turn on the retention settings in nagios.cfg; however, this does not bring over the comments and downtimes.
The NagiosQL database does not appear to contain any tables for storing these.
QUESTIONS:
1) How are the comments and downtime stored in Nagios?
2) How can these be replicated to another server in realtime (or close to it) to support a failover server configuration?
Comments and Downtime failover synchronization
-
- Posts: 5
- Joined: Wed Jul 31, 2013 9:48 am
Re: Comments and Downtime failover synchronization
1. The comments are stored in the file:
/usr/local/nagios/var/status.dat
2. I don't have a great suggestion here, but you could netmount the file to redundant WAN storage . . .
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
-
- Posts: 5
- Joined: Wed Jul 31, 2013 9:48 am
Re: Comments and Downtime failover synchronization
We are aware of the status file; however, the documentation states "This file is deleted every time Nagios stops and recreated when it starts.", which leads me to believe this is a volatile location and is not the storage mechanism used to record downtime and comments.
If "the status file" is the only answer, then I suppose the next question would be why the retention file is not being read in properly and restoring the comments/downtime as the documentation indicates should happen.
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Comments and Downtime failover synchronization
They should be in both status.dat AND retention.dat.
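A quick way to confirm this on either server is to count the comment and downtime blocks in each file. The following is only a rough sketch: the block names (`hostcomment`, `servicecomment`, `hostdowntime`, `servicedowntime`) are what I would expect to see in these files, and the function name is made up for illustration, so check both against your own status.dat and retention.dat.

```python
import re
import sys

# Block names ASSUMED to mark comments and downtimes in status.dat and
# retention.dat; verify them against the files on your own server.
BLOCK_TYPES = ("hostcomment", "servicecomment", "hostdowntime", "servicedowntime")

# Matches a block opener such as "hostcomment {" on a line of its own.
BLOCK_START = re.compile(r"\s*(\w+)\s*\{\s*$")


def count_blocks(text):
    """Count comment/downtime blocks in the text of a status or retention file."""
    counts = {name: 0 for name in BLOCK_TYPES}
    for line in text.splitlines():
        match = BLOCK_START.match(line)
        if match and match.group(1) in counts:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_blocks.py /path/to/retention.dat
    with open(sys.argv[1]) as handle:
        print(count_blocks(handle.read()))
```

If the counts for retention.dat are all zero on the primary, the problem is on the writing side (retention settings) rather than in the copy to the failover.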
One thing to note is that Nagios needs to be stopped when you copy over retention.dat, and started again afterwards; otherwise it would overwrite your changes with its own retention data before it stopped.
This all assumes that you have all the correct retention options set in nagios.cfg.
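To check the retention options without reading through the whole config by hand, something like the sketch below could be run against nagios.cfg on both servers. The directive names are standard nagios.cfg options, but which of them matter for comments and downtime specifically is my assumption, and the helper names are hypothetical.

```python
import sys

# Directives ASSUMED to be relevant for restoring comments and downtime;
# adjust this list to match the documentation for your Nagios version.
EXPECTED = {
    "retain_state_information": "1",
    "use_retained_program_state": "1",
    "use_retained_scheduling_info": "1",
}


def parse_cfg(text):
    """Parse key=value lines from nagios.cfg text, skipping comments and blanks."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def retention_problems(text):
    """Return a list of human-readable problems with the retention settings."""
    settings = parse_cfg(text)
    problems = []
    for key, want in EXPECTED.items():
        got = settings.get(key)
        if got != want:
            problems.append(f"{key} is {got!r}, expected {want!r}")
    if not settings.get("state_retention_file"):
        problems.append("state_retention_file is not set")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_retention.py /usr/local/nagios/etc/nagios.cfg
    with open(sys.argv[1]) as handle:
        for problem in retention_problems(handle.read()):
            print(problem)
```

An empty result only means the listed options are set; it does not prove the retention file is actually being written.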
-
- Posts: 5
- Joined: Wed Jul 31, 2013 9:48 am
Re: Comments and Downtime failover synchronization
@scottwilkerson - TY! You nailed it! The only change I had to make was stopping the failover nagios, then sync'ing the retention file. After starting the nagios process again all of the downtime and comments showed up. Our automated startup script does a verify and restart anyway, so no problem adding this into the process.
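For anyone finding this later, the sequence we ended up with looks roughly like the sketch below. It is not our actual startup script: the host name `nagios-primary`, the retention.dat path, and the use of `service nagios stop/start` are all assumptions that will differ per install, and the function names are made up for illustration.

```python
import subprocess

# ASSUMED defaults; adjust the path and service name for your install.
RETENTION_PATH = "/usr/local/nagios/var/retention.dat"
SERVICE_NAME = "nagios"


def build_sync_commands(primary_host, retention_path=RETENTION_PATH, service=SERVICE_NAME):
    """Return the stop, copy and start commands, in the order they must run."""
    return [
        ["service", service, "stop"],
        # Pull the retention file from the primary over rsync.
        ["rsync", "-a", f"{primary_host}:{retention_path}", retention_path],
        ["service", service, "start"],
    ]


def sync_retention(primary_host, run=subprocess.check_call, **kwargs):
    """Stop the failover Nagios, copy retention.dat, then start Nagios again."""
    stop, copy, start = build_sync_commands(primary_host, **kwargs)
    run(stop)
    try:
        run(copy)
    finally:
        # Start Nagios again even if the copy failed, so the failover
        # is not left down with a stale but usable retention file.
        run(start)


if __name__ == "__main__":
    # "nagios-primary" is a placeholder host name.
    for command in build_sync_commands("nagios-primary"):
        print(" ".join(command))
```

Run as-is it only prints the three commands; calling `sync_retention("nagios-primary")` on the failover would actually execute them, so it needs to run as a user allowed to stop and start the service.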