High Availability solutions for Nagios XI servers
Our design is to have a primary Nagios XI server and a secondary Nagios XI server as a hot standby. If the primary Nagios XI server stops working for any reason, we would like the secondary Nagios XI server to take over as the primary with the same monitoring.
How can we keep these two Nagios XI servers in sync so that when the secondary takes over, it has the same set of monitors and configuration data (from the MySQL DB, flat files, etc., e.g., for hosts and services)?
We can regularly run "backup_xi.sh" on the primary Nagios XI server to create a backup and copy it to the secondary Nagios XI server. Can we then run "restore_xi.sh" with this backup on the secondary to load the monitors, configuration data, and anything else needed to sync it with the primary, so that the same monitoring continues when the secondary takes over?
Is this option feasible? What might break, or what should we watch out for, when restoring a backup from one Nagios XI server onto another, assuming the two servers are built the same? Are there other options that are not so complicated? Thanks in advance!
Re: High Availability solutions for Nagios XI servers
xlin125 wrote: Are there other options that are not so complicated?
If you're a VMware user, I hear they have a very nice 'High Availability' option for applications like this.
If you have not taken a look at this already, I recommend watching the following video: https://youtu.be/KW5Qkl8brcA?t=3m29s
Some key points to consider for your type of setup:
- The primary XI server must send a scheduled backup to the secondary server daily.
- The secondary XI server restores from the primary's backup; the restore can be initiated manually or automatically (via event handlers).
- All of your agents must be accessible by both Nagios XI servers.
- All passive agents must be configured to send to both Nagios XI servers.
Steps:
>(1) Deploy and initialize the Primary XI Server.
>(2) Configure the Primary XI Server (monitoring settings and so on).
>(3) Deploy and initialize the Secondary XI Server.
>(4) Configure the "Scheduled Backup Component" on the Primary XI Server to send backups over SSH.
>(5) Add a host check for the Primary XI Server on the Secondary XI Server.
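Once backups are arriving on the secondary, the restore side could look roughly like the sketch below. It only picks the newest backup archive in a directory and builds the restore command. The backup directory, the script path, and the `.tar.gz` naming are assumptions about a typical install, so check them against your own servers before using anything like this.

```python
import subprocess
from pathlib import Path
from typing import List, Optional

# Assumed locations on the secondary server; adjust to your install.
BACKUP_DIR = Path("/store/backups/nagiosxi")
RESTORE_SCRIPT = "/usr/local/nagiosxi/scripts/restore_xi.sh"


def newest_backup(backup_dir: Path) -> Optional[Path]:
    """Return the most recently modified *.tar.gz file, or None if there is none."""
    candidates = [p for p in backup_dir.glob("*.tar.gz") if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def restore_command(backup: Path) -> List[str]:
    """Build the command line that restores the given backup archive."""
    return [RESTORE_SCRIPT, str(backup)]


def restore_latest() -> None:
    """Restore the newest backup; meant to be called from a daily cron job."""
    latest = newest_backup(BACKUP_DIR)
    if latest is None:
        raise SystemExit("no backup archive found in %s" % BACKUP_DIR)
    subprocess.run(restore_command(latest), check=True)
```

Calling `restore_latest()` from a small wrapper script run by cron on the secondary would give you the daily restore. Selecting by modification time is a design choice here; selecting by file name would work too if your backups are named by timestamp.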
Options regarding how to fail over:
1. Do not run Nagios on the secondary XI server (service nagios stop). Check the primary XI server with a cron job, and start Nagios on the secondary (service nagios start) only when the primary fails.
2. Disable active and passive checks on the secondary server and check the primary with a cron job. When the primary is down, enable checks.
3. Disable notifications on the secondary (allowing all checks to still run). When the primary is down, an event handler should turn notifications on.
In all of the above scenarios, a cron job would restore Nagios XI on the secondary from the primary's backup daily, rather than only when a failure occurs.
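As a rough illustration of option 1, the cron-driven check on the secondary might be sketched as below. The host name, the state file path, and the threshold of three consecutive failures are all hypothetical values chosen for the example, and a simple ping only shows that the primary host is reachable, not that Nagios itself is healthy there.

```python
import subprocess
from pathlib import Path

PRIMARY_HOST = "xi-primary.example.com"            # hypothetical host name
STATE_FILE = Path("/var/tmp/xi_primary_failures")  # hypothetical state file
FAILURE_THRESHOLD = 3                              # assumed; tune to taste


def primary_is_up(host: str) -> bool:
    """Send one ping (Linux syntax) and report whether the host answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def next_failure_count(previous: int, is_up: bool) -> int:
    """Reset the counter on success, otherwise count one more failure."""
    return 0 if is_up else previous + 1


def should_take_over(failure_count: int, threshold: int = FAILURE_THRESHOLD) -> bool:
    """Take over only after enough consecutive failed checks."""
    return failure_count >= threshold


def read_count(path: Path) -> int:
    """Read the stored failure count; treat a missing or bad file as zero."""
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return 0


def check_once() -> None:
    """One cron run: update the counter and start Nagios if the threshold is hit."""
    count = next_failure_count(read_count(STATE_FILE), primary_is_up(PRIMARY_HOST))
    STATE_FILE.write_text(str(count))
    if should_take_over(count):
        subprocess.run(["service", "nagios", "start"], check=True)
```

Running `check_once()` every minute from cron would start Nagios on the secondary after roughly three minutes of failed pings. Requiring several consecutive failures is there to avoid a takeover on a single dropped packet.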
I assume there will be questions about the above - send them my way and I'm happy to answer them.
Re: High Availability solutions for Nagios XI servers
Thank you for the response with very detailed and helpful information!
Our two Nagios XI servers will be on VMs. I understand that VMware can provide High Availability capability to help us achieve this goal. I am also evaluating other options for a high availability solution for Nagios XI servers, in case one of those works for us.
The presentation by Andy Brist on "High Availability and Failover Solutions for Nagios XI - Nagios" is excellent! Thanks for the recommendation. I am just wondering where I can get a copy of his PPT. It looks like we also need to copy additional files from the primary Nagios XI server to the secondary one, besides the backup (an output from "backup_xi.sh") taken on the primary Nagios XI server. Do you have these additional files? I saw some in Andy's talk.
I am going to take your suggestions and the information in Andy's talk and give it a try. I am sure I will have more questions. Can you keep this post/ticket open for a few more days so that I can post additional questions as I learn more? Thanks!
Re: High Availability solutions for Nagios XI servers
A PPT copy can be found here: http://www.slideshare.net/nagiosinc/nag ... -solutions
xlin125 wrote: It looks like we also need to copy additional files from the primary Nagios XI server to the secondary one, besides the backup (an output from "backup_xi.sh") taken on the primary Nagios XI server. Do you have these additional files?
I am not sure which files you are referring to. Could you point me to the portion of the video that you're talking about?
xlin125 wrote: Can you have this post/ticket open for a few more days so that I can post additional questions as I learn more?
No problem at all, I will keep this thread open and we'll do our best to answer any questions you might have.
Re: High Availability solutions for Nagios XI servers
Our primary and secondary XI servers each have a host-based license installed. If we take a backup (the result of backup_xi.sh) from the primary Nagios XI server, copy it to the secondary Nagios XI server, and run restore_xi.sh with this backup, does the primary's Nagios XI license get restored onto the secondary, overwriting the Nagios XI license installed there? Where is the Nagios XI license stored: in the database or in a flat file? Thanks!
jdalrymple
Re: High Availability solutions for Nagios XI servers
xlin125 wrote: If we take a backup (result of backup_xi.sh) from the primary Nagios XI server, and copy it to the secondary Nagios XI server and run restore_xi.sh with this backup, does the Nagios XI license for the primary Nagios XI server get copied (restored) on the secondary Nagios XI server (also overwrite the Nagios XI license installed on the secondary Nagios XI)?
Yes on all points.
xlin125 wrote: Where is the Nagios XI license stored? database or flat file?
We don't get terribly excited about sharing the particulars, especially on the general forum.
Re: High Availability solutions for Nagios XI servers
Thanks for the response!
So, if the primary's Nagios XI license gets restored onto the secondary Nagios XI server (overwriting the license installed there), will this cause any license issues on the secondary Nagios XI server?
Box293
Re: High Availability solutions for Nagios XI servers
With your XI License, you are allowed three instances of it:
Production
Disaster Recovery
Test and Dev
These are tied to the IP addresses of the XI servers you activate with. As long as the XI server you are using has an IP address that it has been activated with, there will not be a problem.