Hello experts,
We have an enterprise license for unlimited nodes for Nagios XI.
We are planning to set up high availability. Since our servers are on AWS, we plan to do it as follows:
We will build a second server, identical to production, as a backup and install Nagios XI on it. The Nagios services on it will be kept stopped.
We will schedule a cron job to rsync the Nagios-related files from the production server to the backup server every 10 minutes, so that all files are up to date.
We will configure MySQL replication with a master (active) and a slave (passive).
So when the master server goes down, we will just have to trigger a script to start Nagios on the backup server.
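To make the last step concrete, here is a rough sketch of what such a trigger script might look like. It only starts the backup after several consecutive failed probes of the master, so that both servers are less likely to run checks at the same time. The master address, the probe port, and the service names are assumptions for illustration, not values from our setup.

```python
import socket
import subprocess
import time

# Hypothetical values: replace with your own master address and probe port.
MASTER_HOST = "10.0.0.10"
MASTER_PORT = 22
# Assumed service names; verify them against your own Nagios XI install.
SERVICES = ["nagios", "npcd"]


def master_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_fail_over(probe_results, required_failures=3):
    """Fail over only when the last `required_failures` probes all failed."""
    if len(probe_results) < required_failures:
        return False
    return not any(probe_results[-required_failures:])


def start_services(services, runner=subprocess.run):
    """Start each service in order; with subprocess.run a failure raises."""
    for name in services:
        runner(["systemctl", "start", name], check=True)


def run_once(attempts=3, delay_seconds=10):
    """Probe the master a few times and start the backup if every probe fails."""
    results = []
    for _ in range(attempts):
        results.append(master_is_reachable(MASTER_HOST, MASTER_PORT))
        time.sleep(delay_seconds)
    if should_fail_over(results, required_failures=attempts):
        start_services(SERVICES)
        return True
    return False
```

Calling `run_once()` on the backup server (by hand or from a scheduler) would perform one round of probes. A TCP probe alone cannot tell a dead master from a network partition, so a manual confirmation step may still be wise before starting the backup.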
My first question:
Will the licensing work on the backup Nagios server if we give it the same IP address as the master server after the master goes down?
My second question:
Is this a good solution? If you have a better solution for AWS, please let us know.
Third question:
Suppose one of the services was already down when the master goes down and we switch to the backup server. Will the backup server check the service again, send the alert, and trigger the event?
Sorry for the long question.
Please let me know if anything I have written is unclear.
Thanks a lot for your help
License question
Re: License question
"My first question: Will the licensing work on the backup Nagios server if we give it the same IP address as the master server after the master goes down?"
Please contact [email protected] regarding licensing questions.
"My second question: Is this a good solution? If you have a better solution for AWS, please let us know."
We generally don't support HA environments; we pass them on to a partner, Linbit, who handles something like that - https://www.nagios.com/news/2015/10/pre ... nagios-xi/
From a technical standpoint, what you mentioned sounds correct. You'll want to make sure that the two servers are not executing checks at the same time against the same database, though, as that could cause corruption. SQL + the flat files are the most important parts to have replicated.
"Third question: Suppose one of the services was already down when the master goes down and we switch to the backup server. Will the backup server check the service again, send the alert, and trigger the event?"
I guess this really comes down to timing, and whether initial_state is set. Since you're effectively doing a restart, the status.dat should carry the information needed for it to follow, but it would depend on the setup specifically. It should still respect check intervals, though, and notify accordingly.
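One way to see what state the backup would inherit is to inspect the rsynced status file before starting Nagios. The following is a minimal sketch, assuming the file uses the usual `servicestatus { key=value ... }` block layout. The sample data is made up, and the function names are hypothetical helpers rather than anything shipped with Nagios.

```python
def parse_status_blocks(text):
    """Parse 'name { key=value ... }' blocks into a list of (name, dict) tuples."""
    blocks = []
    current_name = None
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if current_name is None:
            if line.endswith("{"):
                current_name = line[:-1].strip()
                current = {}
        elif line == "}":
            blocks.append((current_name, current))
            current_name = None
        elif "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return blocks


def non_ok_services(blocks):
    """Return (host, service, state) for service blocks whose state is not 0 (OK)."""
    found = []
    for name, fields in blocks:
        if name == "servicestatus" and fields.get("current_state", "0") != "0":
            found.append((
                fields.get("host_name"),
                fields.get("service_description"),
                int(fields["current_state"]),
            ))
    return found


# Made-up sample in the assumed layout; real files contain many more fields.
SAMPLE = """
servicestatus {
	host_name=web01
	service_description=HTTP
	current_state=2
	current_notification_number=1
	}

servicestatus {
	host_name=web01
	service_description=SSH
	current_state=0
	current_notification_number=0
	}
"""

print(non_ok_services(parse_status_blocks(SAMPLE)))
```

If a service that was already down on the master shows up here with a non-zero `current_state` and a non-zero `current_notification_number`, the backup would be starting from that recorded state rather than from scratch. Whether it re-notifies still depends on the notification settings in your configuration.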
Former Nagios Employee