Regarding the HA configuration
-
kgopiramesh
Hi All,
We are configuring a high-availability cluster with a master and slave configuration in our environment, and we have a few queries about it.
1) If we enable NSCA for outbound and inbound connections, will that sync updates of hosts, services, perfdata and notification contacts (users and email address details) to the slave server, or do we need to run discovery from the slave node as well?
2) If we monitor the master server from the slave server and abnormal behaviour is found on the master, is there a default external command or script (event handler) available to enable host checks, service checks and the alert mechanism on the slave?
3) Please let us know if any additional steps need to be carried out to achieve this.
-
sreinhardt
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: Regarding the HA configuration
1) Unfortunately, it will do none of these, with the exception of creating objects for hosts and services in unconfigured objects as results are passed to the slave server. It is possible to share the nagiosql db between systems and trigger an apply config on the slave when the master updates; however, this gets into other issues, such as making sure that your slave is not also constantly doing active checks.
2) Not at this time. It is something we are looking at internally, but again, not at this time.
3) This depends entirely on how you are planning to set up HA and what exactly you are looking to do. Without a much more detailed idea of what you intend to set up, we can't provide many more caveats you might run into.
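For context on point 1: NSCA only transports check results, one tab-separated line per result, which is why it cannot carry contacts or other configuration. A minimal sketch of building such a line is below. The `nsca_line` helper name and the host and service values are illustrative assumptions, and the `send_nsca` call is shown only as a comment because the address and config path depend on the installation.

```shell
#!/bin/sh
# Format one passive service check result the way send_nsca expects it on
# stdin: host<TAB>service<TAB>return code<TAB>plugin output.
# nsca_line is a hypothetical helper name, not part of NSCA itself.
nsca_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Forwarding a result to the slave might look like this (the address and the
# config path are assumptions for illustration):
#   nsca_line web01 "Disk Usage" 0 "DISK OK" | send_nsca -H 192.0.2.10 -c /usr/local/nagios/etc/send_nsca.cfg
```

Nothing in that line describes how the host or service is defined, so the slave still needs its own matching configuration.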
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
-
kgopiramesh
Re: Regarding the HA configuration
Hi,
Thank you so much for your reply. We are planning to implement HA using the Weber approach (http://www.slideshare.net/nagiosinc/mike-weber-failover), slides 5 and 6. With this approach, all the hosts and services have to be created on the slave, and their check data will be updated on the slave through NSCA. Am I correct?
Please clarify the queries below:
1) Every time we create a service on the master Nagios server, do we need to create it on the slave as well, or will it be replicated through NSCA?
2) Will new users, templates, host groups, etc. created on the master be replicated to the slave through NSCA?
3) Would slide 22 of the presentation mentioned above suffice for disabling and enabling notifications when the master goes down and comes back up?
4) How will the slave's data be updated on the master server?
-
sreinhardt
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: Regarding the HA configuration
kgopiramesh wrote:Please clarify the queries below:
1) Every time we create a service on the master Nagios server, do we need to create it on the slave as well, or will it be replicated through NSCA?
2) Will new users, templates, host groups, etc. created on the master be replicated to the slave through NSCA?
NSCA will do absolutely nothing to configure your slave server, short of providing you an (easier?) option through unconfigured objects if you wish to use it. So yes, you will have to run the wizard, import configs, or otherwise bring the same config to the slave server. This does not include anything to do with contacts, templates, or any other form of configuration for Nagios.
kgopiramesh wrote:3) Would slide 22 of the presentation mentioned above suffice for disabling and enabling notifications when the master goes down and comes back up?
Purely in terms of "will this work": yes. Is it ideal? No, far from it. This will cause all checks on your systems to be run twice, causing additional load on your monitored systems, which is the whole point of forwarding with NSCA. It would be far better to globally enable/disable notifications AND active checks when the master changes state, which it appears Mike's script should handle.
kgopiramesh wrote:4) How will the slave's data be updated on the master server?
At the present time, there is not a great way to do this. Essentially, I think you would need to spool outbound NSCA notifications from the slave to the master server, so that it can process the checks and apply them accordingly. However, this has the very real possibility of still leaving gaps in RRD data, and of results being processed with incorrect times. Per the slides you mentioned, there would be no replication of data from slave to master, with the intention that the slave should only act as master for a short amount of time.
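To make the "globally enable/disable notifications AND active checks when the master changes state" idea concrete, here is a rough sketch of what a host event handler on the slave could do. It is not Mike's script from the slides. The function names are hypothetical, and the command file path is the usual source-install location, which is an assumption to check against `command_file` in nagios.cfg. The external command names themselves are standard Nagios Core commands.

```shell
#!/bin/sh
# Assumed location of the external command file; verify command_file in nagios.cfg.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Write one external command in the "[timestamp] COMMAND" format.
send_cmd() {
    printf '[%s] %s\n' "$(date +%s)" "$1" >> "$CMD_FILE"
}

# Hypothetical handler: $1 = state of the master host (UP, DOWN, ...),
# $2 = state type (SOFT or HARD). Only HARD states trigger a change.
handle_master_state() {
    [ "$2" = "HARD" ] || return 0
    case "$1" in
        DOWN)
            # Master is down: the slave takes over checks and alerting.
            send_cmd START_EXECUTING_HOST_CHECKS
            send_cmd START_EXECUTING_SVC_CHECKS
            send_cmd ENABLE_NOTIFICATIONS
            ;;
        UP)
            # Master is back: the slave returns to passive standby.
            send_cmd STOP_EXECUTING_HOST_CHECKS
            send_cmd STOP_EXECUTING_SVC_CHECKS
            send_cmd DISABLE_NOTIFICATIONS
            ;;
    esac
}
```

In an event handler command definition, the two arguments would typically be filled from the `$HOSTSTATE$` and `$HOSTSTATETYPE$` macros of the host object that represents the master.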
-
kgopiramesh
Re: Regarding the HA configuration
Hi All,
I am seeing a difference between the Nagios XI GUI and the nagios.cfg file on the backend. We have to set up the Nagios HA configuration, and we have disabled active checks from the GUI, but in nagios.cfg the active checks are still shown as enabled. Please find the attached for reference.
cat /usr/local/etc/nagios/nagios.cfg | grep -i execute
execute_host_checks=1
execute_service_checks=1
cat nagios.cfg | grep -i notifications
enable_notifications=1
log_notifications=1
How can we disable and enable active checks in Nagios XI from the command line?
Re: Regarding the HA configuration
That is WAI (working as intended), as changing active check status while core is running (i.e. from the web UI, command pipe, etc.) affects the status.dat file, not the configs.
Code: Select all
grep "active_host\|active_service" /usr/local/nagios/var/status.dat
kgopiramesh wrote:How to disable and enable the active checks in nagios xi from the command line.
This can be done by writing to the command pipe:
http://nagios.sourceforge.net/docs/3_0/extcommands.html
http://old.nagios.org/developerinfo/ext ... ndlist.php
http://old.nagios.org/developerinfo/ext ... mand_id=68
http://old.nagios.org/developerinfo/ext ... mand_id=42
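As a rough illustration of writing to the command pipe, the sketch below toggles active host and service checks program-wide. The function names are hypothetical, and the command file path is the usual source-install default, which is an assumption; the real path is whatever `command_file` is set to in nagios.cfg. The `START_EXECUTING_*` and `STOP_EXECUTING_*` names are standard Nagios Core external commands.

```shell
#!/bin/sh
# Assumed location of the external command file; verify command_file in nagios.cfg.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Write one external command in the "[timestamp] COMMAND" format.
send_cmd() {
    printf '[%s] %s\n' "$(date +%s)" "$1" >> "$CMD_FILE"
}

# Hypothetical helper: turn all active checks on or off program-wide.
set_active_checks() {
    case "$1" in
        on)
            send_cmd START_EXECUTING_HOST_CHECKS
            send_cmd START_EXECUTING_SVC_CHECKS
            ;;
        off)
            send_cmd STOP_EXECUTING_HOST_CHECKS
            send_cmd STOP_EXECUTING_SVC_CHECKS
            ;;
        *)
            echo "usage: set_active_checks on|off" >&2
            return 1
            ;;
    esac
}
```

After calling `set_active_checks off`, the grep against status.dat shown above should reflect the change once Nagios has processed the commands, while nagios.cfg stays untouched.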
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
-
kgopiramesh
Re: Regarding the HA configuration
Thanks for the reply. With the links given above, I was able to disable and enable the active service and host checks.
I have configured the services on the slave server from the services in unconfigured objects. Now, when I look at the service detail page, active checks are disabled for some services. How can I enable active checks for those services?
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: Regarding the HA configuration
You would enable active checks like we just showed you: via the command pipe, through a scripted solution that runs when your other XI server fails. Or you can activate them for each host/service through the CCM. Keep in mind that the server is not actively checking right now; you just added passive hosts/services, which means they are getting passive results back from the other XI server.
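For the per-host or per-service case, the command pipe also accepts `ENABLE_HOST_CHECK` and `ENABLE_SVC_CHECK`, which are standard Nagios Core external commands. A minimal sketch follows; the `enable_active_check` helper name, the example host and service names, and the command file path are assumptions for illustration.

```shell
#!/bin/sh
# Assumed location of the external command file; verify command_file in nagios.cfg.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Write one external command in the "[timestamp] COMMAND" format.
send_cmd() {
    printf '[%s] %s\n' "$(date +%s)" "$1" >> "$CMD_FILE"
}

# Hypothetical helper: $1 = host name, $2 = optional service description.
# With a service description it enables that service's active checks,
# otherwise it enables the host's active checks.
enable_active_check() {
    if [ -n "${2:-}" ]; then
        send_cmd "ENABLE_SVC_CHECK;$1;$2"
    else
        send_cmd "ENABLE_HOST_CHECK;$1"
    fi
}
```

For example, `enable_active_check web01 "Disk Usage"` would target a single service, where `web01` and `Disk Usage` stand in for names that exist in your own configuration.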
You are also going to need to make sure that every host/service you monitor that required access to be set up so that Nagios can connect (such as allowing the XI server's address for NRPE checks) also allows the second, failover server.
-
kgopiramesh
Re: Regarding the HA configuration
Thanks for the reply. I understand how to enable and disable active checks from the Nagios server.
However, even though active checks are enabled globally on the Nagios slave server, some services report that active checks are disabled for that particular service. Please find the attached for reference.
How do I configure active checks for a particular service from the CCM? I am unable to find the option to do so. I have even changed the service template to linux_snmp_storage from the default one added for the passive object.
The issue was resolved using the Advanced menu of the service detail page.
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Regarding the HA configuration
If you disable active checks through the Advanced tab, this is a memory-level setting that will take precedence over any configuration changes.
The only way I know of to remove these in-memory items is to remove the retention.dat file while the nagios service is stopped (this will set everything to a pending state until new check results come in).
Code: Select all
service nagios stop
rm -f /usr/local/nagios/var/retention.dat
service nagios start