Question about alerts
Posted: Sun Feb 05, 2012 6:43 pm
We had an outage last night, quite a strange one. The Nagios XI VM [VAULT21] (and a few others) lost connectivity to the network, so our monitoring stopped working. We have another Nagios XI VM that is set up specifically to monitor VAULT21 in case this happens, and it sent out an alert to inform us. Good stuff, we were onto the problem straight away.
So at this point VAULT21 could not see the network; when I logged onto the console I could not ping anything. The problem was with our ESXi host and had nothing to do with Nagios.
However, we have VAULT21 set up to send alerts for some specific hosts. Because of the problem, it wanted to send a bunch of alerts, but since it had lost access to the network, these alerts did not go out. Once the problem was resolved, the alerts were all sent out.
Question 1)
Is there a command I can run at the CLI that will disable notifications (like I can do via the tactical overview page ... I just couldn't get to it)? That way I could prevent all these alerts from going out once I get the networking problem fixed.
Question 2)
Do the alerts sit in some sort of queue, waiting for connectivity to the mail server to become available, and then get sent out? How does it work when my scenario occurs?
Question 3)
Is there a way to prevent Nagios from sending alerts if the mail server is not reachable?