bryceee wrote: Okay, I added some extra DNS servers to my Nagios Server in the following location: /etc/networks/interfaces, as it turns out that PERDC01 was the only DNS server configured for the Nagios server. This probably did not help.
Yeah, that was my suspicion: the DNS server it was using was down, and that was causing the delay.
Moving on.
You are not getting alerts on reboot because the host never entered a hard state, and Nagios only sends notifications on hard state changes.
Here's an example:
Host
check_interval = 1
max_check_attempts = 3
retry_interval = 1
1:10pm - Host is checked and detected as UP; next check is at 1:11pm
1:10pm (and 30 seconds) - Host is rebooted; Nagios does not know about it yet
1:11pm - Host check fails; retry_interval is 1, so the next attempt is at 1:12pm (soft state)
1:12pm - Host check fails; retry_interval is 1, so the next attempt is at 1:13pm (soft state)
1:12pm (and 20 seconds) - Host is back up; Nagios does not know about it yet
1:13pm - Host check succeeds; host goes back into an UP state
The host would have entered a hard state at 1:13pm if it had failed the 3rd check attempt (max_check_attempts = 3).
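If it helps to see the soft/hard logic laid out, here is a rough Python sketch of the timeline above. It is only a simplified model of the behaviour described in this post, not how Nagios is actually implemented; the function names `simulate_host_checks` and `would_alert` are made up for illustration.

```python
def simulate_host_checks(results, max_check_attempts=3):
    """Label each check result using a simplified soft/hard state model.

    results: list of booleans, True meaning the host check succeeded.
    """
    timeline = []
    failures = 0  # consecutive failed checks so far
    for is_up in results:
        if is_up:
            failures = 0
            timeline.append("UP")
        else:
            failures += 1
            if failures >= max_check_attempts:
                # Enough consecutive failures: the problem becomes a hard state
                timeline.append("DOWN (HARD)")
            else:
                timeline.append(f"DOWN (SOFT {failures}/{max_check_attempts})")
    return timeline


def would_alert(timeline):
    """In this model, only a hard DOWN state is eligible for a notification."""
    return "DOWN (HARD)" in timeline


# The reboot from the example: checks at 1:10, 1:11, 1:12 and 1:13
reboot = simulate_host_checks([True, False, False, True])
# A longer outage where the 3rd attempt also fails
outage = simulate_host_checks([True, False, False, False])

print(reboot)
print(would_alert(reboot), would_alert(outage))
```

The reboot case never gets past the second soft attempt, so nothing is sent, while the longer outage reaches a hard state on the third attempt.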
If you want to know when a server has been rebooted, you can add an uptime check. This example returns a critical alert if the system has been running for less than one day.
Command:
check_nrpe -H 192.168.142.1 -t 30 -c CheckUpTime -a MinCrit=1d
Output:
CRITICAL: uptime: 0:21 < critical|'uptime'=1263000;0;86400000
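To read the numbers after the pipe, here is a small Python sketch that splits the performance data from the output above. It assumes the usual plugin layout of 'label'=value;warn;crit and assumes the values are in milliseconds (1263000 ms is about 21 minutes, which matches the "0:21" in the text). The helper name `parse_perfdata` is made up for illustration.

```python
def parse_perfdata(item):
    """Split one perfdata item of the form 'label'=value;warn;crit."""
    label, _, rest = item.partition("=")
    fields = rest.split(";")
    value = float(fields[0])  # this sample has no unit suffix on the value
    warn = float(fields[1]) if len(fields) > 1 and fields[1] else None
    crit = float(fields[2]) if len(fields) > 2 and fields[2] else None
    return label.strip("'"), value, warn, crit


line = "CRITICAL: uptime: 0:21 < critical|'uptime'=1263000;0;86400000"
text, _, perf = line.partition("|")
label, value, warn, crit = parse_perfdata(perf)

# Assumption: values are milliseconds
uptime_minutes = value / 60000
crit_hours = crit / 3600000

print(label, int(uptime_minutes), crit_hours)
```

Under that assumption the critical threshold works out to 24 hours, which lines up with MinCrit=1d in the command.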
Does this help / make sense?