Not Getting Windows Server Reboot Notifications

Post by **bryceee** » Thu Dec 04, 2014 2:30 am

Is contact_groups still valid, or has it been replaced?

Okay, I noticed that check-host-alive uses the check_ping command.
When I run the following manually, I get a response:

Code: Select all

root@PERNAGIOS01:/usr/local/nagios/libexec# ./check_ping -H 10.41.30.32 -w 3000.0,80% -c 5000.0,100% -p 5
PING OK - Packet loss = 0%, RTA = 2.67 ms|rta=2.675000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0

I do notice that check_dhcp and check_icmp are listed in red blocks with white writing. Would this cause an issue?

Post by **bryceee** » Thu Dec 04, 2014 3:26 am

I do have a ping_service.cfg file, which is configured as follows.
Would this cause any issues with check-host-alive?

Code: Select all

# check that ping-only hosts are up

define service {
        hostgroup_name                  *
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        use                             generic-service
        notification_interval           30 ; set > 0 if you want to be renotified
        normal_check_interval           1
}

Post by **sreinhardt** » Thu Dec 04, 2014 5:15 pm

contact_groups is fine, and check_dhcp and check_icmp being setuid should be just fine too. If you bring that host down and tail the Nagios log, are you seeing notification alerts in the log?

Code: Select all

tail -f /usr/local/nagios/var/nagios.log | grep -i 'notification'
tail -f /var/log/messages | grep -i 'notification'

Post by **bryceee** » Thu Dec 04, 2014 10:33 pm

Okay, I added some extra DNS servers to my Nagios server in /etc/network/interfaces, as it turns out that PERDC01 was the only DNS server configured for the Nagios server. This probably did not help.

When I shut down the server, I got the alert after it had been down for 5 minutes.

Code: Select all

root@PERNAGIOS01:~# tail -f /usr/local/nagios/var/nagios.log | grep -i 'notification'
[1417749266] HOST NOTIFICATION: nagiosadmin;PERDC01;DOWN;notify-host-by-email;CRITICAL - Host Unreachable (10.41.30.31)

I powered on the server and received an alert 1 minute later.

I then rebooted the server and did not receive an alert. How can I make it alert me on a reboot?
Am I right in thinking it has to do with the -w, -c and -p 5 in the check-host-alive command?
Are there any suggested settings for this?

Code: Select all

define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
        }

I just want to say thank you to everyone who has been assisting me with the ongoing issue.

Post by **Box293** » Thu Dec 04, 2014 11:48 pm

bryceee wrote:Okay, I added some extra DNS servers to my Nagios server in /etc/network/interfaces, as it turns out that PERDC01 was the only DNS server configured for the Nagios server. This probably did not help.

Yeah, that was my suspicion: the Nagios server was relying on the DNS server that was down, and that caused the delay.

Moving on.

You are not getting alerts on reboot because the host never entered a hard state, and notifications are only sent on hard states.
Here's an example:

Host
check_interval = 1
max_check_attempts = 3
retry_interval = 1

1:10pm - Host is checked and detected as UP, next check is at 1:11pm
1:10pm (and 30 seconds) - Host is rebooted, Nagios does not know about it yet
1:11pm - Host check fails, retry interval is 1 so next attempt is at 1:12pm (soft state)
1:12pm - Host check fails, retry interval is 1 so next attempt is at 1:13pm (soft state)
1:12pm (and 20 seconds) - Host is back up, Nagios does not know about it yet
1:13pm - Host check succeeds, host goes back into an UP state

The host would have entered a hard state at 1:13pm if it had failed the 3rd check attempt (max_check_attempts = 3).
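The timeline above can be replayed with a small Python sketch. This is only a simplified model of soft and hard states, not Nagios's actual scheduling code: the function name `simulate_host_checks` and its parameters are made up for illustration, times are minutes after 1:10pm, and renotification and flapping are ignored.

```python
def simulate_host_checks(check_times, down_from, down_until, max_check_attempts=3):
    """Return one (time, state, state_type, notified) tuple per check.

    Simplified model: a host problem only becomes HARD (and notifies)
    once max_check_attempts consecutive checks have failed.
    """
    results = []
    failures = 0  # consecutive failed checks so far
    for t in check_times:
        host_is_down = down_from <= t < down_until
        if host_is_down:
            failures += 1
            state_type = "HARD" if failures >= max_check_attempts else "SOFT"
            # Notify once, on the check that turns the problem into a hard state.
            notified = failures == max_check_attempts
            results.append((t, "DOWN", state_type, notified))
        else:
            # A recovery notification is only sent if the problem was hard.
            notified = failures >= max_check_attempts
            state_type = "SOFT" if 0 < failures < max_check_attempts else "HARD"
            results.append((t, "UP", state_type, notified))
            failures = 0
    return results


# Reboot case: checks every minute, host down from +0.5 to about +2.33 minutes.
reboot = simulate_host_checks([0, 1, 2, 3], down_from=0.5, down_until=2 + 20 / 60)

# Longer outage: the host stays down past the 3rd failed attempt.
outage = simulate_host_checks([0, 1, 2, 3, 4], down_from=0.5, down_until=10)

for row in reboot + outage:
    print(row)
```

In the reboot case no check is ever flagged as notified, because only two attempts fail before the host answers again. In the longer outage the third failed attempt is the one that notifies.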

If you want to know whether a server has been rebooted, you can add an uptime check. This example goes critical if the system has been running for less than one day.

Code: Select all

Command:
check_nrpe -H 192.168.142.1 -t 30 -c CheckUpTime -a MinCrit=1d

Output:
CRITICAL: uptime: 0:21 < critical|'uptime'=1263000;0;86400000
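To read the numbers in that output: a rough Python sketch, assuming the performance data follows the usual `label=value;warn;crit` layout and that the values are in milliseconds (which is what makes 1263000 line up with the 0:21 shown and 86400000 with the one-day threshold). The helper name `parse_perfdata` is made up for illustration.

```python
from datetime import timedelta


def parse_perfdata(perfdata):
    """Split "'label'=value;warn;crit" into (label, value, warn, crit)."""
    label, _, values = perfdata.partition("=")
    value, warn, crit = (int(v) for v in values.split(";"))
    return label.strip("'"), value, warn, crit


label, value_ms, warn_ms, crit_ms = parse_perfdata("'uptime'=1263000;0;86400000")

# Assumption: the plugin reports milliseconds.
uptime = timedelta(milliseconds=value_ms)
crit_threshold = timedelta(milliseconds=crit_ms)

print(label, uptime, crit_threshold)
print("below critical threshold:", uptime < crit_threshold)
```

Under that assumption the host had been up for about 21 minutes, which is below the one-day MinCrit threshold, hence the CRITICAL result.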

Does this help / make sense?

Post by **bryceee** » Fri Dec 05, 2014 12:19 am

Okay, that is making sense. Thank you for taking the time to explain it.

Is it worthwhile adding retry_interval = 1 to the generic-host template? Mine looks like this:

Code: Select all


define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        check_command                   check-host-alive
        max_check_attempts              3
        check_interval                  1
        notification_interval           30
        notification_options            d,u,r
        contact_groups                  admins
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

So the check below will tell me if a server has rebooted within the last day, but will it alert me while it's rebooting?

Code: Select all

Command:
check_nrpe -H 192.168.142.1 -t 30 -c CheckUpTime -a MinCrit=1d

Output:
CRITICAL: uptime: 0:21 < critical|'uptime'=1263000;0;86400000

Post by **Box293** » Fri Dec 05, 2014 12:48 am

No worries.

bryceee wrote:Is it worthwhile adding retry_interval = 1 to the generic-host template?

Any setting in a template is overridden if it is also defined in the host/service object itself.
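As a loose illustration of that precedence, here is a Python sketch that treats a template and a host as plain dictionaries. It is not how Nagios parses its config, and it ignores multi-level and additive inheritance. The function name `resolve_object` and the PERDC01 values are hypothetical; the template values come from the generic-host definition posted above.

```python
def resolve_object(template, obj):
    """Merge a template's directives with an object's own directives."""
    resolved = dict(template)  # start from the inherited values
    resolved.update(obj)       # directives set on the object itself win
    return resolved


# Values taken from the generic-host template in this thread.
generic_host = {
    "check_interval": "1",
    "max_check_attempts": "3",
    "notification_interval": "30",
}

# Hypothetical host that adds retry_interval and overrides max_check_attempts.
perdc01 = {
    "host_name": "PERDC01",
    "retry_interval": "1",
    "max_check_attempts": "2",
}

resolved = resolve_object(generic_host, perdc01)
for directive, value in sorted(resolved.items()):
    print(directive, value)
```

So a retry_interval added to the template only takes effect on hosts that do not set their own.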

bryceee wrote:but will it alert me while it's rebooting?

If the service you set up for this uptime check has similar check intervals, then you will be notified within a couple of minutes of the reboot.

Post by **bryceee** » Fri Dec 05, 2014 1:38 am

I'll give it a try and see what happens.

Apparently our old system used to alert within a few seconds, but the generic host configs are the same.

Just for my sake, what does this mean?
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5

Specifically the -w, -c and -p.

Post by **bryceee** » Fri Dec 05, 2014 3:18 am

Just for my sake, what does this mean?
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5

Specifically the -w, -c and -p.

I just had a look at the man page and found my answer on what that means.
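For anyone else reading: -w and -c each take a pair of round-trip average in milliseconds and packet loss percentage (warning and critical respectively), and -p is the number of packets to send. The Python sketch below is only an approximation of how such a pair classifies a result. The function names are made up, and the use of "greater than or equal" comparisons is an assumption about the plugin's behaviour.

```python
def parse_threshold(text):
    """Split a check_ping style "<rta>,<pl>%" pair into (rta_ms, loss_pct)."""
    rta, loss = text.split(",")
    return float(rta), float(loss.rstrip("%"))


def classify_ping(rta_ms, loss_pct, warn, crit):
    """Classify a ping result against warning and critical threshold pairs."""
    warn_rta, warn_loss = parse_threshold(warn)
    crit_rta, crit_loss = parse_threshold(crit)
    # Assumption: a threshold is breached when the value reaches or exceeds it.
    if rta_ms >= crit_rta or loss_pct >= crit_loss:
        return "CRITICAL"
    if rta_ms >= warn_rta or loss_pct >= warn_loss:
        return "WARNING"
    return "OK"


WARN, CRIT = "3000.0,80%", "5000.0,100%"

healthy = classify_ping(2.675, 0, WARN, CRIT)    # the manual run earlier in the thread
lossy = classify_ping(2.675, 80, WARN, CRIT)     # 4 of 5 packets lost
dead = classify_ping(0.0, 100, WARN, CRIT)       # no replies at all

print(healthy, lossy, dead)
```

With these thresholds the host check only goes critical when every one of the 5 packets is lost or the average round trip reaches 5 seconds.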

Post by **Box293** » Fri Dec 05, 2014 4:07 am

Great, let us know how you go. The uptime check is a guaranteed notification that the server was rebooted.

Nagios Support Forum