Scheduled Downtime is still sending notifications
Posted: Tue Aug 25, 2015 9:24 am
by cesarpball
Hello nagios geeks!
I've recently upgraded one Nagios box from 3.8 to 4.0.8.
Yesterday and today we had patching interventions, so I scheduled downtime for the host and all of its services. However, I still received a notification for one service while it was down!
This happened yesterday too. It's quite odd. Any suggestions?
Code: Select all
[b][1440500468] SERVICE DOWNTIME ALERT: ad.stage;windows_services;STARTED; Service has entered a period of scheduled downtime[/b]
[1440502528] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;1;CRITICAL:
[1440502558] SERVICE ALERT: ad.stage;windows_services;OK;SOFT;2;OK: All services are in their appropriate state.
[1440502852] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;1;CRITICAL:
[1440502858] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;1;CRITICAL:
[1440502882] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;2;CRITICAL:
[1440502888] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;2;CRITICAL:
[1440502913] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;3;CRITICAL:
[1440502918] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;3;CRITICAL:
[1440502942] SERVICE ALERT: ad.stage;windows_services;CRITICAL;HARD;4;CRITICAL:
[1440502948] SERVICE ALERT: ad.stage;windows_services;CRITICAL;HARD;4;CRITICAL:
[b][1440502948] SERVICE NOTIFICATION: notification-email-alert;ad.stage;windows_services;CRITICAL;notify-by-email;CRITICAL:[/b]
[1440503251] SERVICE ALERT: ad.stage;windows_services;OK;HARD;4;OK: All services are in their appropriate state.
[1440503253] SERVICE ALERT: ad.stage;windows_services;OK;HARD;4;OK: All services are in their appropriate state.
[1440503253] SERVICE NOTIFICATION: notification-email-alert;ad.stage;windows_services;OK;notify-by-email;OK: All services are in their appropriate state.
[b][1440509457] SERVICE DOWNTIME ALERT: ad.stage;windows_services;STOPPED; Service has exited from a period of scheduled downtime[/b]
Any help?
Re: Scheduled Downtime is still sending notifications
Posted: Tue Aug 25, 2015 1:06 pm
by tgriep
The duplicate service alerts happening seconds apart suggest that the configs were duplicated during the upgrade.
Can you check the config settings and see whether that happened?
Can you also run the following and post the output here?
Code: Select all
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
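For anyone following along, the "duplicate alerts seconds apart" pattern can be spotted mechanically. The following is only a minimal Python sketch, assuming the log line format shown in the first post; the function name `find_repeated_alerts` and the 10-second window are illustrative choices, not anything provided by Nagios.

```python
import re

# Matches lines such as:
# [1440502852] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;1;CRITICAL:
ALERT_RE = re.compile(
    r"\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);([^;]+);([^;]+);(\d+);"
)


def find_repeated_alerts(lines, window=10):
    """Return (first_ts, second_ts, key) for identical alerts logged
    within `window` seconds of each other (window is an arbitrary choice)."""
    last_seen = {}
    repeats = []
    for line in lines:
        m = ALERT_RE.search(line)
        if not m:
            continue
        ts = int(m.group(1))
        # key = (host, service, state, state type, attempt number)
        key = m.groups()[1:]
        prev = last_seen.get(key)
        if prev is not None and ts - prev <= window:
            repeats.append((prev, ts, key))
        last_seen[key] = ts
    return repeats


# Four lines copied from the log excerpt in the first post.
sample = [
    "[1440502852] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;1;CRITICAL:",
    "[1440502858] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;1;CRITICAL:",
    "[1440502882] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;2;CRITICAL:",
    "[1440502888] SERVICE ALERT: ad.stage;windows_services;CRITICAL;SOFT;2;CRITICAL:",
]

for first, second, key in find_repeated_alerts(sample):
    print(first, second, ";".join(key))
```

Each reported pair is the same host, service, state, and attempt number logged twice a few seconds apart, which is what you would expect to see if the check is being run twice.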
Re: Scheduled Downtime is still sending notifications
Posted: Wed Aug 26, 2015 10:39 am
by cesarpball
Thanks for your reply,
This is the output
Code: Select all
# /usr/bin/nagios -v /etc/nagios/nagios.cfg
Nagios Core 4.0.8
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2014
License: GPL
Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Checked 100 services.
Checked 305 hosts.
Checked 299 host groups.
Checked 103 service groups.
Checked 707 contacts.
Checked 300 contact groups.
Checked 312 commands.
Checked 10 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 305 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 10 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
I am trying to identify any duplicated configuration file, but I can't find one.
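One way to look for duplicated definitions is to scan the object files for services that share the same target and `service_description`. This is a rough Python sketch under several assumptions: the helper names are hypothetical, the parser only understands plain `define service { ... }` blocks, and it reads every `*.cfg` under a directory you pass in, which may not match the `cfg_file`/`cfg_dir` entries that your nagios.cfg actually loads.

```python
import re
from collections import Counter
from pathlib import Path

DEFINE_RE = re.compile(r"^define\s+service\s*\{")


def parse_service_blocks(text):
    """Parse 'define service { ... }' blocks into a list of dicts."""
    blocks, current = [], None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop inline ';' comments
        if not line or line.startswith("#"):
            continue
        if DEFINE_RE.match(line):
            current = {}
        elif line == "}" and current is not None:
            blocks.append(current)
            current = None
        elif current is not None:
            parts = line.split(None, 1)
            current[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return blocks


def duplicate_services(blocks):
    """Return (target, service_description) pairs defined more than once.
    Templates (register 0) are ignored."""
    counts = Counter(
        (b.get("host_name") or b.get("hostgroup_name"), b["service_description"])
        for b in blocks
        if "service_description" in b and b.get("register", "1") != "0"
    )
    return [key for key, n in counts.items() if n > 1]


def load_cfg_dir(cfg_dir):
    """Concatenate every *.cfg file below cfg_dir (assumed layout)."""
    return "\n".join(p.read_text() for p in Path(cfg_dir).rglob("*.cfg"))


# Demonstration: the same definition appearing twice.
sample_cfg = """
define service {
    service_description windows_services
    use generic-service
    hostgroup_name windows
}
define service{
    service_description windows_services
    use generic-service
    hostgroup_name windows
}
"""
print(duplicate_services(parse_service_blocks(sample_cfg)))
```

To run it against real files, something like `duplicate_services(parse_service_blocks(load_cfg_dir("/etc/nagios")))` would be the starting point. Note that this only catches literal repeats; it would not notice a service reaching the same host once through `host_name` and once through `hostgroup_name`.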
Re: Scheduled Downtime is still sending notifications
Posted: Wed Aug 26, 2015 5:06 pm
by ssax
Please post a sanitized copy of your service definition and any templates that it uses.
Re: Scheduled Downtime is still sending notifications
Posted: Fri Aug 28, 2015 3:26 am
by cesarpball
Code: Select all
define service {
service_description windows_services
display_name Windows Services
check_command check_nrpe!check_services
use generic-service
hostgroup_name windows
_criticality medium
}
I use this template on all the Nagios servers that I have, but it is failing on just one of them.
Code: Select all
# generic service template definition
define service{
use remote
name generic-service ; The 'name' of this service template
;active_checks_enabled 1 ; Active service checks are enabled
;passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
;obsess_over_service 1 ; We should obsess over this service (if necessary)
;check_freshness 1 ; Default is to NOT check service 'freshness'
;freshness_threshold 900
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_interval 15 ; Only send notifications on status change by default.
is_volatile 0
;check_period 24x7
normal_check_interval 10
retry_check_interval 1
max_check_attempts 4
notification_period 24x7
notification_options u,c,r,f
#contact_groups admins
_nrpecheck check_nrpe
_criticality normal
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
Today I got another alert:
Code: Select all
[1440745254] SERVICE DOWNTIME ALERT: windows_pet_serv;windows_services;STARTED; Service has entered a period of scheduled downtime
[1440745741] SERVICE ALERT: windows_pet_serv;windows_services;UNKNOWN;SOFT;1;CHECK_NRPE: Socket timeout after 30 seconds.
[1440745744] SERVICE ALERT: windows_pet_serv;windows_services;CRITICAL;SOFT;2;Connection refused by host
[1440745771] SERVICE ALERT: windows_pet_serv;windows_services;CRITICAL;SOFT;3;Connection refused by host
[1440745801] SERVICE ALERT: windows_pet_serv;windows_services;CRITICAL;HARD;4;Connection refused by host
[1440745818] SERVICE ALERT: windows_pet_serv;windows_services;CRITICAL;SOFT;1;Connection refused by host
[1440745863] SERVICE ALERT: windows_pet_serv;windows_services;CRITICAL;SOFT;2;CRITICAL: SERVICE1: stopped (critical), SERVICE2: stopped (critical), SERVICE3: stopped (critical)
[1440745878] SERVICE ALERT: windows_pet_serv;windows_services;CRITICAL;SOFT;3;CRITICAL: SERVICE1: stopped (critical), SERVICE3: stopped (critical)
[1440745908] SERVICE ALERT: windows_pet_serv;windows_services;CRITICAL;HARD;4;CRITICAL: SERVICE1: stopped (critical), SERVICE3: stopped (critical)
[1440745908] SERVICE NOTIFICATION: internalmonitor-alert;windows_pet_serv;windows_services;CRITICAL;service-notify-by-email2;CRITICAL: SERVICE1: stopped (critical), UALSVC: stopped (critical)
[1440745908] SERVICE NOTIFICATION: internalmonitor-alert;windows_pet_serv;windows_services;CRITICAL;notify-by-email;CRITICAL: SERVICE1: stopped (critical), UALSVC: stopped (critical)
[1440746101] SERVICE ALERT: windows_pet_serv;windows_services;OK;HARD;4;OK: All services are in their appropriate state.
[1440746208] SERVICE ALERT: windows_pet_serv;windows_services;OK;HARD;4;OK: All services are in their appropriate state.
[1440746208] SERVICE NOTIFICATION: internalmonitor-alert;windows_pet_serv;windows_services;OK;service-notify-by-email2;OK: All services are in their appropriate state.
[1440746208] SERVICE NOTIFICATION: internalmonitor-alert;windows_pet_serv;windows_services;OK;notify-by-email;OK: All services are in their appropriate state.
Thanks very much for all your help!
Re: Scheduled Downtime is still sending notifications
Posted: Fri Aug 28, 2015 12:22 pm
by tgriep
When downtime is scheduled, seeing alerts in the log is normal, but you should not receive a notification (email) during that time.
Did you receive an email when that service was down during the scheduled downtime?
Here is a quick description of scheduled downtime:
https://assets.nagios.com/downloads/nag ... ntime.html
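To separate the two cases (alerts, which are expected, versus notifications, which are not), the log can be scanned for `SERVICE NOTIFICATION` entries that fall between a downtime `STARTED` entry and the matching `STOPPED` entry. This is a minimal Python sketch that assumes the log format posted earlier in this thread; the function name is made up for illustration.

```python
import re

DOWNTIME_RE = re.compile(
    r"\[(\d+)\] SERVICE DOWNTIME ALERT: ([^;]+);([^;]+);(STARTED|STOPPED|CANCELLED);"
)
# Fields after the colon: contact;host;service;state;command;output
NOTIFY_RE = re.compile(
    r"\[(\d+)\] SERVICE NOTIFICATION: ([^;]+);([^;]+);([^;]+);([^;]+);"
)


def notifications_during_downtime(lines):
    """Return (timestamp, contact, host, service, state) for every
    notification logged while that host/service pair was in downtime."""
    in_downtime = set()
    hits = []
    for line in lines:
        m = DOWNTIME_RE.search(line)
        if m:
            key = (m.group(2), m.group(3))
            if m.group(4) == "STARTED":
                in_downtime.add(key)
            else:
                in_downtime.discard(key)
            continue
        m = NOTIFY_RE.search(line)
        if m and (m.group(3), m.group(4)) in in_downtime:
            hits.append(
                (int(m.group(1)), m.group(2), m.group(3), m.group(4), m.group(5))
            )
    return hits


# Lines taken from the first log excerpt in this thread.
sample = [
    "[1440500468] SERVICE DOWNTIME ALERT: ad.stage;windows_services;STARTED; Service has entered a period of scheduled downtime",
    "[1440502948] SERVICE ALERT: ad.stage;windows_services;CRITICAL;HARD;4;CRITICAL:",
    "[1440502948] SERVICE NOTIFICATION: notification-email-alert;ad.stage;windows_services;CRITICAL;notify-by-email;CRITICAL:",
    "[1440503253] SERVICE NOTIFICATION: notification-email-alert;ad.stage;windows_services;OK;notify-by-email;OK: All services are in their appropriate state.",
    "[1440509457] SERVICE DOWNTIME ALERT: ad.stage;windows_services;STOPPED; Service has exited from a period of scheduled downtime",
]

for hit in notifications_during_downtime(sample):
    print(hit)
```

The `SERVICE ALERT` line is skipped on purpose, since state changes are still logged during downtime. Only the two notification lines are reported, and those are the ones that should not have been sent.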
Re: Scheduled Downtime is still sending notifications
Posted: Tue Sep 01, 2015 4:24 am
by cesarpball
Yep,
That's the problem. We are getting emails while the service is in scheduled downtime, but only for one of the monitored services (not for all of them)!
Re: Scheduled Downtime is still sending notifications
Posted: Tue Sep 01, 2015 11:49 am
by tgriep
There is another template called "remote". Can you post that?
Re: Scheduled Downtime is still sending notifications
Posted: Wed Sep 02, 2015 3:11 am
by cesarpball
Hello,
This is the other one:
Code: Select all
# cat remote/service-remote.cfg
define service {
name remote
active_checks_enabled 1
passive_checks_enabled 1
check_period 24x7
check_freshness 0
_nagios_url https://nagios.myserver.net/nagios/cgi-bin/extinfo.cgi?type=2&
freshness_threshold 900
register 0
}
Could the problem be in this one?
Thanks
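Since the service pulls settings from two templates (`generic-service`, which in turn uses `remote`), it can help to see the merged result. The sketch below is a simplified Python model of how `use` inheritance is resolved, with values defined locally overriding inherited ones; it is not Nagios code, it ignores additive (`+`) inheritance, and the dictionaries contain only a subset of the directives posted in this thread.

```python
def resolve(name, templates, _seen=None):
    """Merge a definition with everything it inherits through 'use'.
    Local values win; with several parents, the first listed wins."""
    seen = _seen or set()
    if name in seen:
        raise ValueError(f"circular 'use' chain at {name}")
    seen = seen | {name}
    own = templates[name]
    merged = {}
    parents = [p.strip() for p in own.get("use", "").split(",") if p.strip()]
    for parent in parents:
        for key, value in resolve(parent, templates, seen).items():
            merged.setdefault(key, value)
    # 'name' and 'register' are not inherited; 'use' is only bookkeeping here.
    merged.update(
        {k: v for k, v in own.items() if k not in ("use", "name", "register")}
    )
    return merged


# Subset of the definitions posted in this thread, keyed by an
# illustrative label (template name or service_description).
templates = {
    "remote": {
        "name": "remote",
        "active_checks_enabled": "1",
        "passive_checks_enabled": "1",
        "check_period": "24x7",
        "check_freshness": "0",
        "register": "0",
    },
    "generic-service": {
        "name": "generic-service",
        "use": "remote",
        "notifications_enabled": "1",
        "notification_interval": "15",
        "max_check_attempts": "4",
        "notification_period": "24x7",
        "notification_options": "u,c,r,f",
        "_criticality": "normal",
        "register": "0",
    },
    "windows_services": {
        "use": "generic-service",
        "service_description": "windows_services",
        "check_command": "check_nrpe!check_services",
        "hostgroup_name": "windows",
        "_criticality": "medium",
    },
}

effective = resolve("windows_services", templates)
for key in sorted(effective):
    print(f"{key:28} {effective[key]}")
```

The merged view shows `_criticality` taking the service's own value (`medium`) rather than the template's, while `check_period` arrives from `remote` two levels up.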
Re: Scheduled Downtime is still sending notifications
Posted: Wed Sep 02, 2015 3:43 pm
by tgriep
Your configs look good so far. Let's try stopping the Nagios process to see whether a stuck process is still running.
Run this in a shell:
Code: Select all
service nagios stop
killall -9 nagios
service nagios start
If this doesn't work, I will need all of the config files to find it.
You can PM the files to me if you like.
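Before killing anything, it can be worth checking whether more than one Nagios daemon is actually running. Nagios Core 4 runs worker processes alongside the main daemon, so simply counting `nagios` processes is not conclusive. The Python sketch below instead counts process trees: it takes the text output of `ps -eo pid,ppid,comm` and returns the `nagios` processes whose parent is not itself a `nagios` process. The function name and the sample PIDs are invented for illustration, and the approach assumes the process name is exactly `nagios`.

```python
def nagios_roots(ps_output, name="nagios"):
    """Return PIDs of `name` processes whose parent is not also a
    `name` process. Expects the output of: ps -eo pid,ppid,comm"""
    procs = {}
    for line in ps_output.strip().splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skip the header row and malformed lines
        pid, ppid, comm = int(parts[0]), int(parts[1]), parts[2].strip()
        procs[pid] = (ppid, comm)
    named = {pid for pid, (_, comm) in procs.items() if comm == name}
    return sorted(pid for pid in named if procs[pid][0] not in named)


# Invented sample: one daemon (2001) with two workers, plus a second,
# unrelated daemon (3107) left over from before the upgrade.
sample_ps = """
  PID  PPID COMMAND
    1     0 init
 2001     1 nagios
 2002  2001 nagios
 2003  2001 nagios
 3107     1 nagios
"""

roots = nagios_roots(sample_ps)
print(roots)
if len(roots) > 1:
    print("More than one Nagios process tree is running")

# On a live system the input could be collected with:
#   import subprocess
#   out = subprocess.run(["ps", "-eo", "pid,ppid,comm"],
#                        capture_output=True, text=True).stdout
```

More than one root would be consistent with the paired alerts seen in the logs, with each daemon running its own copy of the check.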