Send alerts if service is down for more than 10 mins

Posted: Mon Nov 07, 2016 10:35 am
by kaushalshriyan
Hi,

I have defined a service template, "HAProxy-service", in the templates.cfg file. I am running Nagios 4.2.0 in my setup.

define service{
    name                            HAProxy-service ; The 'name' of this service template
    active_checks_enabled           1               ; Active service checks are enabled
    passive_checks_enabled          1               ; Passive service checks are enabled/accepted
    parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
    obsess_over_service             1               ; We should obsess over this service (if necessary)
    check_freshness                 0               ; Default is to NOT check service 'freshness'
    notifications_enabled           1               ; Service notifications are enabled
    event_handler_enabled           1               ; Service event handler is enabled
    flap_detection_enabled          1               ; Flap detection is enabled
    process_perf_data               1               ; Process performance data
    retain_status_information       1               ; Retain status information across program restarts
    retain_nonstatus_information    1               ; Retain non-status information across program restarts
    is_volatile                     0               ; The service is not volatile
    check_period                    24x7            ; The service can be checked at any time of the day
    max_check_attempts              5               ; Re-check the service up to 5 times in order to determine its final (hard) state
    normal_check_interval           3               ; Check the service every 3 minutes under normal conditions
    retry_check_interval            1               ; Re-check the service every one minutes until a hard state can be determined
    contact_groups                  admins          ; Notifications get sent out to everyone in the 'admins' group
    notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
    notification_interval           60              ; Re-notify about service problems every hour
    notification_period             24x7            ; Notifications can be sent out at any time
    register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}

Are the settings below correct if I want an alert only when the service has been down for more than 10 minutes?

max_check_attempts 5 ; Re-check the service up to 5 times in order to determine its final (hard) state
normal_check_interval 3 ; Check the service every 3 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every one minutes until a hard state can be determined


Please advise. Any help would be highly appreciated. Thanks in advance.

Regards,

Kaushal

Re: Send alerts if service is down for more than 10 mins

Posted: Mon Nov 07, 2016 3:25 pm
by tgriep
With the settings you have,

max_check_attempts 5 ; Re-check the service up to 5 times in order to determine its final (hard) state
normal_check_interval 3 ; Check the service every 3 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every one minutes until a hard state can be determined
Once Nagios detects that the service is down, it re-checks it every minute (retry_check_interval) until max_check_attempts is reached. So the notification goes out after roughly 5 minutes (retry_check_interval multiplied by max_check_attempts).
If you want the notification to happen at 10 minutes, you would change max_check_attempts to 10.
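
The arithmetic above can be sketched in a few lines of Python. This is a rough model, not Nagios code: the function name is hypothetical, and it assumes that the first failed check counts as attempt 1 (so only max_check_attempts - 1 retries follow it) and that the outage can begin at any point between two normal checks. Under those assumptions the 5-minute figure is a round approximation that falls inside the window the sketch computes.

```python
def notification_window(max_check_attempts, check_interval, retry_interval):
    """Return (earliest, latest) minutes from outage start to the hard state.

    Assumptions of this sketch: the first failed check is attempt 1,
    retries follow every retry_interval minutes, and check execution
    time and scheduling jitter are ignored.
    """
    # Time spent retrying after the first failed check.
    retry_time = (max_check_attempts - 1) * retry_interval
    # Best case: the service fails just before a scheduled check.
    earliest = retry_time
    # Worst case: the service fails just after a check that returned OK.
    latest = check_interval + retry_time
    return earliest, latest


# Settings from the original post: 5 attempts, 3-minute checks, 1-minute retries.
print(notification_window(5, 3, 1))   # (4, 7)
# Suggested change: 10 attempts.
print(notification_window(10, 3, 1))  # (9, 12)
```

With max_check_attempts set to 10, this model gives a window of 9 to 12 minutes, which brackets the 10-minute target.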

Re: Send alerts if service is down for more than 10 mins

Posted: Mon Nov 07, 2016 5:06 pm
by kaushalshriyan
Thanks, tgriep, for the reply. Could you also explain normal_check_interval 3 (check the service every 3 minutes under normal conditions)?

Are the settings below correct as per your recommendation?
max_check_attempts 10 ; Re-check the service up to 10 times in order to determine its final (hard) state
normal_check_interval 3 ; Check the service every 3 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every one minutes until a hard state can be determined
Thanks again in advance. Please advise on the normal_check_interval 3 parameter.

Regards,

Kaushal

Re: Send alerts if service is down for more than 10 mins

Posted: Mon Nov 07, 2016 5:14 pm
by tgriep
The check_interval (normal_check_interval in your template) is how often the Nagios process checks the service while it is in an OK state. With your settings, that check runs every three minutes.
Take a look at this link, which describes all of the object definitions for hosts, services, etc.
https://assets.nagios.com/downloads/nag ... tions.html
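
To illustrate how the normal interval and the retry interval interact, here is a small Python sketch of the check timeline around an outage. It is a simplified model rather than Nagios behavior taken from the source code: the function name and the outage_start parameter are hypothetical, checks are assumed to run at exact multiples of the normal interval, and the first failed check is assumed to count as attempt 1.

```python
def check_schedule(outage_start, max_check_attempts, check_interval, retry_interval):
    """List (minute, attempt, state) tuples for checks around an outage.

    Simplified model: checks run every check_interval minutes while the
    service is OK, then every retry_interval minutes once a check fails,
    until max_check_attempts is reached and the state becomes HARD.
    """
    events = []
    t = 0
    # Normal checks while the service is still up.
    while t < outage_start:
        events.append((t, 0, "OK"))
        t += check_interval
    # The first failed check is attempt 1; retries follow at retry_interval.
    for attempt in range(1, max_check_attempts + 1):
        state = "HARD" if attempt == max_check_attempts else "SOFT"
        events.append((t, attempt, state))
        t += retry_interval
    return events


# Service goes down at minute 4, with the settings from the original post.
for minute, attempt, state in check_schedule(4, 5, 3, 1):
    print(minute, attempt, state)
```

In this run the OK checks land at minutes 0 and 3, the first failure is seen at minute 6, and the hard state (the point at which a notification would be sent) is reached at minute 10, six minutes after the outage began.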