Query scaling nagios Core

alex3105
Posts: 103
Joined: Sat Jul 28, 2018 10:54 am

Re: Query scaling nagios Core

Post by alex3105 »

Dear Ssax,

Could you please show the configuration needed so that notifications follow the escalation: every 30 minutes to the security area while the host is down, and every 60 minutes to the other areas?

Greetings.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Query scaling nagios Core

Post by ssax »

These were already posted and should do what you want:

This one notifies seguridad every 30 minutes from the first notification onward:

Code:

define hostescalation {
    hostgroup_name          FW
    first_notification      1
    last_notification       0
    notification_interval   30
    escalation_period       24x7
    escalation_options      d,r
    contact_groups          seguridad
}
This one notifies infraestructura and soporte every 60 minutes from the 2nd notification onward (I used the 2nd notification because the notification_interval on the host is set to 30, so 30*2=60 minutes for their first notification):

Code:

define hostescalation {
    hostgroup_name          FW
    first_notification      2
    last_notification       0
    notification_interval   60
    escalation_period       24x7
    escalation_options      d,r
    contact_groups          infraestructura,soporte
}
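As a rough illustration of how first_notification and last_notification select contact groups, here is a minimal Python sketch. It assumes the documented meaning of the two directives (an escalation applies from first_notification through last_notification, with 0 meaning no upper bound). The class and function names are made up for this example and are not part of Nagios, and it does not model notification_interval at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostEscalation:
    """Hypothetical stand-in for one 'define hostescalation' block."""
    first_notification: int
    last_notification: int  # 0 is treated as "no upper bound"
    contact_groups: tuple

    def applies_to(self, notification_number: int) -> bool:
        # The escalation is in effect from first_notification onward...
        if notification_number < self.first_notification:
            return False
        # ...up to last_notification, unless that is 0 (open ended).
        return (self.last_notification == 0
                or notification_number <= self.last_notification)


# The two escalations posted above, reduced to the fields used here.
ESCALATIONS = (
    HostEscalation(1, 0, ("seguridad",)),
    HostEscalation(2, 0, ("infraestructura", "soporte")),
)


def groups_for(notification_number, escalations=ESCALATIONS):
    """Return the contact groups whose escalation range covers this number."""
    groups = []
    for esc in escalations:
        if esc.applies_to(notification_number):
            for group in esc.contact_groups:
                if group not in groups:
                    groups.append(group)
    return groups


print(groups_for(1))  # ['seguridad']
print(groups_for(2))  # ['seguridad', 'infraestructura', 'soporte']
```

Changing first_notification in the second entry is a quick way to see which notification number first reaches infraestructura and soporte.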
One thing you need to remember is this:

Because notifications are only sent when a check result is processed, the values you choose for check_interval and notification_interval matter. For example, assume these settings for your host:

Code:

define host {
    use                     linux-server,host-pnp
    host_name               MGMT
    alias                   MGMT
    address                 X.X.X.X
    check_command           check-host-alive
    check_interval          2
    check_period            24x7
    max_check_attempts      3
    retry_interval          1
    notification_interval   30
    notifications_enabled   1
    notification_period     24x7
    notification_options    d,u,r,s
    contact_groups          seguridad
}
The following would occur:
- Assume each of these is a Host Down result

+00:00 - Check #1 - Initial Host Down - (1 of 3 max_check_attempts) (SOFT state) (Notification: No)

+00:01 - Check #2 - retry_interval (one minute later), Host Down 2 of 3, SOFT state, (Notification: No)

+00:02 - Check #3 - Escalation 1 hit - retry_interval (one minute later), Host Down 3 of 3, HARD state, notification sent to seguridad - (Notification 1 - seguridad, 2 minutes since first problem detected)

+00:04 - Check #4 - falls back to check_interval of 2 minutes, Host Down, HARD state; a notification would be attempted, but the first notification was sent less than notification_interval (30 minutes) ago, so none is sent - (Notification: No)

+00:06 - Check #5 - falls back to check_interval of 2 minutes - (Notification: No)

+00:08 - Check #6 - falls back to check_interval of 2 minutes - (Notification: No)

+00:10 - Check #7 - falls back to check_interval of 2 minutes - (Notification: No)

+00:12 - Check #8 - falls back to check_interval of 2 minutes - (Notification: No)

+00:14 - Check #9 - falls back to check_interval of 2 minutes - (Notification: No)

+00:16 - Check #10 - falls back to check_interval of 2 minutes - (Notification: No)

+00:18 - Check #11 - falls back to check_interval of 2 minutes - (Notification: No)

+00:20 - Check #12 - falls back to check_interval of 2 minutes - (Notification: No)

+00:22 - Check #13 - falls back to check_interval of 2 minutes - (Notification: No)

+00:24 - Check #14 - falls back to check_interval of 2 minutes - (Notification: No)

+00:26 - Check #15 - falls back to check_interval of 2 minutes - (Notification: No)

+00:28 - Check #16 - falls back to check_interval of 2 minutes - (Notification: No)

+00:30 - Check #17 - Escalation 1 (again) - If this check really fell on +00:30 exactly, you would get the second notification to seguridad; if the scheduler ran it slightly earlier, the 30 minute notification_interval would not yet have elapsed and the notification would go out on the next check at +00:32 instead. - (Notification 2 - seguridad)

+00:32 - Check #18 - falls back to check_interval of 2 minutes - (Notification: No, unless Notification 2 was deferred to this check)

+00:34 - Check #19 - falls back to check_interval of 2 minutes - (Notification: No)

...

+01:00 - Escalation 1 (again) | Escalation 2 (1st) - (Notification 3 - seguridad, and Notification 1 for infraestructura and soporte)
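
The timeline above can be approximated with a small Python simulation. This is only a simplified sketch, not how Nagios schedules checks internally: it assumes every check returns Host Down, ignores scheduler jitter, time periods, flapping and escalations, and measures notification_interval from the previous notification. The function name simulate and its parameters are invented for this example.

```python
def simulate(minutes, check_interval=2, retry_interval=1,
             max_check_attempts=3, notification_interval=30):
    """Return (check_times, notification_times) in minutes since the first
    failed check, assuming the host is down on every check."""
    checks, notifications = [], []
    now, attempt, last_sent = 0, 1, None
    while now <= minutes:
        checks.append(now)
        # The state becomes HARD once max_check_attempts is reached.
        hard = attempt >= max_check_attempts
        # A notification is only considered when a check result comes in.
        if hard and (last_sent is None
                     or now - last_sent >= notification_interval):
            notifications.append(now)
            last_sent = now
        if hard:
            now += check_interval   # normal schedule after the HARD state
        else:
            attempt += 1
            now += retry_interval   # SOFT state: retry sooner
    return checks, notifications


checks, sent = simulate(64)
print(checks[:5])  # [0, 1, 2, 4, 6]
print(sent)        # [2, 32, 62]
```

Under this model the first notification goes out at +00:02, so the repeat lands on the +00:32 check, the later of the two possibilities described for Notification 2. Trying other check_interval values shows how a notification can slip to the following check when the interval does not line up with the check schedule.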