[Nagios-devel] [PATCH] Distinguish between warning and critical
Posted: Wed Nov 18, 2009 12:36 am
This patch addresses the issue I asked about in this thread:
http://article.gmane.org/gmane.network. ... user/65141
Currently, Nagios does not distinguish between warnings and criticals in
service escalations. This can cause problems with escalation chains as
shown by this example.
define serviceescalation {
    host_name               hostname
    service_description     servicename
    first_notification      3
    last_notification       0
    escalation_options      c,u,r
}
Currently, a service that sends 3 notifications while in WARNING and then
enters CRITICAL will match this service escalation, because
first_notification counts all notifications regardless of state. The
behavior I am looking for (and was expecting) is that the escalation
matches only after the 3rd critical or unknown notification.
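To make the problem concrete, here is a minimal sketch of the kind of range check involved. It is a simplification written for this post, not the actual Nagios source: the struct, field, and function names are hypothetical, and the special handling of recovery notifications is left out. The only rules it relies on are that first_notification is a lower bound and that last_notification 0 means "no upper bound".

```c
#include <stdbool.h>

/* Service states, numbered as in Nagios. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/* Hypothetical, simplified view of a service escalation definition. */
struct escalation {
    int  first_notification;
    int  last_notification;   /* 0 = no upper bound */
    bool on_warning;          /* escalation_options w */
    bool on_critical;         /* escalation_options c */
    bool on_unknown;          /* escalation_options u */
};

/* True if notification number n falls inside [first, last]; last == 0 is open-ended. */
static bool in_range(int first, int last, int n)
{
    if (n < first)
        return false;
    if (last != 0 && n > last)
        return false;
    return true;
}

/* Simplified match: the state must be enabled and the TOTAL count must be in range. */
static bool escalation_matches(const struct escalation *e, int state,
                               int notification_number)
{
    if (state == STATE_WARNING && !e->on_warning)
        return false;
    if (state == STATE_CRITICAL && !e->on_critical)
        return false;
    if (state == STATE_UNKNOWN && !e->on_unknown)
        return false;
    return in_range(e->first_notification, e->last_notification,
                    notification_number);
}
```

With first_notification 3 and options c,u, the first CRITICAL notification that follows three WARNING notifications is notification number 4, so escalation_matches() returns true even though it is only the first critical one.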
The attached patch (applies cleanly to 3.0.6 and to HEAD as of yesterday)
adds the ability to specify service escalations that match after a
specified number of critical or warning notifications. For example:
define serviceescalation {
    host_name                       hostname
    service_description             servicename
    first_critical_notification     3
    last_critical_notification      0
    escalation_options              c,u,r
}
The patch adds 4 configuration directives to service escalation
definitions:
first_warning_notification #
last_warning_notification #
first_critical_notification #
last_critical_notification #
These behave identically to (first|last)_notification, except that they
check against the count of warning or critical notifications instead of
the total number of notifications.
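As a rough illustration of the per-state check, the sketch below keeps three counters the way the notifications.c hunk in the patch does, and tests the critical counter against a first/last pair. The struct and function names are made up for this example, and how the real patch combines the new directives with the existing ones is not shown here, so treat the match function as an assumption.

```c
#include <stdbool.h>

/* Service states, numbered as in Nagios. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/* Hypothetical stand-in for the counters the patch adds to a service. */
struct notif_counters {
    int total;      /* current_notification_number */
    int warning;    /* current_warning_notification_number */
    int critical;   /* current_critical_notification_number */
};

/* Count one notification sent while the service is in `state`. */
static void count_notification(struct notif_counters *c, int state)
{
    c->total++;
    if (state == STATE_WARNING)
        c->warning++;
    if (state == STATE_CRITICAL)
        c->critical++;
}

/* Same range rule as (first|last)_notification: last == 0 is open-ended. */
static bool in_range(int first, int last, int n)
{
    if (n < first)
        return false;
    if (last != 0 && n > last)
        return false;
    return true;
}

/* Assumed check for (first|last)_critical_notification. */
static bool critical_range_matches(int first_critical, int last_critical,
                                   const struct notif_counters *c)
{
    return in_range(first_critical, last_critical, c->critical);
}
```

After three WARNING notifications and one CRITICAL, the total counter is 4 but the critical counter is 1, so a first_critical_notification of 3 does not match until two more critical notifications have gone out.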
The behavior of the current directives is unchanged. Existing
deployments should not need to be modified with this patch applied.
I've run some tests of this patch against the 3.0.6 stable release, and it
seems to be working fine. It ran overnight without any complaints in the
logs, and the behavior is as I expect it to be.
Suggestions for improvements welcome.
-Gius
Attachment: nagios_escalations.patch
diff -ur nagiosclean/base/checks.c nagiospatched/base/checks.c
--- nagiosclean/base/checks.c 2009-08-11 09:56:39.000000000 -0700
+++ nagiospatched/base/checks.c 2009-11-16 11:28:45.000000000 -0800
@@ -1294,6 +1294,8 @@
temp_service->last_notification=(time_t)0;
temp_service->next_notification=(time_t)0;
temp_service->current_notification_number=0;
+ temp_service->current_warning_notification_number=0;
+ temp_service->current_critical_notification_number=0;
temp_service->problem_has_been_acknowledged=FALSE;
temp_service->acknowledgement_type=ACKNOWLEDGEMENT_NONE;
temp_service->notified_on_unknown=FALSE;
diff -ur nagiosclean/base/notifications.c nagiospatched/base/notifications.c
--- nagiosclean/base/notifications.c 2008-11-30 09:22:58.000000000 -0800
+++ nagiospatched/base/notifications.c 2009-11-17 15:28:45.000000000 -0800
@@ -98,10 +98,19 @@
/* should the notification number be increased? */
if(type==NOTIFICATION_NORMAL || (options & NOTIFICATION_OPTION_INCREMENT)){
svc->current_notification_number++;
+ /* also increment the warning/critical state counter */
+ if (svc->current_state == STATE_WARNING) {
+ svc->current_warning_notification_number++;
+ }
+ if (svc->current_state == STATE_CRITICAL) {
+ svc->current_critical_notification_number++;
+ }
increment_notification_number=TRUE;
}
log_debug_info(DEBUGL_NOTIFICATIONS,1,"Current notification number: %d (%s)\n",svc->current_notification_number,(increment_notification_number==TRUE)?"incremented":"unchanged");
+ log_debug_info(DEBUGL_NOTIFICATIONS,1,"Current warning notification number: %d (%s)\n",svc->current_warning_notification_number,(increment_notification_number==TRUE)?"incremented":"unchanged");
+ log_debug_info(DEBUGL_NOTIFICATIONS,1,"Current critical notification number: %d (%s)\n",svc->current_critical_notification_number,(increment_notification_number==TRUE)?"incremented":"unchanged");
/* save and increase the current notification id */
svc->cu
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]