Issue with manual and scheduled suppression
Posted: Fri Mar 18, 2016 6:08 am
Hello all,
We are running Nagios Core 4.0.8 in our live environment and recently ran into an issue: a user scheduled suppression (downtime), manual suppression was then applied, and the two cancelled each other out. I ran some tests and could reproduce the bug on our dev system as well. Below are the logs, with headings.
I did 3 tests this morning trying to replicate this. Here are the steps from one of those tests.
Downtime Scheduled
[1455193232] EXTERNAL COMMAND: SCHEDULE_HOST_DOWNTIME;<hostname>;1455193800;1455199200;1;0;7200;<user>;xvdfv
[1455193232] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;<hostname>;Temperature-Check;1455193800;1455199200;1;0;7200;<user>;xvdfv
[1455193232] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;<hostname>;Humidity-Check;1455193800;1455199200;1;0;7200;<user>;xvdfv
[1455193232] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;<hostname>;ping;1455193800;1455199200;1;0;7200;<user>;xvdfv
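For anyone reading the raw epoch values, here is a minimal sketch that decodes the SCHEDULE_HOST_DOWNTIME line above. It assumes the field order given in the Nagios external command documentation (host, start, end, fixed, trigger ID, duration, author, comment). The field labels and helper names are my own, not part of Nagios.

```python
from datetime import datetime, timezone

# Assumed field order for SCHEDULE_HOST_DOWNTIME; the labels are illustrative.
FIELDS = ["host_name", "start_time", "end_time", "fixed",
          "trigger_id", "duration", "author", "comment"]


def parse_host_downtime(line):
    """Split a SCHEDULE_HOST_DOWNTIME log line into named fields."""
    _, _, command = line.partition("EXTERNAL COMMAND: ")
    # Limit the split so a comment containing ';' stays in one piece.
    name, *args = command.strip().split(";", len(FIELDS))
    if name != "SCHEDULE_HOST_DOWNTIME":
        raise ValueError(f"unexpected command: {name}")
    return dict(zip(FIELDS, args))


def utc(epoch):
    """Format an epoch timestamp as a UTC date and time."""
    moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


line = ("[1455193232] EXTERNAL COMMAND: SCHEDULE_HOST_DOWNTIME;"
        "<hostname>;1455193800;1455199200;1;0;7200;<user>;xvdfv")
d = parse_host_downtime(line)
window = int(d["end_time"]) - int(d["start_time"])

print(utc(d["start_time"]), "->", utc(d["end_time"]))
print("fixed:", d["fixed"] == "1", "window seconds:", window)
# 2016-02-11 12:30:00 -> 2016-02-11 14:00:00
# fixed: True window seconds: 5400
```

So the downtime was fixed and scheduled to run for 90 minutes (times shown in UTC). As far as I understand, the 7200 duration field only applies to flexible downtime.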
Device enters period of scheduled downtime
[1455193799] HOST DOWNTIME ALERT: <hostname>;STARTED; Host has entered a period of scheduled downtime
[1455193799] SERVICE DOWNTIME ALERT: <hostname>;ping;STARTED; Service has entered a period of scheduled downtime
[1455193799] SERVICE DOWNTIME ALERT: <hostname>;Temperature-Check;STARTED; Service has entered a period of scheduled downtime
[1455193799] SERVICE DOWNTIME ALERT: <hostname>;Humidity-Check;STARTED; Service has entered a period of scheduled downtime
Device monitoring manually disabled
[1455195893] EXTERNAL COMMAND: DISABLE_HOST_SVC_NOTIFICATIONS;<hostname>
[1455195893] EXTERNAL COMMAND: DISABLE_HOST_NOTIFICATIONS;<hostname>
[1455195893] EXTERNAL COMMAND: DISABLE_HOST_SVC_NOTIFICATIONS;<hostname>
[1455195893] EXTERNAL COMMAND: DISABLE_HOST_NOTIFICATIONS;<hostname>
Device goes offline and starts to alert
[1455195954] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;SOFT;1;UNKNOWN - Can't establish SNMP connection
[1455195964] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;SOFT;1;UNKNOWN - Can't establish SNMP connection
[1455196011] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;SOFT;1;UNKNOWN - Can't establish SNMP connection
[1455196038] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;SOFT;1;UNKNOWN - Can't establish SNMP connection
[1455196067] SERVICE ALERT: <hostname>;ping;CRITICAL;SOFT;1;PING CRITICAL - Packet loss = 100%
[1455196108] SERVICE ALERT: <hostname>;ping;CRITICAL;SOFT;1;PING CRITICAL - Packet loss = 100%
[1455196134] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;SOFT;2;UNKNOWN - Can't establish SNMP connection
[1455196144] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;SOFT;2;UNKNOWN - Can't establish SNMP connection
[1455196190] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;SOFT;2;UNKNOWN - Can't establish SNMP connection
[1455196217] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;SOFT;2;UNKNOWN - Can't establish SNMP connection
[1455196247] SERVICE ALERT: <hostname>;ping;CRITICAL;SOFT;2;PING CRITICAL - Packet loss = 100%
[1455196288] SERVICE ALERT: <hostname>;ping;CRITICAL;SOFT;2;PING CRITICAL - Packet loss = 100%
[1455196313] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;HARD;3;UNKNOWN - Can't establish SNMP connection
[1455196324] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;HARD;3;UNKNOWN - Can't establish SNMP connection
[1455196371] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;HARD;3;UNKNOWN - Can't establish SNMP connection
[1455196398] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;HARD;3;UNKNOWN - Can't establish SNMP connection
[1455196427] SERVICE ALERT: <hostname>;ping;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 100%
[1455196468] SERVICE ALERT: <hostname>;ping;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 100%
High severity TT cut
[1455196468] SERVICE NOTIFICATION: cut-ticket-sev2-criticial;<hostname>;ping;CRITICAL;cut-service-tt;PING CRITICAL - Packet loss = 100%
[1455196474] EXTERNAL COMMAND: ADD_SVC_COMMENT;<hostname>;ping;1;Nagios;<web address to cut TT to>
Manual monitoring enabled
[1455196823] EXTERNAL COMMAND: ENABLE_HOST_SVC_NOTIFICATIONS;<hostname>
[1455196823] EXTERNAL COMMAND: ENABLE_HOST_NOTIFICATIONS;<hostname>
Scheduled downtime cancelled
[1455196848] HOST DOWNTIME ALERT: <hostname>;CANCELLED; Scheduled downtime for host has been cancelled.
[1455196848] SERVICE DOWNTIME ALERT: <hostname>;ping;CANCELLED; Scheduled downtime for service has been cancelled.
[1455196848] SERVICE DOWNTIME ALERT: <hostname>;Humidity-Check;CANCELLED; Scheduled downtime for service has been cancelled.
[1455196848] SERVICE DOWNTIME ALERT: <hostname>;Temperature-Check;CANCELLED; Scheduled downtime for service has been cancelled.
Device brought back online
[1455197028] SERVICE ALERT: <hostname>;Temperature-Check;OK;HARD;3;TEMPERATURE OK: 2 sensor(s) in normal scope
[1455197038] SERVICE ALERT: <hostname>;Humidity-Check;OK;HARD;3;HUMIDITY OK: 2 sensor(s) in normal scope
[1455197084] SERVICE ALERT: <hostname>;Temperature-Check;OK;HARD;3;TEMPERATURE OK: 2 sensor(s) in normal scope
[1455197111] SERVICE ALERT: <hostname>;Humidity-Check;OK;HARD;3;HUMIDITY OK: 2 sensor(s) in normal scope
[1455197241] SERVICE ALERT: <hostname>;ping;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 27.15 ms
[1455197282] SERVICE ALERT: <hostname>;ping;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 28.65 ms
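To make the timing easier to see, here is a small sketch that places the key events from the logs above on the scheduled downtime window. The timestamps are copied from the log lines. The labels and the helper function are just for illustration, and this is only arithmetic on the log, not a diagnosis of the cause.

```python
# Scheduled window taken from the SCHEDULE_HOST_DOWNTIME command.
DOWNTIME_START = 1455193800
DOWNTIME_END = 1455199200

# (epoch, label) pairs copied from the log excerpts; labels are my own.
EVENTS = [
    (1455195893, "notifications disabled"),
    (1455196468, "ping notification sent"),
    (1455196823, "notifications enabled"),
    (1455196848, "downtime cancelled"),
]


def place_on_window(events, start, end):
    """Return (label, seconds after start, inside window?) for each event."""
    return [(label, ts - start, start <= ts < end) for ts, label in events]


rows = place_on_window(EVENTS, DOWNTIME_START, DOWNTIME_END)
for label, after_start, inside in rows:
    print(f"{label:<24} +{after_start:>4}s  inside window: {inside}")

enabled_to_cancel = 1455196848 - 1455196823
remaining = DOWNTIME_END - 1455196848
print("enable -> cancel gap:", enabled_to_cancel, "s")
print("scheduled time left at cancel:", remaining, "s")
# enable -> cancel gap: 25 s
# scheduled time left at cancel: 2352 s
```

In this test the downtime was cancelled 25 seconds after notifications were re-enabled, with about 39 minutes of the scheduled window still remaining. The ping notification also went out while the downtime was active and notifications were disabled.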
Regards