Issue with manual and scheduled suppression

Posted: Fri Mar 18, 2016 6:08 am
by pubstars
Hello all,

We are running Nagios Core 4.0.8 in our live environment and recently ran into an issue where a user scheduled suppression, then manual suppression was applied, and the two cancelled each other out. I ran some tests and could reproduce the bug on our dev system as well. Below are the logs with headings.

I did 3 tests this morning trying to replicate this. Here are the steps from one of those tests.
Downtime Scheduled
[1455193232] EXTERNAL COMMAND: SCHEDULE_HOST_DOWNTIME;<hostname>;1455193800;1455199200;1;0;7200;<user>;xvdfv
[1455193232] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;<hostname>;Temperature-Check;1455193800;1455199200;1;0;7200;<user>;xvdfv
[1455193232] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;<hostname>;Humidity-Check;1455193800;1455199200;1;0;7200;<user>;xvdfv
[1455193232] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;<hostname>;ping;1455193800;1455199200;1;0;7200;<user>;xvdfv

Device enters period of scheduled downtime
[1455193799] HOST DOWNTIME ALERT: <hostname>;STARTED; Host has entered a period of scheduled downtime
[1455193799] SERVICE DOWNTIME ALERT: <hostname>;ping;STARTED; Service has entered a period of scheduled downtime
[1455193799] SERVICE DOWNTIME ALERT: <hostname>;Temperature-Check;STARTED; Service has entered a period of scheduled downtime
[1455193799] SERVICE DOWNTIME ALERT: <hostname>;Humidity-Check;STARTED; Service has entered a period of scheduled downtime

Device monitoring manually disabled
[1455195893] EXTERNAL COMMAND: DISABLE_HOST_SVC_NOTIFICATIONS;<hostname>
[1455195893] EXTERNAL COMMAND: DISABLE_HOST_NOTIFICATIONS;<hostname>
[1455195893] EXTERNAL COMMAND: DISABLE_HOST_SVC_NOTIFICATIONS;<hostname>
[1455195893] EXTERNAL COMMAND: DISABLE_HOST_NOTIFICATIONS;<hostname>

Device goes offline and starts to alert
[1455195954] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;SOFT;1;UNKNOWN - Can't establish SNMP connection
[1455195964] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;SOFT;1;UNKNOWN - Can't establish SNMP connection
[1455196011] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;SOFT;1;UNKNOWN - Can't establish SNMP connection
[1455196038] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;SOFT;1;UNKNOWN - Can't establish SNMP connection
[1455196067] SERVICE ALERT: <hostname>;ping;CRITICAL;SOFT;1;PING CRITICAL - Packet loss = 100%
[1455196108] SERVICE ALERT: <hostname>;ping;CRITICAL;SOFT;1;PING CRITICAL - Packet loss = 100%
[1455196134] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;SOFT;2;UNKNOWN - Can't establish SNMP connection
[1455196144] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;SOFT;2;UNKNOWN - Can't establish SNMP connection
[1455196190] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;SOFT;2;UNKNOWN - Can't establish SNMP connection
[1455196217] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;SOFT;2;UNKNOWN - Can't establish SNMP connection
[1455196247] SERVICE ALERT: <hostname>;ping;CRITICAL;SOFT;2;PING CRITICAL - Packet loss = 100%
[1455196288] SERVICE ALERT: <hostname>;ping;CRITICAL;SOFT;2;PING CRITICAL - Packet loss = 100%
[1455196313] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;HARD;3;UNKNOWN - Can't establish SNMP connection
[1455196324] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;HARD;3;UNKNOWN - Can't establish SNMP connection
[1455196371] SERVICE ALERT: <hostname>;Temperature-Check;UNKNOWN;HARD;3;UNKNOWN - Can't establish SNMP connection
[1455196398] SERVICE ALERT: <hostname>;Humidity-Check;UNKNOWN;HARD;3;UNKNOWN - Can't establish SNMP connection
[1455196427] SERVICE ALERT: <hostname>;ping;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 100%
[1455196468] SERVICE ALERT: <hostname>;ping;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 100%

High severity TT cut
[1455196468] SERVICE NOTIFICATION: cut-ticket-sev2-criticial;<hostname>;ping;CRITICAL;cut-service-tt;PING CRITICAL - Packet loss = 100%
[1455196474] EXTERNAL COMMAND: ADD_SVC_COMMENT;<hostname>;ping;1;Nagios;<webb address to cut TT too>

Manual monitoring enabled
[1455196823] EXTERNAL COMMAND: ENABLE_HOST_SVC_NOTIFICATIONS;<hostname>
[1455196823] EXTERNAL COMMAND: ENABLE_HOST_NOTIFICATIONS;<hostname>

Scheduled downtime cancelled
[1455196848] HOST DOWNTIME ALERT: <hostname>;CANCELLED; Scheduled downtime for host has been cancelled.
[1455196848] SERVICE DOWNTIME ALERT: <hostname>;ping;CANCELLED; Scheduled downtime for service has been cancelled.
[1455196848] SERVICE DOWNTIME ALERT: <hostname>;Humidity-Check;CANCELLED; Scheduled downtime for service has been cancelled.
[1455196848] SERVICE DOWNTIME ALERT: <hostname>;Temperature-Check;CANCELLED; Scheduled downtime for service has been cancelled.

Device brought back online
[1455197028] SERVICE ALERT: <hostname>;Temperature-Check;OK;HARD;3;TEMPERATURE OK: 2 sensor(s) in normal scope
[1455197038] SERVICE ALERT: <hostname>;Humidity-Check;OK;HARD;3;HUMIDITY OK: 2 sensor(s) in normal scope
[1455197084] SERVICE ALERT: <hostname>;Temperature-Check;OK;HARD;3;TEMPERATURE OK: 2 sensor(s) in normal scope
[1455197111] SERVICE ALERT: <hostname>;Humidity-Check;OK;HARD;3;HUMIDITY OK: 2 sensor(s) in normal scope
[1455197241] SERVICE ALERT: <hostname>;ping;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 27.15 ms
[1455197282] SERVICE ALERT: <hostname>;ping;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 28.65 ms
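
For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch. It assumes only the `[epoch] TYPE: field;field;...` line layout visible in the excerpts above. The function names and the 60-second window are hypothetical choices made for illustration and are not part of Nagios. It flags any downtime CANCELLED alert that follows an ENABLE_*_NOTIFICATIONS command for the same host within the window.

```python
import re

# Matches the "[epoch] TYPE: payload" layout of the log lines quoted above.
LINE_RE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")


def parse_line(line):
    """Return (epoch, event_type, fields), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    return int(m.group(1)), m.group(2), m.group(3).split(";")


def cancelled_after_enable(lines, window=60):
    """Yield (enable_epoch, cancel_epoch, host) for each downtime CANCELLED
    alert that follows an ENABLE_*_NOTIFICATIONS command for the same host
    within `window` seconds. The window size is an arbitrary assumption."""
    last_enable = {}  # host -> epoch of the most recent enable command
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        epoch, kind, fields = parsed
        if kind == "EXTERNAL COMMAND" and len(fields) > 1:
            cmd = fields[0]
            if cmd.startswith("ENABLE_") and cmd.endswith("_NOTIFICATIONS"):
                last_enable[fields[1]] = epoch
        elif kind.endswith("DOWNTIME ALERT") and "CANCELLED" in fields:
            host = fields[0]
            if host in last_enable and 0 <= epoch - last_enable[host] <= window:
                yield last_enable[host], epoch, host


if __name__ == "__main__":
    # Lines copied from the test above.
    sample = [
        "[1455196823] EXTERNAL COMMAND: ENABLE_HOST_SVC_NOTIFICATIONS;<hostname>",
        "[1455196823] EXTERNAL COMMAND: ENABLE_HOST_NOTIFICATIONS;<hostname>",
        "[1455196848] HOST DOWNTIME ALERT: <hostname>;CANCELLED; Scheduled downtime for host has been cancelled.",
    ]
    for enabled, cancelled, host in cancelled_after_enable(sample):
        print(host, "enabled at", enabled, "downtime cancelled at", cancelled)
```

A match only shows that the two events happened close together. It does not show that one caused the other.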

Regards

Re: Issue with manual and scheduled suppression

Posted: Fri Mar 18, 2016 3:05 pm
by ssax
Looks like you found a bug, I'll lab it up on Monday and let you know if I can replicate it before we submit it.

What Linux distro and version are you running as well?

uname -a
Thank you

Re: Issue with manual and scheduled suppression

Posted: Wed Mar 30, 2016 7:31 am
by pubstars
Please see below:

Linux <hostname> 3.2.45-0.6.acc.624.45.283.<org>1acc.x86_64 #1 SMP Fri Nov 21 22:39:25 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Re: Issue with manual and scheduled suppression

Posted: Wed Mar 30, 2016 3:40 pm
by hsmith
What's the output of this command?

cat /etc/*release*

Re: Issue with manual and scheduled suppression

Posted: Fri Apr 08, 2016 5:21 am
by pubstars
Red Hat Enterprise Linux Server release 5.3 (Tikanga)
<company> Linux Bare Metal release 2012.03
cpe:/o:<company>:linux:2012.03:ga

Re: Issue with manual and scheduled suppression

Posted: Fri Apr 08, 2016 2:45 pm
by ssax
Do you still have the full log for that time period (you can check under your /usr/local/nagios/var/archive/)? We would like to take a look at it for reference.
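
If the archive is large, something like the following Python sketch could pull out just the relevant lines. It is an illustration under assumptions: the helper name `lines_in_window` is hypothetical, the `*.log` glob is a guess at how the archived files are named on your system, and the only thing it relies on from the log format is the leading `[epoch]` timestamp.

```python
import glob
import os
import re

ARCHIVE_DIR = "/usr/local/nagios/var/archive"  # directory mentioned above

STAMP_RE = re.compile(r"^\[(\d+)\] ")


def lines_in_window(paths, host, start, end):
    """Yield log lines that mention `host` and whose leading [epoch]
    timestamp falls within [start, end] (inclusive, epoch seconds)."""
    for path in paths:
        # errors="replace" keeps going if plugin output contains odd bytes.
        with open(path, errors="replace") as fh:
            for line in fh:
                m = STAMP_RE.match(line)
                if m and start <= int(m.group(1)) <= end and host in line:
                    yield line.rstrip("\n")


if __name__ == "__main__":
    # Assumption: archived logs end in ".log"; adjust the pattern if not.
    paths = sorted(glob.glob(os.path.join(ARCHIVE_DIR, "*.log")))
    # Window taken from the first and last timestamps in the posted test.
    for line in lines_in_window(paths, "<hostname>", 1455193232, 1455197282):
        print(line)
```

Replace `<hostname>` with the real host name before running it.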

Re: Issue with manual and scheduled suppression

Posted: Mon Apr 18, 2016 9:01 am
by pubstars
Hello,

No, we don't have the logs for that time period; they are flushed once a month.

Re: Issue with manual and scheduled suppression

Posted: Mon Apr 18, 2016 5:11 pm
by tmcdonald
Is this something you can replicate? Without the log it'll be hard to compare notes, so to speak. If you can replicate the behavior and share that log we can see about testing this on our end.

Re: Issue with manual and scheduled suppression

Posted: Wed May 11, 2016 9:48 am
by pubstars
I was able to replicate it, and the logs from that test are in my first post above. I will try to replicate it again over the next week.

Re: Issue with manual and scheduled suppression

Posted: Wed May 11, 2016 4:52 pm
by hsmith
Let us know what happens. Thanks!