recovery email being sent while host in recurring downtime

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
micdud
Posts: 7
Joined: Tue Nov 28, 2017 3:29 pm

recovery email being sent while host in recurring downtime

Post by micdud »

Hi,
I noticed this after upgrading to 5.5.x.

A host that is in recurring downtime goes down. All good so far: no email notification is sent. Then the machine comes back up while still in recurring downtime, and we get a recovery email.
How can I make this behavior stop?

Mike
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: recovery email being sent while host in recurring downti

Post by cdienger »

What version are you on now? 5.5.2 resolved some problems with recurring downtime. Upgrade to this version if the machine isn't already there and let us know if the behavior continues.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
micdud
Posts: 7
Joined: Tue Nov 28, 2017 3:29 pm

Re: recovery email being sent while host in recurring downti

Post by micdud »

Upgrading to 5.5.2 did fix a lot of recurring downtime issues, but not this one.
We are still getting a recovery email while the machine is in recurring downtime.
To be clear: the machine goes into recurring (scheduled) downtime. After that we reboot the machine and get no notification about it going down, so far so good. But when the machine comes back online while still in recurring/scheduled downtime, we get a Ping recovery notification, which should have been suppressed by the downtime.
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Re: recovery email being sent while host in recurring downti

Post by lmiltchev »

Can you show us the actual recovery email notification that you received?

Run the following commands and show the output:

Code:

/usr/local/nagios/bin/nagios -V
grep -i '<hostname>' /usr/local/nagios/var/nagios.log | perl -pe 's/(\d+)/localtime($1)/e'
where you substitute <hostname> with the actual hostname of the "problem" host.
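
If the full grep output is too noisy, the log can be narrowed to a single service first. The following is a minimal sketch, not a Nagios tool: `filter_service_events` is a hypothetical helper name, and the entry types and the `host;service;` field order are assumed from typical `nagios.log` lines.

```shell
# Hypothetical helper (not part of Nagios): read a Nagios log on stdin and keep
# only the downtime, alert, and notification entries for one host/service pair.
filter_service_events() {
    host="$1"
    service="$2"
    # First keep the three entry types, then match the literal "host;service;" pair.
    grep -E '(SERVICE DOWNTIME ALERT|SERVICE ALERT|SERVICE NOTIFICATION): ' |
        grep -F -- "${host};${service};"
}

# Example (log path as in the command above):
# filter_service_events '<hostname>' 'Ping' < /usr/local/nagios/var/nagios.log
```

The output can still be piped through the same perl expression to convert the timestamps.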
Be sure to check out our Knowledgebase for helpful articles and solutions!
micdud
Posts: 7
Joined: Tue Nov 28, 2017 3:29 pm

Re: recovery email being sent while host in recurring downti

Post by micdud »

I'm not sure why this host has Ping as a service instead of a host check, but either way it shouldn't alert.

output:
grep -i 'win_sql_server' /usr/local/nagios/var/nagios.log | perl -pe 's/(\d+)/localtime($1)/e'
[Fri Aug 10 00:00:00 2018] CURRENT HOST STATE: win_sql_server;UP;HARD;1;OK - 10.226.165.51: rta 122.255ms, lost 0%
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;CPU Usage 80/90;OK;HARD;1;CPU Load 0% (5 min average)
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;Drive C: Disk Usage 80/95;OK;HARD;1;C:\ - total: 39.66 Gb - used: 25.53 Gb (64%) - free 14.12 Gb (36%)
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;Drive E: Disk Usage 90/95;OK;HARD;1;E:\ - total: 40.00 Gb - used: 32.26 Gb (81%) - free 7.74 Gb (19%)
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;Memory Usage 90/95;OK;HARD;1;Memory usage: total:17262.92 MB - used: 14687.72 MB (85%) - free: 2575.21 MB (15%)
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;NSClient Status;OK;HARD;1;OK: All services are in their appropriate state.
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;Ping;OK;HARD;1;OK - 10.226.165.51: rta 122.237ms, lost 0%
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;SQL Core Services;OK;HARD;1;sqlserveragent: Started - mssqlserver: Started
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;Uptime;OK;HARD;1;System Uptime - 272 day(s) 5 hour(s) 15 minute(s)
[Fri Aug 10 13:14:59 2018] HOST DOWNTIME ALERT: win_sql_server;STARTED; Host has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;CPU Usage 80/90;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Drive C: Disk Usage 80/95;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Drive E: Disk Usage 90/95;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Memory Usage 90/95;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;NSClient Status;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Ping;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:15:00 2018] SERVICE DOWNTIME ALERT: win_sql_server;SQL Core Services;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:15:00 2018] SERVICE DOWNTIME ALERT: win_sql_server;Uptime;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 14:25:30 2018] SERVICE ALERT: win_sql_server;Ping;CRITICAL;SOFT;1;CRITICAL - 10.226.165.51: rta 650.163ms, lost 0%
[Fri Aug 10 14:27:30 2018] SERVICE ALERT: win_sql_server;Ping;CRITICAL;HARD;3;CRITICAL - 10.226.165.51: rta 672.770ms, lost 0%
[Fri Aug 10 14:32:24 2018] SERVICE NOTIFICATION: prod_sql;win_sql_server;Ping;OK;notify-service-by-email;OK - 10.226.165.51: rta 122.226ms, lost 0%
[Fri Aug 10 14:32:24 2018] SERVICE ALERT: win_sql_server;Ping;OK;HARD;1;OK - 10.226.165.51: rta 122.226ms, lost 0%


/usr/local/nagios/bin/nagios -V
Nagios Core 4.4.1
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2018-06-25
License: GPL

Website: https://www.nagios.org
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: recovery email being sent while host in recurring downti

Post by cdienger »

I'm currently working on reproducing this error and would appreciate it if you could PM me a system profile (Admin > System Config > System Profile > Download System Profile).
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Re: recovery email being sent while host in recurring downti

Post by lmiltchev »

Initially I thought we were talking about host notifications, but it seems that you are having issues with service notifications during scheduled downtime:
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Ping;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 14:32:24 2018] SERVICE NOTIFICATION: prod_sql;win_sql_server;Ping;OK;notify-service-by-email;OK - 10.226.165.51: rta 122.226ms, lost 0%
Having said that, we haven't been able to recreate the issue in house. We tested both fixed and flexible scheduled downtime, but no recovery notifications were sent during downtime. Was the Ping service in fixed or flexible downtime? It would be nice to know, so that we can do some more digging into this.
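
To see how often this happens, the log can be scanned for service notifications that were logged between a downtime start and its end. This is only a rough sketch under stated assumptions: `find_notifications_in_downtime` is a hypothetical name, the field order is taken from the log lines quoted above, any downtime state other than `STARTED` is treated as the end of the downtime, and notification types beginning with `DOWNTIME` are assumed to be downtime announcements rather than problem or recovery alerts and are skipped.

```shell
# Hypothetical helper: read nagios.log on stdin and print every service
# notification logged while that service's scheduled downtime was active.
find_notifications_in_downtime() {
    awk '
    {
        # Drop the leading "[timestamp] ", then split "TYPE: a;b;c" into type and fields.
        line = $0
        p = index(line, "] ")
        if (substr(line, 1, 1) == "[" && p > 0) line = substr(line, p + 2)
        sep = index(line, ": ")
        if (sep == 0) next
        type = substr(line, 1, sep - 1)
        n = split(substr(line, sep + 2), f, ";")
    }
    type == "SERVICE DOWNTIME ALERT" && n >= 3 {
        # Fields: host;service;state;comment
        key = f[1] ";" f[2]
        if (f[3] == "STARTED") in_dt[key] = 1
        else delete in_dt[key]   # assumption: any other state ends the downtime
        next
    }
    type == "SERVICE NOTIFICATION" && n >= 4 {
        # Fields: contact;host;service;type;command;output
        key = f[2] ";" f[3]
        if ((key in in_dt) && f[4] !~ /^DOWNTIME/) print
    }
    '
}

# Example:
# find_notifications_in_downtime < /usr/local/nagios/var/nagios.log
```

It works on the raw log as well as on output already passed through the perl timestamp conversion, since it only relies on the bracketed prefix.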

Also, to rule this out - can you check to see if you have multiple nagios processes running?

Code:

ps -ef | grep nagios.cfg | grep -v grep
Are recovery notifications during scheduled downtime a common occurrence for you, or is this a one-off?
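
When reading that output, more than one matching line does not necessarily mean more than one daemon, because a line whose parent PID is another nagios process is a child of that daemon. A rough way to count only the top-level daemons is sketched below. It assumes the standard `ps -ef` column layout (PPID in column 3, command starting in column 8) and that a daemon started with `-d` has been reparented to PID 1; `count_nagios_daemons` is a hypothetical name.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and count nagios processes
# that were started with nagios.cfg and whose parent is PID 1.
count_nagios_daemons() {
    awk '$3 == 1 && $8 ~ /\/nagios$/ && /nagios\.cfg/ { n++ } END { print n + 0 }'
}

# Example:
# ps -ef | count_nagios_daemons
```

A result greater than 1 would suggest that several independent daemons are running against the same configuration.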
micdud
Posts: 7
Joined: Tue Nov 28, 2017 3:29 pm

Re: recovery email being sent while host in recurring downti

Post by micdud »

I'm sorry, I was out yesterday. I'm working on the info you requested.
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Re: recovery email being sent while host in recurring downti

Post by lmiltchev »

Sure, send us the info whenever you are ready.
micdud
Posts: 7
Joined: Tue Nov 28, 2017 3:29 pm

Re: recovery email being sent while host in recurring downti

Post by micdud »

Code:

ps -ef | grep nagios.cfg | grep -v grep
nagios   24495     1  2 09:21 ?        00:04:10 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   24590 24495  0 09:21 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
This host sends a recovery email twice a day, every day. We are also having an issue with this host losing ping, but that is a different issue.
Locked