recovery email being sent while host in recurring downtime
Hi,
I noticed this after upgrading to 5.5.x.
A host that is in recurring downtime goes down. All good, no email notification. Then the machine comes back up while still in recurring downtime, and we get a recovery email.
How can I make this behavior stop?
Mike
Re: recovery email being sent while host in recurring downti
What version are you on now? 5.5.2 resolved some problems with recurring downtime. Upgrade to this version if the machine isn't already there and let us know if the behavior continues.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: recovery email being sent while host in recurring downti
Upgrading to 5.5.2 did fix a lot of recurring downtime issues, but not this one.
We are getting a recovery email while the machine is still in recurring downtime.
To be clear: the machine goes into recurring downtime (scheduled downtime). After that we reboot the machine (no notification about the machine going down), so far so good. After the machine comes back online, while it is still in recurring/scheduled downtime, we get a ping recovery notification, which should have been suppressed due to the recurring/scheduled downtime.
Re: recovery email being sent while host in recurring downti
Can you show us the actual recovery email notification that you received?
Run the following commands and show the output, substituting <hostname> with the actual hostname of the "problem" host:
/usr/local/nagios/bin/nagios -V
grep -i '<hostname>' /usr/local/nagios/var/nagios.log | perl -pe 's/(\d+)/localtime($1)/e'
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: recovery email being sent while host in recurring downti
I'm not sure why this host has ping as a service instead of a host check, but either way it shouldn't alert.
output:
grep -i 'win_sql_server' /usr/local/nagios/var/nagios.log | perl -pe 's/(\d+)/localtime($1)/e'
[Fri Aug 10 00:00:00 2018] CURRENT HOST STATE: win_sql_server;UP;HARD;1;OK - 10.226.165.51: rta 122.255ms, lost 0%
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;CPU Usage 80/90;OK;HARD;1;CPU Load 0% (5 min average)
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;Drive C: Disk Usage 80/95;OK;HARD;1;C:\ - total: 39.66 Gb - used: 25.53 Gb (64%) - free 14.12 Gb (36%)
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;Drive E: Disk Usage 90/95;OK;HARD;1;E:\ - total: 40.00 Gb - used: 32.26 Gb (81%) - free 7.74 Gb (19%)
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;Memory Usage 90/95;OK;HARD;1;Memory usage: total:17262.92 MB - used: 14687.72 MB (85%) - free: 2575.21 MB (15%)
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;NSClient Status;OK;HARD;1;OK: All services are in their appropriate state.
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;Ping;OK;HARD;1;OK - 10.226.165.51: rta 122.237ms, lost 0%
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;SQL Core Services;OK;HARD;1;sqlserveragent: Started - mssqlserver: Started
[Fri Aug 10 00:00:00 2018] CURRENT SERVICE STATE: win_sql_server;Uptime;OK;HARD;1;System Uptime - 272 day(s) 5 hour(s) 15 minute(s)
[Fri Aug 10 13:14:59 2018] HOST DOWNTIME ALERT: win_sql_server;STARTED; Host has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;CPU Usage 80/90;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Drive C: Disk Usage 80/95;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Drive E: Disk Usage 90/95;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Memory Usage 90/95;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;NSClient Status;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Ping;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:15:00 2018] SERVICE DOWNTIME ALERT: win_sql_server;SQL Core Services;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 13:15:00 2018] SERVICE DOWNTIME ALERT: win_sql_server;Uptime;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 14:25:30 2018] SERVICE ALERT: win_sql_server;Ping;CRITICAL;SOFT;1;CRITICAL - 10.226.165.51: rta 650.163ms, lost 0%
[Fri Aug 10 14:27:30 2018] SERVICE ALERT: win_sql_server;Ping;CRITICAL;HARD;3;CRITICAL - 10.226.165.51: rta 672.770ms, lost 0%
[Fri Aug 10 14:32:24 2018] SERVICE NOTIFICATION: prod_sql;win_sql_server;Ping;OK;notify-service-by-email;OK - 10.226.165.51: rta 122.226ms, lost 0%
[Fri Aug 10 14:32:24 2018] SERVICE ALERT: win_sql_server;Ping;OK;HARD;1;OK - 10.226.165.51: rta 122.226ms, lost 0%
/usr/local/nagios/bin/nagios -V
Nagios Core 4.4.1
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2018-06-25
License: GPL
Website: https://www.nagios.org
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
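For anyone chasing the same symptom, a log excerpt like the one above can be scanned for notifications that were logged while a service was still in scheduled downtime. The following is only a minimal sketch, not part of Nagios: the function name `notifications_during_downtime` is hypothetical, it assumes the timestamps have already been converted to local time by the perl one-liner, and it assumes that legitimate downtime start/end notifications carry a state field beginning with `DOWNTIME` so they can be skipped.

```python
import re
from datetime import datetime

# Matches "[<timestamp>] <ENTRY TYPE>: <semicolon-separated fields>"
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] (?P<kind>[A-Z ]+): (?P<rest>.*)$")
# Timestamp layout produced by perl's localtime(), as seen in the excerpt
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def notifications_during_downtime(lines):
    """Return (time, host, service, state) for each service notification
    logged while that service was inside a scheduled downtime window."""
    active = set()  # (host, service) pairs currently in downtime
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        kind = m.group("kind")
        fields = m.group("rest").split(";")
        if kind == "SERVICE DOWNTIME ALERT" and len(fields) >= 3:
            key = (fields[0], fields[1])
            if fields[2] == "STARTED":
                active.add(key)
            else:  # any other status is treated as the end of the window
                active.discard(key)
        elif kind == "SERVICE NOTIFICATION" and len(fields) >= 4:
            _contact, host, service, state = fields[:4]
            # Assumption: downtime start/end notices have a state field
            # starting with "DOWNTIME" and are expected, so skip them.
            if (host, service) in active and not state.startswith("DOWNTIME"):
                when = datetime.strptime(m.group("ts"), TS_FORMAT)
                hits.append((when, host, service, state))
    return hits


SAMPLE = [
    "[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Ping;STARTED; Service has entered a period of scheduled downtime",
    "[Fri Aug 10 14:27:30 2018] SERVICE ALERT: win_sql_server;Ping;CRITICAL;HARD;3;CRITICAL - 10.226.165.51: rta 672.770ms, lost 0%",
    "[Fri Aug 10 14:32:24 2018] SERVICE NOTIFICATION: prod_sql;win_sql_server;Ping;OK;notify-service-by-email;OK - 10.226.165.51: rta 122.226ms, lost 0%",
]

for when, host, service, state in notifications_during_downtime(SAMPLE):
    print(f"{when:%Y-%m-%d %H:%M:%S} {host} {service} {state}")
```

Run against the excerpt in this post, the sketch would flag the 14:32:24 Ping recovery, since no end-of-downtime entry for that service appears before it.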
Re: recovery email being sent while host in recurring downti
I'm currently working on reproducing this error and would appreciate it if you could PM me a system profile (Admin > System Config > System Profile > Download System Profile).
Re: recovery email being sent while host in recurring downti
Initially I thought we were talking about host notifications, but it seems that you are having issues with service notifications during scheduled downtime:
[Fri Aug 10 13:14:59 2018] SERVICE DOWNTIME ALERT: win_sql_server;Ping;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 10 14:32:24 2018] SERVICE NOTIFICATION: prod_sql;win_sql_server;Ping;OK;notify-service-by-email;OK - 10.226.165.51: rta 122.226ms, lost 0%
Are recovery notifications during scheduled downtime a common occurrence for you, or is this a one-off?
Having said that, we haven't been able to recreate the issue in house. We tested both fixed and flexible scheduled downtime, but no recovery notifications were sent during downtime. Was the Ping service in fixed or flexible downtime? It would be nice to know, so that we can do some more digging into this.
Also, to rule this out, can you check whether you have multiple nagios processes running?
ps -ef | grep nagios.cfg | grep -v grep
Re: recovery email being sent while host in recurring downti
I'm sorry, I was out yesterday. I'm working on the info you requested.
Re: recovery email being sent while host in recurring downti
Sure, send us the info whenever you are ready.
Re: recovery email being sent while host in recurring downti
ps -ef | grep nagios.cfg | grep -v grep
nagios 24495 1 2 09:21 ? 00:04:10 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 24590 24495 0 09:21 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
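One way to read `ps -ef` output like this is to compare the PID and PPID columns (second and third fields): a process whose parent is itself one of the listed nagios processes is a child of that daemon, not a second independent instance. The sketch below illustrates that check. The function name `count_independent_daemons` is hypothetical, and it assumes the standard `ps -ef` column order (UID, PID, PPID, C, STIME, TTY, TIME, CMD).

```python
def count_independent_daemons(ps_lines):
    """Count listed processes whose parent is not another listed process."""
    procs = []
    for line in ps_lines:
        parts = line.split(None, 7)  # CMD may contain spaces, keep it whole
        # Skip header rows or anything without numeric PID/PPID columns
        if len(parts) < 8 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue
        procs.append((int(parts[1]), int(parts[2])))
    pids = {pid for pid, _ in procs}
    return sum(1 for _, ppid in procs if ppid not in pids)


PS_OUTPUT = [
    "nagios 24495 1 2 09:21 ? 00:04:10 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg",
    "nagios 24590 24495 0 09:21 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg",
]

print(count_independent_daemons(PS_OUTPUT))
```

In the output above, PID 24590 has parent 24495, so the sketch counts a single top-level daemon, which would suggest one instance with a child process rather than two competing instances.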