Duplicate alert emails.
-
- Posts: 147
- Joined: Thu Nov 07, 2013 11:44 am
Re: Duplicate alert emails.
I stopped Nagios and started it again. The command ps axu | grep nagios gives the following lines. Does this show the number of running processes? If so, could that be the reason for the duplicate alerts?
nagios 18280 0.0 0.0 14536 3656 ? Ss 15:18 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
nagios 18282 0.0 0.0 10000 912 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18283 0.0 0.0 10000 920 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18284 0.0 0.0 10000 908 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18285 0.0 0.0 10000 904 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18286 0.0 0.0 13912 828 ? S 15:18 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
root 18301 0.0 0.0 103244 832 pts/19 S+ 15:18 0:00 grep nagios
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Duplicate alert emails.
The date/time on these emails is also different.
Do you have check_freshness enabled on the host?
Can you at least send us all the relevant configs and templates for this host obfuscating any sensitive information?
-
- Posts: 147
- Joined: Thu Nov 07, 2013 11:44 am
Re: Duplicate alert emails.
I didn't change the check_freshness property, so I think it should be at its default. I'm using the generic-host and generic-service templates.
Code:
# TEMPLATES.CFG - SAMPLE OBJECT TEMPLATES
#
#
# NOTES: This config file provides you with some example object definition
# templates that are referred to by other host, service, contact, etc.
# definitions in other config files.
#
# You don't need to keep these definitions in a separate file from your
# other object definitions. This has been done just to make things
# easier to understand.
#
###############################################################################
###############################################################################
###############################################################################
#
# CONTACT TEMPLATES
#
###############################################################################
###############################################################################
# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
name generic-contact ; The name of this contact template
service_notification_period office_Time_Period ; service notifications can be sent anytime
host_notification_period office_Time_Period ; host notifications can be sent anytime
service_notification_options w,u,c,r,f,s ; send notifications for all service states, flapping events, and scheduled downtime events
host_notification_options d,u,r,f,s ; send notifications for all host states, flapping events, and scheduled downtime events
service_notification_commands notify-service-by-email ; send service notifications via email
host_notification_commands notify-host-by-email ; send host notifications via email
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
}
###############################################################################
###############################################################################
#
# HOST TEMPLATES
#
###############################################################################
###############################################################################
# Generic host definition template - This is NOT a real host, just a template!
define host{
name generic-host ; The name of this host template
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_period 24x7 ; Send host notifications at any time
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
# Linux host definition template - This is NOT a real host, just a template!
define host{
name linux-server ; The name of this host template
use generic-host ; This template inherits other values from the generic-host template
check_period 24x7 ; By default, Linux hosts are checked round the clock
check_interval 5 ; Actively check the host every 5 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each Linux host 10 times (max)
check_command check-host-alive ; Default command to check Linux hosts
notification_period workhours ; Linux admins hate to be woken up, so we only notify during the day
; Note that the notification_period variable is being overridden from
; the value that is inherited from the generic-host template!
notification_interval 120 ; Resend notifications every 2 hours
notification_options d,u,r ; Only send notifications for specific host states
contact_groups admins ; Notifications get sent to the admins by default
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
# Windows host definition template - This is NOT a real host, just a template!
define host{
name windows-server ; The name of this host template
use generic-host ; Inherit default values from the generic-host template
check_period 24x7 ; By default, Windows servers are monitored round the clock
check_interval 5 ; Actively check the server every 5 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each server 10 times (max)
check_command check-host-alive ; Default command to check if servers are "alive"
notification_period 24x7 ; Send notification out at any time - day or night
notification_interval 30 ; Resend notifications every 30 minutes
notification_options d,r ; Only send notifications for specific host states
contact_groups admins ; Notifications get sent to the admins by default
hostgroups windows-servers ; Host groups that Windows servers should be a member of
register 0 ; DONT REGISTER THIS - ITS JUST A TEMPLATE
}
# We define a generic printer template that can be used for most printers we monitor
define host{
name generic-printer ; The name of this host template
use generic-host ; Inherit default values from the generic-host template
check_period 24x7 ; By default, printers are monitored round the clock
check_interval 5 ; Actively check the printer every 5 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each printer 10 times (max)
check_command check-host-alive ; Default command to check if printers are "alive"
notification_period workhours ; Printers are only used during the workday
notification_interval 30 ; Resend notifications every 30 minutes
notification_options d,r ; Only send notifications for specific host states
contact_groups admins ; Notifications get sent to the admins by default
register 0 ; DONT REGISTER THIS - ITS JUST A TEMPLATE
}
# Define a template for switches that we can reuse
define host{
name generic-switch ; The name of this host template
use generic-host ; Inherit default values from the generic-host template
check_period 24x7 ; By default, switches are monitored round the clock
check_interval 5 ; Switches are checked every 5 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each switch 10 times (max)
check_command check-host-alive ; Default command to check if routers are "alive"
notification_period 24x7 ; Send notifications at any time
notification_interval 30 ; Resend notifications every 30 minutes
notification_options d,r ; Only send notifications for specific host states
contact_groups admins ; Notifications get sent to the admins by default
register 0 ; DONT REGISTER THIS - ITS JUST A TEMPLATE
}
###############################################################################
###############################################################################
#
# SERVICE TEMPLATES
#
###############################################################################
###############################################################################
# Generic service definition template - This is NOT a real service, just a template!
define service{
name generic-service ; The 'name' of this service template
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
is_volatile 0 ; The service is not volatile
check_period office_Time_Period ; The service can be checked at any time of the day
max_check_attempts 3 ; Re-check the service up to 3 times in order to determine its final (hard) state
normal_check_interval 5 ; Check the service every 5 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every minute until a hard state can be determined
contact_groups admins ; Notifications get sent out to everyone in the 'admins' group
notification_options w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
notification_interval 60 ; Re-notify about service problems every hour
notification_period office_Time_Period ; Notifications can be sent out at any time
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
# Local service definition template - This is NOT a real service, just a template!
define service{
name local-service ; The name of this service template
use generic-service ; Inherit default values from the generic-service definition
max_check_attempts 4 ; Re-check the service up to 4 times in order to determine its final (hard) state
normal_check_interval 5 ; Check the service every 5 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every minute until a hard state can be determined
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
-
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: Duplicate alert emails.
Well, in addition to any config issues, it seems that you have duplicate nagios daemons running. The workers are expected, but the -ud */nagios.cfg ones are likely duplicate processes that are not related. Let's do:
Code:
service nagios stop
killall -9 nagios
service nagios start
ps -ef | grep bin/nagios
ps aux | grep bin/nagios
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
-
- Posts: 147
- Joined: Thu Nov 07, 2013 11:44 am
Re: Duplicate alert emails.
Executed those commands.
Here is the output. Please let me know if this worked as expected.
-bash-4.1# service nagios stop
nagios (pid 18629 18628 18627 18626 18625 18623) is running...
Stopping nagios: [ OK ]
-bash-4.1# killall -9 nagios
nagios: no process killed
-bash-4.1# service nagios start
nagios is stopped
Starting nagios: [ OK ]
-bash-4.1# ps -ef | grep bin/nagios
nagios 31463 1 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
nagios 31465 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31466 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31467 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31468 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31469 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
root 31473 31382 0 14:39 pts/0 00:00:00 grep bin/nagios
-bash-4.1#
-bash-4.1# ps aux | grep bin/nagios
nagios 31463 0.0 0.0 14756 3992 ? Ss 14:39 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
nagios 31465 0.0 0.0 10084 1072 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31466 0.0 0.0 10000 924 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31467 0.0 0.0 10000 916 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31468 0.0 0.0 10000 916 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31469 0.0 0.0 13812 736 ? S 14:39 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
root 31503 0.0 0.0 103244 828 pts/0 S+ 14:41 0:00 grep bin/nagios
-bash-4.1#
-
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: Duplicate alert emails.
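My mistake there: I forgot that ps aux does not show the parent PID, only the process PID. That threw me off, and the ps -ef output cleared it up: the second -ud process (31469) has 31463 as its parent, so it is a child of the main daemon rather than a separate, unrelated instance. We are going to look at those templates/configs for a little bit here and discuss.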
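To make the parent/child distinction easier to check at a glance, here is a rough sketch that counts only the top-level daemons in ps -ef output. The function name count_nagios_masters is made up for illustration, and it assumes a master daemon is a -d/-ud process whose parent PID (third column) is 1, which matches the output posted above but may not hold on every init system.

```shell
# Hypothetical helper: count nagios master daemons in `ps -ef`-style
# output read from stdin.
# Assumption: a master is a "-d" or "-ud" nagios process whose parent PID
# (column 3) is 1; a forked child has the master's PID as its parent.
count_nagios_masters() {
    awk '$3 == 1 && /bin\/nagios -u?d / { n++ } END { print n + 0 }'
}

# Usage on a live system (install path as shown in this thread):
#   ps -ef | count_nagios_masters
```

A result of 1 would suggest a single running instance; anything higher would point to genuinely separate daemons, each of which could send its own notifications.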
-
- Posts: 147
- Joined: Thu Nov 07, 2013 11:44 am
Re: Duplicate alert emails.
I don't understand. Could you explain it briefly?
-
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: Duplicate alert emails.
Can you also share your 'admins' contact definition?
-
- Posts: 147
- Joined: Thu Nov 07, 2013 11:44 am
Re: Duplicate alert emails.
Now it's working fine and sending an alert every hour. The duplicates were due to multiple Nagios instances. I killed all instances and restarted Nagios so that only one instance is running.