Duplicate alert emails.

vinothsethuram
Posts: 147
Joined: Thu Nov 07, 2013 11:44 am

Re: Duplicate alert emails.

Post by vinothsethuram »

I stopped Nagios and started it again. The command ps axu | grep nagios gives the following lines. Does this show the number of running processes? If so, could that be the reason for the duplicate alerts?




nagios 18280 0.0 0.0 14536 3656 ? Ss 15:18 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
nagios 18282 0.0 0.0 10000 912 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18283 0.0 0.0 10000 920 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18284 0.0 0.0 10000 908 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18285 0.0 0.0 10000 904 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18286 0.0 0.0 13912 828 ? S 15:18 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
root 18301 0.0 0.0 103244 832 pts/19 S+ 15:18 0:00 grep nagios
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Duplicate alert emails.

Post by scottwilkerson »

The date and time on these emails are also different.

Do you have check_freshness enabled on the host?

Can you at least send us all the relevant configs and templates for this host, obfuscating any sensitive information?
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com
vinothsethuram
Posts: 147
Joined: Thu Nov 07, 2013 11:44 am

Re: Duplicate alert emails.

Post by vinothsethuram »

I didn't change the check_freshness property, so I think it should still be at its default. I'm using the generic-host and generic-service templates.


# TEMPLATES.CFG - SAMPLE OBJECT TEMPLATES
#
#
# NOTES: This config file provides you with some example object definition
#        templates that are referred to by other host, service, contact, etc.
#        definitions in other config files.
#
#        You don't need to keep these definitions in a separate file from your
#        other object definitions.  This has been done just to make things
#        easier to understand.
#
###############################################################################



###############################################################################
###############################################################################
#
# CONTACT TEMPLATES
#
###############################################################################
###############################################################################

# Generic contact definition template - This is NOT a real contact, just a template!

define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     office_Time _Period                    ; service notifications can be sent anytime
        host_notification_period        office_Time _Period                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }




###############################################################################
###############################################################################
#
# HOST TEMPLATES
#
###############################################################################
###############################################################################

# Generic host definition template - This is NOT a real host, just a template!

define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!

define host{
        name                            linux-server    ; The name of this host template
        use                             generic-host    ; This template inherits other values from the generic-host template
        check_period                    24x7            ; By default, Linux hosts are checked round the clock
        check_interval                  5               ; Actively check the host every 5 minutes
        retry_interval                  1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10              ; Check each Linux host 10 times (max)
        check_command                   check-host-alive ; Default command to check Linux hosts
        notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval           120             ; Resend notifications every 2 hours
        notification_options            d,u,r           ; Only send notifications for specific host states
        contact_groups                  admins          ; Notifications get sent to the admins by default
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }



# Windows host definition template - This is NOT a real host, just a template!

define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive        ; Default command to check if servers are "alive"
        notification_period     24x7            ; Send notification out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        hostgroups              windows-servers ; Host groups that Windows servers should be a member of
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }

# We define a generic printer template that can be used for most printers we monitor

define host{
        name                    generic-printer ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, printers are monitored round the clock
        check_interval          5               ; Actively check the printer every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each printer 10 times (max)
        check_command           check-host-alive        ; Default command to check if printers are "alive"
        notification_period     workhours               ; Printers are only used during the workday
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }


# Define a template for switches that we can reuse
define host{
        name                    generic-switch  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, switches are monitored round the clock
        check_interval          5               ; Switches are checked every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each switch 10 times (max)
        check_command           check-host-alive        ; Default command to check if routers are "alive"
        notification_period     24x7            ; Send notifications at any time
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }




###############################################################################
###############################################################################
#
# SERVICE TEMPLATES
#
###############################################################################
###############################################################################

# Generic service definition template - This is NOT a real service, just a template!

define service{
        name                            generic-service         ; The 'name' of this service template
        active_checks_enabled           1                       ; Active service checks are enabled
        passive_checks_enabled          1                       ; Passive service checks are enabled/accepted
        parallelize_check               1                       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1                       ; We should obsess over this service (if necessary)
        check_freshness                 0                       ; Default is to NOT check service 'freshness'
        notifications_enabled           1                       ; Service notifications are enabled
        event_handler_enabled           1                       ; Service event handler is enabled
        flap_detection_enabled          1                       ; Flap detection is enabled
        process_perf_data               1                       ; Process performance data
        retain_status_information       1                       ; Retain status information across program restarts
        retain_nonstatus_information    1                       ; Retain non-status information across program restarts
        is_volatile                     0                       ; The service is not volatile
        check_period                    office_Time _Period    ; The service can be checked at any time of the day
        max_check_attempts              3                       ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           5                       ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1                       ; Re-check the service every minute until a hard state can be determined
        contact_groups                  admins                  ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r                 ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60                      ; Re-notify about service problems every hour
        notification_period             office_Time _Period    ; Notifications can be sent out at any time
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }


# Local service definition template - This is NOT a real service, just a template!

define service{
        name                            local-service           ; The name of this service template
        use                             generic-service         ; Inherit default values from the generic-service definition
        max_check_attempts              4                       ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5                       ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1                       ; Re-check the service every minute until a hard state can be determined
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Duplicate alert emails.

Post by sreinhardt »

Well, in addition to any config issues, it seems that you have duplicate nagios daemons running. The workers are expected, but the two -ud */nagios.cfg processes are likely unrelated duplicates. Let's do:


service nagios stop
killall -9 nagios
service nagios start
ps -ef | grep bin/nagios
ps aux | grep bin/nagios
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
vinothsethuram
Posts: 147
Joined: Thu Nov 07, 2013 11:44 am

Re: Duplicate alert emails.

Post by vinothsethuram »

Executed those commands.


Here is the output. Please let me know if this worked as expected.


-bash-4.1# service nagios stop
nagios (pid 18629 18628 18627 18626 18625 18623) is running...
Stopping nagios: [ OK ]
-bash-4.1# killall -9 nagios
nagios: no process killed
-bash-4.1# service nagios start
nagios is stopped
Starting nagios: [ OK ]
-bash-4.1# ps -ef | grep bin/nagios
nagios 31463 1 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
nagios 31465 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31466 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31467 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31468 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31469 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
root 31473 31382 0 14:39 pts/0 00:00:00 grep bin/nagios
-bash-4.1#
-bash-4.1# ps aux | grep bin/nagios
nagios 31463 0.0 0.0 14756 3992 ? Ss 14:39 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
nagios 31465 0.0 0.0 10084 1072 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31466 0.0 0.0 10000 924 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31467 0.0 0.0 10000 916 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31468 0.0 0.0 10000 916 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31469 0.0 0.0 13812 736 ? S 14:39 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
root 31503 0.0 0.0 103244 828 pts/0 S+ 14:41 0:00 grep bin/nagios
-bash-4.1#
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Duplicate alert emails.

Post by sreinhardt »

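My mistake there. I forgot that ps aux shows only each process's own PID, not its parent PID, and that threw me off. The ps -ef output cleared that up: the second -ud process (PID 31469) has 31463 as its parent, so it is a child of the main daemon rather than a separate one. We are going to look at those templates and configs for a bit and discuss.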
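
For anyone reading along, the parent/child check described above can be sketched in a few lines of shell. This is only an illustration, not an official Nagios tool: the function name count_top_level_nagios is made up here, and it assumes the usual ps -ef column order (UID, PID, PPID, ...) and that the binary path contains bin/nagios. It counts the Nagios processes whose parent is not itself a Nagios process; a separately started duplicate daemon would normally show up as a second one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count Nagios processes that are NOT children of
# another Nagios process. Reads `ps -ef`-style lines on stdin
# (assumed columns: UID PID PPID C STIME TTY TIME CMD).
count_top_level_nagios() {
    awk '
        /bin\/nagios/ && !/grep/ {
            pid[$2]  = 1     # remember every nagios PID
            ppid[$2] = $3    # and the PID of its parent
        }
        END {
            n = 0
            for (p in pid) {
                # top-level: the parent is not one of the nagios PIDs
                if (!(ppid[p] in pid)) n++
            }
            print n
        }
    '
}

# Sample data copied from the ps -ef output earlier in this thread.
sample='nagios 31463 1 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
nagios 31465 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31466 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31467 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31468 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31469 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
root 31473 31382 0 14:39 pts/0 00:00:00 grep bin/nagios'

printf '%s\n' "$sample" | count_top_level_nagios
```

On a live system the same function could be fed with ps -ef | count_top_level_nagios. For the sample above it prints 1, because every process other than 31463 lists 31463 as its parent.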
vinothsethuram
Posts: 147
Joined: Thu Nov 07, 2013 11:44 am

Re: Duplicate alert emails.

Post by vinothsethuram »

I don't understand. Could you explain briefly?
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Duplicate alert emails.

Post by slansing »

Can you also share your 'admins' contact definition?
vinothsethuram
Posts: 147
Joined: Thu Nov 07, 2013 11:44 am

Re: Duplicate alert emails.

Post by vinothsethuram »

It's working fine now and sends an alert once every hour. The duplicates were due to multiple Nagios instances. I killed all of the instances and restarted Nagios so that only one instance is running.
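
The one-hour spacing matches notification_interval 60 in the generic-service template posted earlier. A rough sketch of the arithmetic follows, assuming interval_length in nagios.cfg is left at its documented default of 60 seconds (the thread does not show that file, so this is an assumption). The helper name interval_to_seconds is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a Nagios interval, given in "time units",
# to seconds.
#   $1 = interval value (e.g. notification_interval from a template)
#   $2 = interval_length in seconds (assumed to be 60 if not given)
interval_to_seconds() {
    local units=$1
    local interval_length=${2:-60}
    echo $(( units * interval_length ))
}

interval_to_seconds 60   # notification_interval in generic-service
interval_to_seconds 5    # normal_check_interval
interval_to_seconds 1    # retry_check_interval
```

With those assumptions, a single Nagios instance re-sends a problem notification every 3600 seconds, so one email per hour per unresolved problem is the expected behaviour for this template.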