
Re: Duplicate alert emails.

Posted: Thu Dec 05, 2013 3:21 pm
by vinothsethuram
I stopped nagios and started it again. The command ps axu | grep nagios gives the following lines. Does this show the number of running processes? If so, could it be the reason for the duplicate alerts?




nagios 18280 0.0 0.0 14536 3656 ? Ss 15:18 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
nagios 18282 0.0 0.0 10000 912 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18283 0.0 0.0 10000 920 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18284 0.0 0.0 10000 908 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18285 0.0 0.0 10000 904 ? S 15:18 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 18286 0.0 0.0 13912 828 ? S 15:18 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
root 18301 0.0 0.0 103244 832 pts/19 S+ 15:18 0:00 grep nagios

Re: Duplicate alert emails.

Posted: Fri Dec 06, 2013 9:28 am
by scottwilkerson
The Date/Time on these emails are different also.

Do you have check_freshness enabled on the host?

Can you at least send us all the relevant configs and templates for this host obfuscating any sensitive information?
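
One way to answer the check_freshness question is to search the object configs for any definition that turns it on. A rough sketch, assuming the configs live somewhere under /usr/local/nagios/etc (the directory that holds nagios.cfg in the ps output above); find_enabled_freshness is only an illustrative helper name, not a Nagios tool:

```shell
# find_enabled_freshness DIR
# Recursively searches DIR for object definitions that set
# "check_freshness 1" and prints the matching file names and line numbers.
# Prints nothing (and returns non-zero) when no definition enables it.
find_enabled_freshness() {
    grep -rnE '^[[:space:]]*check_freshness[[:space:]]+1' "$1"
}

# Intended use on the monitoring server (not executed here):
#   find_enabled_freshness /usr/local/nagios/etc
```

No output would mean that no host or service definition in that directory enables freshness checking explicitly.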

Re: Duplicate alert emails.

Posted: Fri Dec 06, 2013 11:49 am
by vinothsethuram
I didn't change the check_freshness property, so I think it should be at its default. I'm using the generic-host and generic-service templates.

Code:

# TEMPLATES.CFG - SAMPLE OBJECT TEMPLATES
#
#
# NOTES: This config file provides you with some example object definition
#        templates that are referred to by other host, service, contact, etc.
#        definitions in other config files.
#
#        You don't need to keep these definitions in a separate file from your
#        other object definitions.  This has been done just to make things
#        easier to understand.
#
###############################################################################



###############################################################################
###############################################################################
#
# CONTACT TEMPLATES
#
###############################################################################
###############################################################################

# Generic contact definition template - This is NOT a real contact, just a template!

define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     office_Time _Period                    ; service notifications can be sent anytime
        host_notification_period        office_Time _Period                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }




###############################################################################
###############################################################################
#
# HOST TEMPLATES
#
###############################################################################
###############################################################################

# Generic host definition template - This is NOT a real host, just a template!

define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!

define host{
        name                            linux-server    ; The name of this host template
        use                             generic-host    ; This template inherits other values from the generic-host template
        check_period                    24x7            ; By default, Linux hosts are checked round the clock
        check_interval                  5               ; Actively check the host every 5 minutes
        retry_interval                  1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10              ; Check each Linux host 10 times (max)
        check_command                   check-host-alive ; Default command to check Linux hosts
        notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval           120             ; Resend notifications every 2 hours
        notification_options            d,u,r           ; Only send notifications for specific host states
        contact_groups                  admins          ; Notifications get sent to the admins by default
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }



# Windows host definition template - This is NOT a real host, just a template!

define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive        ; Default command to check if servers are "alive"
        notification_period     24x7            ; Send notification out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        hostgroups              windows-servers ; Host groups that Windows servers should be a member of
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }

# We define a generic printer template that can be used for most printers we monitor

define host{
        name                    generic-printer ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, printers are monitored round the clock
        check_interval          5               ; Actively check the printer every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each printer 10 times (max)
        check_command           check-host-alive        ; Default command to check if printers are "alive"
        notification_period     workhours               ; Printers are only used during the workday
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }


# Define a template for switches that we can reuse
define host{
        name                    generic-switch  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, switches are monitored round the clock
        check_interval          5               ; Switches are checked every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each switch 10 times (max)
        check_command           check-host-alive        ; Default command to check if routers are "alive"
        notification_period     24x7            ; Send notifications at any time
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }




###############################################################################
###############################################################################
#
# SERVICE TEMPLATES
#
###############################################################################
###############################################################################

# Generic service definition template - This is NOT a real service, just a template!

define service{
        name                            generic-service         ; The 'name' of this service template
        active_checks_enabled           1                       ; Active service checks are enabled
        passive_checks_enabled          1                       ; Passive service checks are enabled/accepted
        parallelize_check               1                       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1                       ; We should obsess over this service (if necessary)
        check_freshness                 0                       ; Default is to NOT check service 'freshness'
        notifications_enabled           1                       ; Service notifications are enabled
        event_handler_enabled           1                       ; Service event handler is enabled
        flap_detection_enabled          1                       ; Flap detection is enabled
        process_perf_data               1                       ; Process performance data
        retain_status_information       1                       ; Retain status information across program restarts
        retain_nonstatus_information    1                       ; Retain non-status information across program restarts
        is_volatile                     0                       ; The service is not volatile
        check_period                    office_Time _Period    ; The service can be checked at any time of the day
        max_check_attempts              3                       ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           5                       ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1                       ; Re-check the service every minute until a hard state can be determined
        contact_groups                  admins                  ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r                 ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60                      ; Re-notify about service problems every hour
        notification_period             office_Time _Period    ; Notifications can be sent out at any time
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }


# Local service definition template - This is NOT a real service, just a template!

define service{
        name                            local-service           ; The name of this service template
        use                             generic-service         ; Inherit default values from the generic-service definition
        max_check_attempts              4                       ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5                       ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1                       ; Re-check the service every minute until a hard state can be determined
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

Re: Duplicate alert emails.

Posted: Fri Dec 06, 2013 1:33 pm
by sreinhardt
Well, in addition to any config issues, it looks like you have duplicate nagios daemons running. The --worker processes are expected, but the two "-ud */nagios.cfg" processes are likely separate, unrelated daemons. Let's run:

Code:

service nagios stop
killall -9 nagios
service nagios start
ps -ef | grep bin/nagios
ps aux | grep bin/nagios

Re: Duplicate alert emails.

Posted: Fri Dec 06, 2013 2:43 pm
by vinothsethuram
Executed those commands.


Here is the output. Please let me know if this worked as expected.


-bash-4.1# service nagios stop
nagios (pid 18629 18628 18627 18626 18625 18623) is running...
Stopping nagios: [ OK ]
-bash-4.1# killall -9 nagios
nagios: no process killed
-bash-4.1# service nagios start
nagios is stopped
Starting nagios: [ OK ]
-bash-4.1# ps -ef | grep bin/nagios
nagios 31463 1 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
nagios 31465 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31466 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31467 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31468 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31469 31463 0 14:39 ? 00:00:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
root 31473 31382 0 14:39 pts/0 00:00:00 grep bin/nagios
-bash-4.1#
-bash-4.1# ps aux | grep bin/nagios
nagios 31463 0.0 0.0 14756 3992 ? Ss 14:39 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
nagios 31465 0.0 0.0 10084 1072 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31466 0.0 0.0 10000 924 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31467 0.0 0.0 10000 916 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31468 0.0 0.0 10000 916 ? S 14:39 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 31469 0.0 0.0 13812 736 ? S 14:39 0:00 /usr/local/nagios/bin/nagios -ud /usr/local/nagios/etc/nagios.cfg
root 31503 0.0 0.0 103244 828 pts/0 S+ 14:41 0:00 grep bin/nagios
-bash-4.1#

Re: Duplicate alert emails.

Posted: Fri Dec 06, 2013 3:02 pm
by sreinhardt
My mistake there. I forgot that ps aux shows only each process's own PID, not its parent PID, and that threw me off. The ps -ef output cleared that up, since it shows the second -ud process as a child of the first. We are going to look over those templates/configs for a bit here and discuss.
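
To illustrate the point about parent PIDs: in ps -ef output the third column is the parent PID, so separately started daemons can be told apart from children of a single daemon. A rough sketch of that check, assuming a daemonized nagios has init (PID 1) as its parent; count_top_level_daemons is a made-up helper name, not part of Nagios:

```shell
# count_top_level_daemons
# Reads `ps -ef` style lines on stdin and prints how many main nagios
# processes (not --worker processes, not the grep itself) have PID 1 as
# their parent. Column 3 of `ps -ef` is the parent PID (PPID).
count_top_level_daemons() {
    awk '$3 == 1 && /bin\/nagios/ && !/--worker/ && !/grep/ { n++ }
         END { print n + 0 }'
}

# Intended use on the monitoring server (not executed here):
#   ps -ef | count_top_level_daemons
```

Under that assumption, a result greater than 1 would suggest that more than one nagios daemon was started independently.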

Re: Duplicate alert emails.

Posted: Fri Dec 06, 2013 6:40 pm
by vinothsethuram
I don't understand. Could you explain that briefly?

Re: Duplicate alert emails.

Posted: Mon Dec 09, 2013 1:54 pm
by slansing
Can you also share your 'admins' contact definition?

Re: Duplicate alert emails.

Posted: Mon Dec 09, 2013 2:29 pm
by vinothsethuram
Now it's working fine and sends one alert every hour. The duplicates were due to multiple nagios instances. I killed all the instances and restarted nagios so that only one instance is running.
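
For anyone hitting the same problem, the stop / kill / start sequence from earlier in the thread can be made a little safer by waiting for every nagios process to exit before starting again. A minimal sketch, assuming pgrep is available and the process name is nagios; wait_for_exit is a hypothetical helper, not something shipped with Nagios:

```shell
# wait_for_exit NAME [TIMEOUT]
# Polls once per second until no process named exactly NAME is left, or
# until TIMEOUT seconds (default 10) have passed. Returns 0 if all such
# processes are gone, 1 if some are still running at the deadline.
wait_for_exit() {
    name=$1
    timeout=${2:-10}
    elapsed=0
    while pgrep -x "$name" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Intended use on the monitoring server, as root (not executed here):
#   service nagios stop
#   wait_for_exit nagios 10 || killall -9 nagios
#   service nagios start
```

The forced kill only runs if something named nagios is still alive after the timeout, so a normal shutdown is left alone.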