Postby mutts » Wed Sep 21, 2011 4:31 pm

I am running Nagios 3.3.1 on an openSUSE 11.1 system.

I have defined a hostgroup, switches, for monitoring all my network switches. Each switch has its own configuration file.

I have configured one switch as below:

cat outer-sw4.cfg
define host{
        use                             generic-switch
        host_name                       outer-sw4
        hostgroups                      switches
        alias                           Foundry WS624G 24-port ethernet switch
        contact_groups                  admins,adr-cell
        parents                         cms
}

# Service definition to monitor switch uptime using check_snmp
define service{
        use                             remote-service
        hostgroup_name                  switches
        service_description             Uptime
        check_command                   check_snmp! -C community -o sysUpTime.0
}

The thing I don't understand is that Nagios runs the "check_snmp! -C community -o sysUpTime.0" check against all the devices in the switches hostgroup, not just the one whose file the service is defined in. But it's only the check_snmp one that gets globally applied. If I add a service definition to ping an interface, that stays on that host only.

The check_snmp command is defined like so:

define command{
        command_name    check_snmp
        command_line    $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
}

And the generic-switch template:

# Define a generic switch template

define host{
        name                    generic-switch  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, switches are monitored round the clock
        check_interval          5               ; Switches are checked every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each switch 10 times (max)
        check_command           check-host-alive        ; Default command to check if routers are "alive"
        notification_period     24x7            ; Send notifications at any time
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
}

Is this expected behavior for check_snmp? Or have I got something crazy in my configuration somewhere?

Re: check_snmp checking entire hostgroup

Postby mrb » Thu Sep 22, 2011 2:11 pm

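In the service definition you are telling Nagios to run the command against the hostgroup: the hostgroup_name line attaches the service to every member of switches, no matter which file the definition lives in.

Replace the hostgroup_name line in the service definition with a host_name line naming the single host you want to check. The check should then run against that one host only instead of the whole hostgroup.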
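
As a rough sketch of that change, assuming the Uptime check is meant only for outer-sw4 (the host defined in the outer-sw4.cfg posted above), the service definition might look like this. Everything except the host_name line is copied from the original post:

```
# Service definition to monitor switch uptime using check_snmp
# (assumption: only outer-sw4 should get this check)
define service{
        use                             remote-service
        host_name                       outer-sw4       ; was: hostgroup_name switches
        service_description             Uptime
        check_command                   check_snmp! -C community -o sysUpTime.0
}
```

If the check should cover every switch after all, keeping hostgroup_name is the usual way to do that, but then the definition is better kept in one shared file rather than in a single switch's file.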

Postby mutts » Thu Sep 22, 2011 2:39 pm

You know how many times I looked at that and didn't see that?

That solved it.

