Can't get my ProCurve monitoring to work.

jbruyet
Posts: 235
Joined: Wed Dec 28, 2011 12:14 pm

Post by jbruyet »

Hey all, I just downloaded two .cfg files so I can see a little more deeply into my ProCurve switches. HOWEVER, I've hit a snag and I'm not sure what the problem is. Here's the Command Definition:

define command{ 
  command_name check_hpmemoryfree 
  command_line $USER1$/check_snmp -H $HOSTADDRESS$ -C $ARG1$ -o 1.3.6.1.4.1.11.2.14.11.5.1.1.2.1.1.1.6.1 -t 5 -w $ARG2$ -c $ARG3$ -u bytes -l free 
 } 
And here's the service definition:

# Service definition MEM-FREE
define service{
        use                             generic-service         ; Name of service template to use
        hostgroup                       switches
        service_description             MEM-FREE
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              3
        normal_check_interval           5
        retry_check_interval            1
        notification_interval           240
        notification_period             24x7
        notification_options            c,r
        check_command                   check_hpmemoryfree!nagios!2000:30000000!1000:30000000
        }
And here's what I see on the Nagios web page:

MEM-FREE 	UNKNOWN 	08-14-2012 14:54:32 	0d 1h 48m 51s 	3/3 	External command error: Timeout: No Response from 192.168.2.20:161. 
BUT, if I run the command from the command line, here's what I see:

[root@link libexec]# ./check_snmp -H 192.168.2.20 -o 1.3.6.1.4.1.11.2.14.11.5.1.1.2.1.1.1.6.1
SNMP OK - 24307576 | iso.3.6.1.4.1.11.2.14.11.5.1.1.2.1.1.1.6.1=24307576
Any ideas on why it works from the command line and not from within Nagios? I copied the config lines from the downloaded file to my file.

Thanks,

Joe B

Re: Can't get my ProCurve monitoring to work.

Post by jbruyet »

Quick question: if there's a bad argument in the command definition, would that give me the "No Response" error? If not, I can't see anything in either .cfg file that would cause the problem.

Thanks,

Joe B
agriffin
Posts: 876
Joined: Mon May 09, 2011 9:36 am

Re: Can't get my ProCurve monitoring to work.

Post by agriffin »

The error message you're seeing is from the plugin, not from Nagios itself, so the bad argument is only the cause of the error if it also happens at the command line. Try testing the plugin as the nagios user and see how long it takes to execute:

# su nagios -s /bin/bash -c "time ./check_snmp -H 192.168.2.20 -o 1.3.6.1.4.1.11.2.14.11.5.1.1.2.1.1.1.6.1"
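
For anyone who wants to script the same test, here is a minimal Python sketch that runs a plugin, times it, and maps the exit code to the standard Nagios plugin states (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The helper name `run_plugin` and the stand-in command are illustrative assumptions, not part of Nagios; to reproduce the real check you would pass the `check_snmp` argv and run the script as the nagios user.

```python
import subprocess
import sys
import time

# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv, timeout=10):
    """Run a plugin and return (state, first line of output, elapsed seconds).

    `run_plugin` is a hypothetical helper for this sketch, not a Nagios API.
    """
    start = time.monotonic()
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    elapsed = time.monotonic() - start
    lines = proc.stdout.strip().splitlines()
    first_line = lines[0] if lines else ""
    # Any exit code outside 0..3 is treated as UNKNOWN here.
    return STATES.get(proc.returncode, "UNKNOWN"), first_line, elapsed


# Stand-in command so the sketch runs anywhere; swap in the real plugin argv,
# e.g. ["./check_snmp", "-H", "192.168.2.20", "-o", "<OID>"].
demo = [sys.executable, "-c", "print('SNMP OK - demo')"]
state, output, elapsed = run_plugin(demo)
print(f"{state}: {output} ({elapsed:.2f}s)")
```

Comparing the elapsed time against the `-t 5` timeout in the command definition shows whether the plugin is anywhere near timing out.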

Re: Can't get my ProCurve monitoring to work.

Post by jbruyet »

Hi agriffin, running that command took a little less than one second.

Thanks,

Joe B

Re: Can't get my ProCurve monitoring to work.

Post by agriffin »

And did it give you the output you expected above the timing information (it should have been the same as when you tested it as the root user)?

Re: Can't get my ProCurve monitoring to work.

Post by jbruyet »

Here's something: I'm experimenting with the arguments, and after making the last change I went to see when the next check was due and noticed that the OID isn't getting passed along. Check out the status information:

Service State Information
Current Status:	UNKNOWN (for 3d 0h 5m 41s)
Status Information:	No OIDs specified
Performance Data:	
Current Attempt:	3/3  (HARD state)
Last Check Time:	08-17-2012 13:10:36
Check Type:	ACTIVE
Check Latency / Duration:	0.369 / 0.074 seconds
Next Scheduled Check:  	08-17-2012 13:15:36
Last State Change:	08-14-2012 13:07:32
Last Notification:	N/A (notification 0)
Is This Service Flapping?	NO (0.00% state change)
In Scheduled Downtime?	NO
Last Update:	08-17-2012 13:13:05  ( 0d 0h 0m 8s ago)
I'm not fluent enough in Nagios to see what the problem is. Can you or anyone else see why the OID wouldn't get passed along?

Oh yeah, running the command straight up from the command line gave me the correct information.

Thanks,

Joe B

Re: Can't get my ProCurve monitoring to work.

Post by jbruyet »

I've tried moving $ARG1$ around in the command but no joy. I've even tried running it with no arguments, but I still get the "No OIDs specified" error. I'm running out of things to experiment with.

Thanks,

Joe B

Re: Can't get my ProCurve monitoring to work.

Post by agriffin »

Can you post an example of a host definition you're using for one of these procurve switches?

Re: Can't get my ProCurve monitoring to work.

Post by jbruyet »

Hi agriffin, here's one of my Host Definitions:

define host{
        use             generic-switch
        host_name       OS-2524KVMBch
        alias           ProCurve 2524 OS KVM Bench
        address         192.168.2.18
        hostgroups      switches
        }
I have a couple of basic checks using my Hostgroup definition, a check_ping and a check_snmp for uptime, and they work just fine. I can run the mem_free script from a command line and I get the free memory on the switch; I just can't seem to figure out why my OIDs are staying invisible from my switch.cfg file.

Thanks,

Joe B

Re: Can't get my ProCurve monitoring to work.

Post by jbruyet »

I can get these checks to work from the command line so I'm thinking it's something with the syntax or the arguments in the command definition, or maybe in the service definition. I've been looking online trying to find some examples that I can compare my definitions to but I'm only finding very generic stuff. For example, one question I have is why is there a "nagios" parameter in my service definition? Does anyone know of any tutorials I can look at to see if my problem might be syntax- or argument-related?
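
On the "nagios" question: Nagios splits `check_command` on `!`, so the first piece is the command name and the rest fill `$ARG1$`, `$ARG2$`, `$ARG3$` in order. In the definitions posted above, `nagios` therefore lands in `-C $ARG1$`, which is the SNMP community string for check_snmp. Below is a minimal Python sketch of that substitution, not Nagios's actual implementation; the function name and the `$USER1$` path are assumptions for illustration, and escaped `\!` handling is deliberately left out.

```python
def expand_check_command(check_command, command_line, user1, host_address):
    """Split check_command on '!' and substitute the macros into command_line.

    Simplified sketch: real Nagios also handles escaped '!' and many more macros.
    """
    name, *args = check_command.split("!")
    expanded = command_line.replace("$USER1$", user1)
    expanded = expanded.replace("$HOSTADDRESS$", host_address)
    for i, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{i}$", value)
    return name, expanded


# Copied from the command and service definitions in this thread.
COMMAND_LINE = (
    "$USER1$/check_snmp -H $HOSTADDRESS$ -C $ARG1$ "
    "-o 1.3.6.1.4.1.11.2.14.11.5.1.1.2.1.1.1.6.1 -t 5 "
    "-w $ARG2$ -c $ARG3$ -u bytes -l free"
)
CHECK_COMMAND = "check_hpmemoryfree!nagios!2000:30000000!1000:30000000"

# ASSUMPTION: $USER1$ points at a typical plugin directory; check resource.cfg.
name, line = expand_check_command(
    CHECK_COMMAND, COMMAND_LINE, "/usr/local/nagios/libexec", "192.168.2.20"
)
print(name)
print(line)
```

The expanded line is what Nagios actually runs, and it differs from the manual test earlier in the thread, which passed no `-C`, `-w`, `-c`, `-u`, or `-l` options. Pasting the full expanded line into a shell is a closer reproduction; if the switch does not accept `nagios` as a community string, that alone could plausibly explain the "No Response" timeout.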

Thanks,

Joe B