
NRPE worked fine and then...

Posted: Fri Apr 27, 2012 2:04 pm
by jbruyet
Hey all, I'm working on getting NRPE to run under Nagios. I set up my own workstation as a guinea pig and after creating my NRPE config file I was able to get a CPU Load service working. I went for a second service and decided to change the host_name to hostgroup in my Service Definitions like I did with my check_nt but no joy; neither service worked. I changed my hostgroup back to host_name and still no joy. I changed back to just the one service, the CPU Load service, and STILL no joy. Now I'm getting this error in my Nagios for the service that worked previously:

UNKNOWN: No handler for that command

As far as I can tell my .cfg file is just like it was when it worked originally. Here's the file. Can anyone see some glaring omissions or anything else that I may have restored incorrectly? The only two "Definitions" I messed with were the Service Definitions:

# >>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<
# Config file created 4/26/12
# >>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<

define host{
name xp_nrpe
use generic-host
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 10
check_command check-host-alive
notification_period 24x7
notification_interval 30
notification_options d,r
contact_groups admins
register 1
}


# >>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<
# Host Definitions
# >>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<

define host{
use xp_nrpe
host_name jobee1
alias Jobeez Workstation
address 192.168.2.22
}

# >>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<
# Host Group Definitions
# >>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<

define hostgroup {
hostgroup_name xp_nrpegroup
alias XP workstations with NRPE
members jobee1
}


# >>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<
# Service Definitions
# >>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<


define service {
use generic-service
host_name jobee1
service_description CPU Load
check_command check_nrpe!check_load
}
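
For reference, a service is attached to a host group with the hostgroup_name directive rather than host_name; "hostgroup" on its own is not a service directive. The following is only a minimal sketch, assuming the xp_nrpegroup group defined above and reusing the check_command from the service as posted. It does not confirm that check_load exists on the monitored machine.

```
# Sketch: apply the CPU Load service to every member of a host group.
# Assumes the xp_nrpegroup hostgroup from this file; the check_command is
# copied from the posted service and may not be defined on the remote side.
define service {
    use                  generic-service
    hostgroup_name       xp_nrpegroup
    service_description  CPU Load
    check_command        check_nrpe!check_load
}
```

A service definition needs at least one of host_name or hostgroup_name, so swapping one for the other is enough to move between a single host and a group.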

If I run this from the command line:

check_nrpe -H IP -p 5666 -c CheckCPU -a warn=100 crit=100 time=1 warn=95 crit=99 time=5 warn=90 crit=95 time=15

I get an "OK CPU Load ok" message. What am I doing wrong?

Thanks,

Joe B

Re: NRPE worked fine and then...

Posted: Sun Apr 29, 2012 4:29 pm
by jbruyet
Well then can anyone see any obscure typos? If this worked at Configuration A, stopped working at Configuration B and still didn't work after being returned to Configuration A it must be something on my end. BUT, I only changed a couple of things and that was in just one file. ANY help here would be greatly appreciated.

Thanks,

Joe B

Re: NRPE worked fine and then...

Posted: Sun Apr 29, 2012 6:26 pm
by jsmurphy
check_command check_nrpe!check_load

check_nrpe -H IP -p 5666 -c CheckCPU
In your service definition you are passing check_load, but on the command line you are running CheckCPU? Could this be your issue?

Re: NRPE worked fine and then...

Posted: Tue May 01, 2012 1:20 pm
by jbruyet
Hey jsmurphy, that was a good catch. I guess I deleted more than I thought I did. Anyway, I've changed the command syntax per the bottom of this web page: http://www.nsclient.org/nscp/wiki/CheckCPU.

So now the command is:

check_command check_nrpe!checkCPU -a warn=100 crit=100 time=1 warn=95 crit=99 time=5 warn=90 crit=95 time=15

and now I'm getting a new error:

ERROR: Missing argument exception.

I can do the following from the command line:

check_nrpe -H IP -p 5666 -c CheckCPU -a warn=100 crit=100 time=1 warn=95 crit=99 time=5 warn=90 crit=95 time=15

and I get:

OK CPU Load ok.|'1'=3%;100;100 '5'=2%;95;99 '15'=2%;90;95

so I know things are connecting. Apparently I don't have the correct syntax down yet. I was just checking the command syntax on the page and I now see that I'm not doing it exactly like the page has it for the Nagios Configuration. If I use command_line per the instructions I get a "Could not add object property in file" error. Do I need to add another section in my .cfg file to define the command? If so how do I call the command? Do I get rid of my Service Definitions section? I am really flying blind here.
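
On the command_line question: command_line is a directive of a "define command" block, not of "define service", which would explain a "Could not add object property" error when it is placed in a service. A hedged sketch of the two-part layout follows. The name check_nrpe_cpu is hypothetical, $USER1$ is assumed to point at the plugin directory (as in a stock resource.cfg), and the port and arguments are the ones from the command line that worked.

```
# Sketch: a dedicated command wrapping the working check_nrpe invocation.
# check_nrpe_cpu is a made-up name; $USER1$ is assumed to be the plugin path.
define command {
    command_name  check_nrpe_cpu
    command_line  $USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -c CheckCPU -a warn=100 crit=100 time=1 warn=95 crit=99 time=5 warn=90 crit=95 time=15
}

# The service then refers to the command by name only.
define service {
    use                  generic-service
    host_name            jobee1
    service_description  CPU Load
    check_command        check_nrpe_cpu
}
```

The Service Definitions section stays; the command block is an addition to it, and the service calls the command through check_command.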

Thanks,

Joe B

Re: NRPE worked fine and then...

Posted: Tue May 01, 2012 1:27 pm
by jbruyet
And now I see that it's working:

Host    Service    Status  Last Check           Duration       Attempt  Status Information
jobee1  CPU Check  OK      05-01-2012 11:13:15  0d 0h 17m 19s  1/3      OK CPU Load ok.

How long after a change is made should I wait? When I press F5 to refresh the page I can see the time incrementing. I guess I'll throw in another test from that page and see what happens. BUT, when I add it in I'll head to our other facility to get some things done there then I'll check it when I get back. Sigh.
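
On the waiting time: Nagios reads object configuration only when it starts or reloads, and the status page shows the result of the most recent check that has actually run, so a changed service keeps its old result until the next scheduled check. How often that happens is set by check_interval, counted in units of interval_length (60 seconds by default). The sketch below shortens the interval on one service while testing. The values are assumptions, not settings taken from this thread.

```
# Sketch: poll the service every minute while testing.
# Interval values are assumed; generic-service normally supplies its own.
define service {
    use                  generic-service
    host_name            jobee1
    service_description  CPU Load
    check_command        check_nrpe!checkCPU -a warn=100 crit=100 time=1 warn=95 crit=99 time=5 warn=90 crit=95 time=15
    check_interval       1    ; time units between regular checks
    retry_interval       1    ; time units between retries after a non-OK result
}
```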

Thanks,

Joe B

Re: NRPE worked fine and then...

Posted: Tue May 01, 2012 3:22 pm
by jbruyet
OK, my hard disk check is working so I think I have things figured out: 1) Nagios takes a stinkingly long time to update its web pages and 2) the syntax on my system doesn't quite work the way it's shown on the NSClient++ web page. I have another question now and even though it's related I think I'll start a new thread for it. Thanks a lot jsmurphy! Your sharp eye gave me the answer I needed to move past that hurdle and get this thing working.

Thanks,

Joe B

Re: NRPE worked fine and then...

Posted: Tue May 01, 2012 6:59 pm
by jsmurphy
Glad you sorted it out :) .