*SOLVED* check_procs says user doesn't exist. Works manually

puppynut5
Posts: 10
Joined: Tue Jun 11, 2013 3:15 pm

*SOLVED* check_procs says user doesn't exist. Works manually

Post by puppynut5 »

Hello,

I have an issue where Nagios is returning user does not exist on a command such as this...

define service{
    use                 local-service
    host_name           mastervm1
    service_description check_user_procs
    check_command       check_user_procs!1:!resque-1.24.1!sas
}

With command configured

# Check Procs User
define command{
    command_name check_user_procs
    command_line $USER1$/check_procs -w $ARG1$ -a $ARG2$ -u $ARG3$
}


When the command is run through Nagios, it errors out, reporting that the user sas was not found.

If I run the command locally using the switches (for localhost rather than remote) it finds the user.

I'm wondering if this is because of nrpe.cfg having dont_blame_nrpe=0

I'm just confused because if it didn't allow arguments to pass, how did it know I was looking for username rather than the default status?

Sorry for the waste of time, I just want to confirm this before I have them change it in chef.
Last edited by puppynut5 on Fri Jun 14, 2013 10:12 am, edited 1 time in total.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: check_procs says user doesn't exist. Works manually

Post by slansing »

What is the exact error it outputs? Can you make the dont_blame change to one of your hosts and check against that one to see if the error occurs?
puppynut5
Posts: 10
Joined: Tue Jun 11, 2013 3:15 pm

Re: check_procs says user doesn't exist. Works manually

Post by puppynut5 »

I did make the change on one of the hosts. I found that our Chef setup doesn't automatically push changes back, so I have to run chef-client manually to override it. I set dont_blame_nrpe=1 and forced a recheck of the command. The error I was getting before setting it was:

[1370980932] SERVICE ALERT: mastervm1;check_user_procs;UNKNOWN;SOFT;1;check_procs: User name was not found - sas
[1370980992] SERVICE ALERT: mastervm1;check_user_procs;UNKNOWN;SOFT;2;check_procs: User name was not found - sas
[1370981052] SERVICE ALERT: mastervm1;check_user_procs;UNKNOWN;SOFT;3;check_procs: User name was not found - sas
[1370981112] SERVICE ALERT: mastervm1;check_user_procs;UNKNOWN;HARD;4;check_procs: User name was not found - sas


And now I'm still getting the same thing... I've completely restarted the Nagios server and forced the check.

I only set up this command in command.cfg.

I shouldn't need to set it up anywhere else since NRPE can pass arguments through now, right?
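
For what it's worth, a quick way to confirm that the setting actually landed on the host after the chef-client run is to read it back from the config file. This is only a minimal sketch: nrpe_args_enabled is a made-up helper name, not an NRPE tool, and it simply trusts the last uncommented dont_blame_nrpe line it finds.

```shell
# Hypothetical helper (not shipped with NRPE): succeeds when the given
# nrpe.cfg sets dont_blame_nrpe=1. Assumption: the last matching line wins.
nrpe_args_enabled() {
    [ -r "$1" ] || return 2    # config file missing or unreadable
    value=$(sed -n 's/^[[:space:]]*dont_blame_nrpe[[:space:]]*=[[:space:]]*\([01]\).*/\1/p' "$1" | tail -n 1)
    [ "$value" = "1" ]
}

# Example call, assuming the config lives at /etc/nagios/nrpe.cfg:
#   nrpe_args_enabled /etc/nagios/nrpe.cfg && echo "arguments allowed"
```

A 1 here is necessary but may not be sufficient: the daemon has to be restarted or reloaded to pick up the change, and NRPE also has to have been built with command argument support.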
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: check_procs says user doesn't exist. Works manually

Post by slansing »

Have you tried running the command with a user other than "sas"? One that is guaranteed to be on every installation by default? I would recommend running the command manually first, rather than making all the changes and then testing through the interface. This would mean running something like the following from the command line:

Code:

/usr/local/nagios/libexec/check_nrpe -H <hostaddress> -c check_procs -a '-w 1: -a resque-1.24.1 -u nagios'
You are running the command locally as you have it defined now, not against the remote host. You have the service defined correctly for that host, except you are using a command definition that is a local check, not a check_nrpe one.
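
To see why the original definition ends up running on the Nagios server itself, it can help to expand the macros by hand. The sketch below only imitates the $ARGn$ and $USER1$ substitution for illustration: expand_check_command is a hypothetical helper, not part of Nagios, and it assumes $USER1$ points at /usr/local/nagios/libexec.

```shell
# Hypothetical helper: imitates how $ARGn$ macros are filled from a
# check_command value of the form name!arg1!arg2!...
# Assumption: $USER1$ is /usr/local/nagios/libexec.
# Limitation: arguments containing |, & or \ would confuse sed here.
expand_check_command() {
    case $1 in
        *!*) args=${1#*!} ;;   # drop the command name, keep the arguments
        *)   args= ;;
    esac
    line=$2
    n=1
    while [ -n "$args" ]; do
        arg=${args%%!*}        # next argument, up to the following "!"
        case $args in
            *!*) args=${args#*!} ;;
            *)   args= ;;
        esac
        line=$(printf '%s\n' "$line" | sed "s|\\\$ARG${n}\\\$|${arg}|g")
        n=$((n + 1))
    done
    printf '%s\n' "$line" | sed 's|\$USER1\$|/usr/local/nagios/libexec|g'
}

expand_check_command 'check_user_procs!1:!resque-1.24.1!sas' \
    '$USER1$/check_procs -w $ARG1$ -a $ARG2$ -u $ARG3$'
```

The expanded line contains neither check_nrpe nor a host, so check_procs looks for the user on whatever machine Nagios runs on.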
puppynut5
Posts: 10
Joined: Tue Jun 11, 2013 3:15 pm

Re: check_procs says user doesn't exist. Works manually

Post by puppynut5 »

It refuses to work when run through NRPE. It still gives the same error.

I copied the script check_procs to /tmp in mastervm1 & ran it there with the same args.

-bash-4.1# ./check_procs -H localhost -w 1: -a resque-1.24.1 -u sas
./check_procs: invalid option -- 'H'
Usage:
check_procs -w <range> -c <range> [-m metric] [-s state] [-p ppid]
[-u user] [-r rss] [-z vsz] [-P %cpu] [-a argument-array]
[-C command] [-t timeout] [-v]
-bash-4.1# ./check_procs -w 1: -a resque-1.24.1 -u sas
PROCS OK: 1 process with args 'resque-1.24.1', UID = 1005 (sas) | processes=1;1:;;0;

Here's what's kind of confusing me... In /etc/nagios/nrpe.cfg I have

log_facility=daemon (how do I change it to log to a file like /var/log/nrpe.cfg instead of messages?)
debug=1

However, I'm not getting any info when a check is run whether it's successful or not. Here's all it has...

-bash-4.1# tail -100 /var/log/messages
Jun 9 03:12:01 mastervm1 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1045" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Jun 11 13:34:41 mastervm1 nrpe[1126]: Caught SIGTERM - shutting down...
Jun 11 13:34:41 mastervm1 nrpe[1126]: Daemon shutdown
Jun 11 13:34:42 mastervm1 nrpe[13501]: Starting up daemon
Jun 11 13:34:42 mastervm1 nrpe[13501]: Warning: Daemon is configured to accept command arguments from clients!
Jun 11 13:34:42 mastervm1 nrpe[13501]: Listening for connections on port 5666
Jun 11 13:34:42 mastervm1 nrpe[13501]: Allowing connections from: 127.0.0.1 10.x.x.x
Jun 12 10:52:28 mastervm1 nrpe[13501]: Caught SIGTERM - shutting down...
Jun 12 10:52:28 mastervm1 nrpe[13501]: Daemon shutdown
Jun 12 10:52:29 mastervm1 nrpe[32468]: INFO: SSL/TLS initialized. All network traffic will be encrypted.
Jun 12 10:52:29 mastervm1 nrpe[32469]: Starting up daemon
Jun 12 10:52:29 mastervm1 nrpe[32469]: Warning: Daemon is configured to accept command arguments from clients!
Jun 12 10:52:29 mastervm1 nrpe[32469]: Listening for connections on port 5666
Jun 12 10:52:29 mastervm1 nrpe[32469]: Allowing connections from: 127.0.0.1 10.x.x.x

On the Nagios server there is more. Here's what I grabbed from /var/log/messages on the actual server (unrelated lines removed):

Jun 12 10:58:33 mon-01 nagios: SERVICE ALERT: mastervm1;check_user_procs;UNKNOWN;SOFT;1;check_procs: User name was not found - sas
Jun 12 10:59:02 mon-01 snmpd[26595]: Connection from UDP: [127.0.0.1]:49364->[127.0.0.1]
Jun 12 10:59:33 mon-01 nagios: SERVICE ALERT: mastervm1;check_user_procs;UNKNOWN;SOFT;2;check_procs: User name was not found - sas
Jun 12 11:00:01 mon-01 snmpd[26595]: Connection from UDP: [127.0.0.1]:38739->[127.0.0.1]
Jun 12 11:00:33 mon-01 nagios: SERVICE ALERT: mastervm1;check_user_procs;UNKNOWN;SOFT;3;check_procs: User name was not found - sas


So I'm kind of at a loss. I'm leaving this company on Friday and they would like me to have this finished before I leave. If I absolutely have to, I'll leave it with a check for the procs without the user and just check for 1 proc per user (which is all that should be running), so they'll still get a notification if one drops out. But if a couple of resque workers are ever spun up for each user, it could be an issue.
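
Since the plugin finds sas when run on mastervm1 but the alert logged on mon-01 says the user was not found, one thing worth comparing is whether the account exists on each machine. A minimal sketch using id; user_exists is just an illustrative name, and the two account names are the ones mentioned in this thread:

```shell
# Illustrative helper: succeeds if the named account can be resolved on
# the machine where this runs.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Run this on both the Nagios server and the monitored host and compare.
for u in sas nagios; do
    if user_exists "$u"; then
        echo "$u: present"
    else
        echo "$u: missing"
    fi
done
```

If sas shows up as missing on the Nagios server but present on the remote host, that would point at the check being executed on the server rather than through NRPE.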
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: check_procs says user doesn't exist. Works manually

Post by abrist »

puppynut5 wrote:
# Check Procs User
define command{
    command_name check_user_procs
    command_line $USER1$/check_procs -w $ARG1$ -a $ARG2$ -u $ARG3$
}
To check this through NRPE, you have to add check_nrpe to the command. The check works fine locally, so edit your check to:

Code:

# Check Procs User
define command{
    command_name check_user_procs
    command_line $USER1$/check_nrpe -c check_procs -w $ARG1$ -a $ARG2$ -u $ARG3$
}
What does the remote nrpe.cfg command definition for "check_procs" look like?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
puppynut5
Posts: 10
Joined: Tue Jun 11, 2013 3:15 pm

Re: check_procs says user doesn't exist. Works manually

Post by puppynut5 »

I'll post that all soon... But is there any other way to use Nagios to check the processes without it having to be run locally? I looked through some other plugins but didn't really see anything that mentioned it.

I talked to the engineers & since there is only supposed to be one resque worker per username, it will be fine if I can just check to make sure a certain # of them are running with the argument resque-1.24....

However, I've tried it even with just that and it says "Can't find any processes with argument resque..."

Thanks!
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: check_procs says user doesn't exist. Works manually

Post by abrist »

puppynut5 wrote: But is there any other way to use Nagios to check the processes without it having to be run locally?
Not really. Local checks or SNMP. I guess there might be a weird gkrellm2 solution, but that would be a hack.
puppynut5 wrote: However, I've tried it even with just that and it says "Can't find any processes with argument resque..."
What is the full error?
puppynut5
Posts: 10
Joined: Tue Jun 11, 2013 3:15 pm

Re: check_procs says user doesn't exist. Works manually

Post by puppynut5 »

OK, so I've set up a new host where I compiled everything manually like I usually do, so I'm quite sure it's an issue with how I set up the check.

Here is the complete nrpe.cfg file from my remote host, as well as my complete host config file. If you could let me know if something's out of order, that would be great. So that I'm not spamming the forums, I've put them on Google Docs. The check I created in nrpe.cfg is the one called check_user_procs.

Thanks! I hope you find some stupid mistake.


Remote NRPE (Host nagiostest)
http://goo.gl/nbSn5


Host Config File
http://goo.gl/YZIg5
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: check_procs says user doesn't exist. Works manually

Post by abrist »

Your nrpe.cfg file is not set up correctly for the command:

Code:

# Check Procs User
define command{
    command_name check_user_procs
    command_line $USER1$/check_nrpe -c check_procs -w $ARG1$ -a $ARG2$ -u $ARG3$
}
You will need to make a couple of edits.
First, the command name passed to -c in the definition above is wrong; it should be "check_user_procs" so that it matches the nrpe.cfg command name. Additionally, all of the args and their switches should probably be wrapped in a single check_nrpe argument switch (-a). Also, you need to define a hostname with -H. See the following changes:

Code:

# Check Procs User
define command{
    command_name check_user_procs
    command_line $USER1$/check_nrpe -H $HOSTNAME$ -c check_user_procs -a '-w $ARG1$ -a $ARG2$ -u $ARG3$'
}

Code:

command[check_user_procs]=/usr/local/nagios/libexec/check_procs $ARG1$
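
To try the corrected definition by hand before reloading Nagios, the same call can be assembled on the monitoring server. This sketch only prints the command line rather than running it; nrpe_user_procs_cmd is a hypothetical helper, and the check_nrpe path is assumed to sit in the same libexec directory used above.

```shell
# Hypothetical helper: prints (does not execute) a check_nrpe call that
# mirrors the corrected command definition.
# Arguments: host, warning range, process argument pattern, user name.
nrpe_user_procs_cmd() {
    host=$1 warn=$2 pattern=$3 user=$4
    printf "%s -H %s -c check_user_procs -a '-w %s -a %s -u %s'\n" \
        /usr/local/nagios/libexec/check_nrpe "$host" "$warn" "$pattern" "$user"
}

# Values taken from the service definition in this thread.
nrpe_user_procs_cmd mastervm1 1: resque-1.24.1 sas
```

As far as I understand it, the whole quoted string travels as a single NRPE argument, is substituted for $ARG1$ on the remote side, and is then split back into separate check_procs options when the command is run through the shell there.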