Different results for NRPE vs running locally

CLee1972
Posts: 20
Joined: Wed Mar 07, 2018 1:53 pm

Different results for NRPE vs running locally

Post by CLee1972 »

Hello,

I built a basic script to run jstack and output the results. This script runs fine locally, but when run through NRPE it fails horribly, even on the local machine. My guess is that the "wc" the script uses is not working correctly under NRPE and that this is why it is failing. I have included a sample of the script below:

Code: Select all

#!/bin/bash
for pid in $( ps -ef | grep java|grep mediaserver | awk '{print $2}')
do
    a=`jstack -l $pid | grep "RTPReceiver" | wc -l`
    if [ $a -ge 4 ];then
        echo "OK - $a RTPReceiverHandles are running!!"
        exit 0
    else
        echo "CRITICAL - $a RTPReceiverHandles are running!!  Media Server Component will need to be restarted!!!"
        exit 2
    fi
done
The result when running the script locally is:

"OK - 4 RTPReceiverHandles are running!!"

but when I run it through NRPE, even locally using ./check_nrpe -H localhost -c script, it comes back with:

"CRITICAL - 0 RTPReceiverHandles are running!! Media Server Component will need to be restarted!!!"

I am hoping someone has heard of this and has an easy fix. Thanks in advance for anyone reviewing this.
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Different results for NRPE vs running locally

Post by npolovenko »

Hello, @CLee1972. This is likely a permission issue. When NRPE executes the script it runs as the "nagios" user, whereas when you execute it manually it runs as root. Try switching to that user on the remote server:

Code: Select all

su - nagios
And then run the command manually.

Chances are that either the java or the mediaserver process will not show up in ps -ef when the command is executed by a non-root user.
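One way to reproduce what NRPE sees is to run the plugin as the nagios user and print the exit code next to the output, since Nagios derives the state from the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). A minimal sketch follows; the helper name run_and_report is hypothetical, and the plugin path in the example is the one mentioned later in this thread:

```shell
#!/bin/bash
# Hypothetical helper: run a command, then show its combined
# stdout/stderr and its exit code.
run_and_report() {
    local out rc
    out=$("$@" 2>&1)
    rc=$?
    printf '%s\n' "$out"
    printf 'exit=%s\n' "$rc"
    return "$rc"
}

# Example (assumes sudo is allowed to switch to the nagios user):
#   run_and_report sudo -u nagios /usr/local/nagios/libexec/check_jstack
```

Comparing this output with the same call made as root should show whether the difference is in the output, the exit code, or both.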
CLee1972
Posts: 20
Joined: Wed Mar 07, 2018 1:53 pm

Re: Different results for NRPE vs running locally

Post by CLee1972 »

Thanks for the super quick response. I did as you suggested and ran su nagios. When running ps -ef|grep java, I get back the results I expect, but when I run my jstack script locally now, I get the following result:
1144: well-known file /tmp/.java_pid1144 is not secure: file should be owned by the current user (which is 1001) but is owned by 0
CRITICAL - 0 RTPReceiverHandles are running!! Media Server Component will need to be restarted!!!
My jstack script is running as nagios.nagios.
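The "CRITICAL - 0" line above looks like a side effect of the pipeline: the jstack error goes to stderr, nothing reaches grep, and wc -l reports 0. Below is a rough sketch of one way to separate "the dump failed" from "no handlers found", reporting UNKNOWN (exit code 3) in the first case. It assumes jstack exits non-zero when it cannot attach, which this thread does not confirm. The function name count_receivers is made up for illustration, and the threshold of 4 is taken from the original script:

```shell
#!/bin/bash
# Hypothetical helper: "$@" is the command that prints the thread dump,
# e.g. count_receivers /path/to/jstack -l "$pid"
count_receivers() {
    local dump n
    # Assumption: the dump command returns non-zero when it fails.
    if ! dump=$("$@" 2>&1); then
        # Surface the first line of the error instead of counting 0.
        echo "UNKNOWN - thread dump failed: $(printf '%s\n' "$dump" | head -n 1)"
        return 3
    fi
    n=$(printf '%s\n' "$dump" | grep -c "RTPReceiver")
    if [ "$n" -ge 4 ]; then
        echo "OK - $n RTPReceiverHandles are running!!"
        return 0
    fi
    echo "CRITICAL - $n RTPReceiverHandles are running!!"
    return 2
}
```

With something like this, a permissions or PATH failure would show up in Nagios as UNKNOWN with the error text, rather than as a misleading CRITICAL.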
CLee1972
Posts: 20
Joined: Wed Mar 07, 2018 1:53 pm

Re: Different results for NRPE vs running locally

Post by CLee1972 »

Okay, so I have made some progress with this. I now have the "nagios" user in my sudoers group and I can run my script as the nagios user using "sudo /usr/local/nagios/libexec/check_jstack" with success. The issue I am having now is that if I run the "/usr/local/nagios/libexec/check_jstack" command after updating nrpe.cfg, either as "command[check_jstack]=/usr/bin/sudo /usr/local/nagios/libexec/check_jstack" or without the /usr/bin and just sudo, the command comes back with the failed entry "CRITICAL - 0 RTPReceiverHandles are running!! Media Server Component will need to be restarted!!!".

What I see now is that if I can make "check_nrpe" on the Nagios Core server utilize sudo, I may be able to get this to work. It is a long shot, and any additional advice or options/opinions would be greatly appreciated.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Different results for NRPE vs running locally

Post by scottwilkerson »

If your command is this:

Code: Select all

command[check_jstack]=/usr/bin/sudo /usr/local/nagios/libexec/check_jstack
then calling it with check_nrpe will use sudo.
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com
CLee1972
Posts: 20
Joined: Wed Mar 07, 2018 1:53 pm

Re: Different results for NRPE vs running locally

Post by CLee1972 »

The issue I am running into currently, Scott, is this. If I run:
"[root@ip-0-0-0-0 libexec]# /usr/local/nagios/libexec/check_jstack" on the local machine, I get the following result:
"OK - 4 RTPReceiverHandles are running!!"

If I go over to the Nagios Core machine and run:
"[root@ip-#-#-#-# plugins]# ./check_nrpe -H 0.0.0.0 -c check_jstack" (Server IP changed, of course), it spins for about 5-7 seconds and then comes back with the following result:
"CRITICAL - 0 RTPReceiverHandles are running!! Media Server Component will need to be restarted!!!"

I am not sure if I have to have JStack just write a log and then have NRPE read from the .log file it creates or what? (I have also tried this using check_log and it comes back with "OK-0 entries found results" which is really odd because I have opened the log it creates and found the 4 results I was expecting in the log.)
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Different results for NRPE vs running locally

Post by scottwilkerson »

In your script /usr/local/nagios/libexec/check_jstack, make sure you use full paths to all of the executables. When the nagios user runs them, it may not have the same PATH as you do when logged in.

Most specifically this line,

Code: Select all

a=`jstack -l $pid | grep "RTPReceiver" | wc -l`
replace with

Code: Select all

a=`/path/to/jstack -l $pid | grep "RTPReceiver" | wc -l`
If you don't know the path run

Code: Select all

which jstack
and that should give you the full path
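Since the NRPE daemon may start with a minimal PATH, another option is to resolve each executable once at the top of the plugin and exit with UNKNOWN if one is missing. A small sketch, where require_cmd is a hypothetical helper name and not part of NRPE or the Nagios plugins:

```shell
#!/bin/bash
# Hypothetical helper: print the location of a command (a full path for
# external programs), or report UNKNOWN (Nagios exit code 3) if it is
# not found on the current PATH.
require_cmd() {
    local path
    if ! path=$(command -v "$1"); then
        echo "UNKNOWN - $1 not found in PATH=$PATH"
        return 3
    fi
    printf '%s\n' "$path"
}

# Example use inside a plugin:
#   JSTACK=$(require_cmd jstack) || { echo "$JSTACK"; exit 3; }
#   a=$("$JSTACK" -l "$pid" | grep -c "RTPReceiver")
```

This keeps the script portable across hosts where jstack lives in different directories, while still making a PATH problem visible in the check output.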
CLee1972
Posts: 20
Joined: Wed Mar 07, 2018 1:53 pm

Re: Different results for NRPE vs running locally

Post by CLee1972 »

Yeah, I just ran which jstack, which let me know it was located in /bin/jstack. I love how complex it is being. When I put that into the script, restarted NRPE and ran it locally, it worked. Running from Nagios Core, it gave me a big :P . I am thinking this may be a permissions issue; I am just not sure where to start. I have made sure NRPE is running properly by running check_nrpe!check_users with proper results, so it is not communication?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Different results for NRPE vs running locally

Post by scottwilkerson »

You may require this in your sudoers file as well if you don't have it in there

Code: Select all

Defaults:nagios !requiretty
If that isn't it, all I can suggest would be to start putting debug code in your script to see why it isn't returning what you expect.
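As a starting point for that debug code, the plugin could append what it sees at run time (user, PATH, whether a TTY is attached) to a log file, so a run through check_nrpe can be compared with a run from the shell. This is only a sketch; the debug_log name and the log location are assumptions, not something NRPE provides:

```shell
#!/bin/bash
# Hypothetical debug helper. The log path is an assumption and must be
# writable by whichever user NRPE runs the plugin as.
DEBUG_LOG="${DEBUG_LOG:-/tmp/check_jstack.debug}"

debug_log() {
    {
        echo "--- $(date) ---"
        echo "uid=$(id -u) user=$(id -un)"
        echo "PATH=$PATH"
        # requiretty problems tend to show up only when no TTY is attached.
        if [ -t 0 ]; then echo "tty=yes"; else echo "tty=no"; fi
        echo "note: $*"
    } >>"$DEBUG_LOG" 2>&1
}

# Example: also capture the stderr of jstack, which the original
# pipeline discards.
#   debug_log "before jstack for pid $pid"
#   /path/to/jstack -l "$pid" >>"$DEBUG_LOG" 2>&1
```

Diffing the entries written by a local run against those written by a run through check_nrpe should narrow down whether the user, the PATH, or sudo's TTY requirement is the difference.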