NRPE: Unable to read output

Posted: Mon May 11, 2020 8:06 am
by raosri1992
Hello Team,

I'm trying to set up monitoring of a Kubernetes cluster by following https://github.com/colebrooke/kubernetes-nagios

I came across a weird issue. The Kubernetes cluster runs on a remote RHEL server. When I run the script directly on that server, it works.
Run directly on the remote server:
/usr/local/nagios/libexec/check_pods.sh -k -n -w 500 -C 800
OK - pods are all OK, found 2 in state.

The same command through the check_nrpe plugin, also on the remote server:
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_pod_cjoc
NRPE: Unable to read output

So I have added a command definition to nrpe.cfg and restarted the NRPE agent on the remote server.

When I invoke this command from the Nagios server, I get the "NRPE: Unable to read output" error.

From Nagios Server
/usr/local/nagios/libexec/check_nrpe -H -c check_pod_cjoc
NRPE: Unable to read output

I have tested with two versions of the NRPE agent, 3.2.1 and 4.0.3, and I get the same error message with both. I haven't tried other versions.

Note: The nagios user has admin (sudo) rights to run these scripts on the remote server.

Nagios v4.4.5 is running on a RHEL server.

Let me know if you need more information. Could you please take a look? @ericloyd @sawolf
######Stay Home#######Stay Safe#########
Thanks,
Srikanth

Re: NRPE: Unable to read output

Posted: Tue May 12, 2020 9:00 am
by raosri1992
I found this in NRPE log file

[1589291681] Host X.X.X.X is asking for command 'check_pod_cjoc' to be run...
[1589291681] Running command: /usr/local/nagios/libexec/check_pods.sh -k config -n cloudbees-core -w 500 -C 800
[1589291681] WARNING: my_system() seteuid(0): Operation not permitted
[1589291681] Command completed with return code 3 and output:
[1589291681] Return Code: 3, Output: NRPE: Unable to read output
[1589291681] Connection from X.X.X.X closed.

Let me know if you need more information

Re: NRPE: Unable to read output

Posted: Tue May 12, 2020 3:21 pm
by tgriep
When the NRPE agent runs a plugin and the plugin does not output any data, the agent has nothing to return to the Nagios server, so it returns this message:
"NRPE: Unable to read output"
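
The behaviour described above can be illustrated with a small shell sketch. This is a simplified stand-in and not NRPE's actual implementation: run_like_nrpe, good_plugin and broken_plugin are hypothetical names, and discarding stderr is an assumption made for the illustration. It shows how a plugin that only writes an error to stderr ends up producing the message, with return code 3 as seen in the agent log.

```shell
#!/bin/sh
# Simplified illustration only; this is NOT NRPE's real code.
# Runs a command, keeps stdout only, and substitutes the familiar message
# when stdout turns out to be empty.
run_like_nrpe() {
    # Assumption for this sketch: stderr is not captured at all.
    output=$("$@" 2>/dev/null)
    status=$?
    if [ -z "$output" ]; then
        echo "NRPE: Unable to read output"
        return 3    # UNKNOWN in Nagios plugin terms
    fi
    echo "$output"
    return "$status"
}

# Hypothetical plugins used only for the demonstration.
good_plugin() { echo "OK - pods are all OK"; }
broken_plugin() { echo "some error written to stderr" >&2; return 2; }

run_like_nrpe good_plugin
run_like_nrpe broken_plugin || echo "exit status: $?"
```

The second call prints the "Unable to read output" message even though the plugin did report an error, because that error never reached stdout.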

We need to figure out why the plugin is not returning any data, so let's try running it on the remote host as the nagios user.

Run this on the remote system to change to the nagios user.

su - nagios
Then run the plugin to see if it works or not.

/usr/local/nagios/libexec/check_pods.sh -k -n -w 500 -C 800
Post the output.

One more thing: if the plugin requires root to run, add sudo to the command in the nrpe.cfg file and add an entry to /etc/sudoers like the following example.

nagios ALL=NOPASSWD: /usr/local/nagios/libexec/check_pods.sh
The script uses an application called jq, so you may need to add that to the sudoers file as well.
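
When running the plugin as the nagios user, it can also help to run it with a stripped-down environment and with stderr merged into stdout, since an interactive login shell may provide variables that a daemon does not. The sketch below is only a diagnostic aid under that assumption; nothing in this thread confirms it as the cause. run_minimal_env is a hypothetical helper name and the KUBECONFIG value is a made-up path.

```shell
#!/bin/sh
# Run a command with an almost empty environment and merge stderr into stdout,
# so that error messages which would otherwise be lost become visible.
run_minimal_env() {
    env -i PATH=/usr/bin:/bin "$@" 2>&1
}

# Demonstration: a variable exported in the current shell is not visible inside.
KUBECONFIG=/tmp/example-kubeconfig   # hypothetical path, for illustration only
export KUBECONFIG
run_minimal_env sh -c 'echo "KUBECONFIG=${KUBECONFIG:-unset}"'

# Demonstration: text written to stderr now shows up in the captured output.
run_minimal_env sh -c 'echo "only on stderr" >&2'
```

On the remote host, the same helper could wrap the plugin itself, for example run_minimal_env /usr/local/nagios/libexec/check_pods.sh followed by the options from nrpe.cfg, while logged in as the nagios user.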

Re: NRPE: Unable to read output

Posted: Tue May 12, 2020 5:17 pm
by raosri1992
Hello,

Thank you for the quick response.

As I mentioned earlier, the command "/usr/local/nagios/libexec/check_pods.sh -k -n -w 500 -C 800" works on the remote server and gives me the correct output.

I have also added the nagios user to the sudoers file. The issue I'm seeing is from the Nagios server: when I invoke the check_pods.sh script from there, I get the error "NRPE: Unable to read output".

Output from Nagios Server:

/usr/local/nagios/libexec/check_nrpe -H <Remote-host> -c check_pod_cjoc
NRPE: Unable to read output

I'm not sure why the command definition is not working from the Nagios server. Let me know if you need more information.

If you're available, we can get on a call to find the root cause.

Thanks,
Srikanth

Re: NRPE: Unable to read output

Posted: Tue May 12, 2020 9:47 pm
by raosri1992
Hi,

You said: "When the NRPE agent runs a plugin and it does not output any data, the NRPE agent does not have anything to use to return to the nagios server so it returns this message." That leaves me with a doubt, because when I run this command directly on the remote server, I get the correct output.

From Remote server locally.
/usr/local/nagios/libexec/check_pods.sh -k -n -w 500 -C 800
OK - pods are all OK, found 2 in state.


Next, I added this command definition to the nrpe.cfg file and restarted the NRPE agent on the remote server: "command[check_pod_cjoc]=/usr/local/nagios/libexec/check_pods.sh -k config -n cloudbees-core -w 500 -C 800"

Then I defined a new service on the Nagios server and restarted Nagios. I'm getting the error "NRPE: Unable to read output".

From Nagios Server

/usr/local/nagios/libexec/check_nrpe -H X.X.x.X -c check_pod_cjoc
NRPE: Unable to read output



Let me know if the information above is not clear.

Thanks,
Srikanth

Re: NRPE: Unable to read output

Posted: Wed May 13, 2020 9:14 am
by tgriep
When you ran the command on the remote server, did you run it while logged in as the nagios user?

Next, the command-line options in the nrpe.cfg file are different from what you ran. Can you re-run the command exactly as it is defined in the nrpe.cfg file, as the nagios user?
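
To avoid retyping the options by hand, the exact command line can be read back from the config file. The following is a minimal sketch: nrpe_command_line is a hypothetical helper, and the sample check_load line is illustrative only. The demonstration uses a throwaway file rather than the real nrpe.cfg, whose location on your system is not stated in this thread.

```shell
#!/bin/sh
# Print the command line that an nrpe.cfg-style file defines for a command name.
# $1 = path to the config file, $2 = command name (for example check_pod_cjoc)
nrpe_command_line() {
    awk -v key="command[$2]=" '
        # Match lines that start with the literal text command[NAME]=
        index($0, key) == 1 { print substr($0, length(key) + 1); exit }
    ' "$1"
}

# Demonstration against a throwaway config file.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_pod_cjoc]=/usr/local/nagios/libexec/check_pods.sh -k config -n cloudbees-core -w 500 -C 800
EOF

nrpe_command_line "$cfg" check_pod_cjoc
rm -f "$cfg"
```

The printed line is what should be run, unchanged, as the nagios user on the remote host, so that the test matches what the agent executes.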

Re: NRPE: Unable to read output

Posted: Wed May 13, 2020 11:59 am
by raosri1992
I'm executing the scripts as the nagios user only.

The outputs I posted earlier (from the remote server and from the Nagios server) were both produced as the nagios user.


Re: NRPE: Unable to read output

Posted: Wed May 13, 2020 1:53 pm
by raosri1992
Hi,

I don't see any problems with the other commands I have defined in nrpe.cfg, such as check_load, check_mem and check_disk.

I only get this problem with the scripts from this repository:

https://github.com/colebrooke/kubernetes-nagios

Let me know if you need more information

Thanks,
Srikanth

Re: NRPE: Unable to read output

Posted: Thu May 14, 2020 1:57 pm
by raosri1992
Hi,

Until yesterday, I was hard-coding the parameters in the command definition in nrpe.cfg on the remote server, as shown below:

"command[check_pod_cjoc]=/usr/local/nagios/libexec/check_pods.sh -k /.kube/config -n cloudbees-core "

When I tried from the Nagios server, I got the error "NRPE: Unable to read output", as shown below:

[nagios@nagioserver nagios]$ /usr/local/nagios/libexec/check_nrpe -H Remote-Server -c check_pod_cjoc
NRPE: Unable to read output


Today I tried something different: passing the parameters as arguments in the nrpe.cfg file on the remote server, as shown below:

"command[check_pod]=/usr/local/nagios/libexec/check_test.sh -k $ARG1$ -n $ARG2$"
I also set dont_blame_nrpe to 1.

Then I restarted the NRPE service on the remote server.


When I try from the Nagios server, I now get a different error:

[nagios@nagios-host ~]$ /usr/local/nagios/libexec/check_nrpe -H remote-host -c check_pod -a /home/ec2-user/.kube/config cloudbees-core
NRPE: Command 'check_pod!/home/ec2-user/.kube/config!cloudbees-core' not defined
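
For reference, the intent of the $ARG1$ and $ARG2$ placeholders is that the values given after -a are substituted into the command definition in order. The sketch below only illustrates that intended substitution. It is not the agent's code and does not explain the "not defined" error above; expand_nrpe_args is a hypothetical helper name.

```shell
#!/bin/sh
# Illustration of the intended placeholder substitution; not NRPE's own code.
# $1 = command template containing $ARG1$, $ARG2$, ...; remaining args = values
expand_nrpe_args() {
    template=$1
    shift
    n=1
    for arg in "$@"; do
        # Replace every literal occurrence of $ARGn$ with the n-th value.
        template=$(printf '%s\n' "$template" |
            awk -v tok="\$ARG${n}\$" -v val="$arg" '{
                out = ""; rest = $0
                while ((i = index(rest, tok)) > 0) {
                    out = out substr(rest, 1, i - 1) val
                    rest = substr(rest, i + length(tok))
                }
                print out rest
            }')
        n=$((n + 1))
    done
    printf '%s\n' "$template"
}

expand_nrpe_args '/usr/local/nagios/libexec/check_test.sh -k $ARG1$ -n $ARG2$' \
    /home/ec2-user/.kube/config cloudbees-core
```

With the definition and arguments from this post, the expanded line is the check_test.sh call with -k /home/ec2-user/.kube/config -n cloudbees-core, which is the command one would expect the agent to run once argument passing works.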


Note: check_test.sh and check_pods.sh are the same script, copied from check_kube_pods.sh in this repository: https://github.com/colebrooke/kubernetes-nagios

I have tried every workaround to get that command working, but I don't know what is wrong. I couldn't figure out why Nagios is unable to capture the output whenever check_pods.sh runs on the remote server.

Here is the expected output of the script when I execute it locally on the remote server:

[nagios@remote-server ~]$ /usr/local/nagios/libexec/check_test.sh -k /h/.kube/config -n cloudbees-core -v
OK - pods are all OK, found 3 in ready state.
OK: Pod: ny-master-0 PodScheduled: True
OK: Pod: ny-master-0 ContainersReady: True
OK: Pod: ny-master-0 Ready: True
OK: Pod: ny-master-0 Initialized: True
OK: Pod: heist-master-2-0 PodScheduled: True
OK: Pod: heist-master-2-0 ContainersReady: True
OK: Pod: heist-master-2-0 Ready: True
OK: Pod: heist-master-2-0 Initialized: True
OK: Pod: cjoc-0 PodScheduled: True
OK: Pod: cjoc-0 ContainersReady: True
OK: Pod: cjoc-0 Ready: True
OK: Pod: cjoc-0 Initialized: True
[nagios@remote-server ~]$


Let me know if anyone needs more information.

Thanks,
Srikanth