Re: jstat is failing to show a proper result

Posted: Mon Feb 17, 2020 1:35 am
by runzelpunzel

Code: Select all

nrpe_user=nagios
nrpe_group=nagios

Permissions for user nagios are configured as follows in /etc/sudoers (temporarily, for debugging purposes):

Code: Select all

Defaults:nagios !requiretty
nagios ALL=(ALL) NOPASSWD: ALL

Re: jstat is failing to show a proper result

Posted: Mon Feb 17, 2020 8:33 pm
by Box293
Honestly, I'm at a loss as to what is going on. I suspect the nagios user is not getting the same environment when it runs through NRPE as when it runs at the command line. I would dump the environment when the command is executed via NRPE and compare it with the environment you get when you execute it from the command line.
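
One way to make that comparison less error-prone is to write both environments to files and diff them. The following is only a sketch: the helper names `dump_env` and `compare_env` and the file paths in the usage comments are hypothetical and not part of NRPE or Nagios.

```shell
#!/bin/bash
# Hypothetical helper: write the current environment, sorted, to the file given as $1.
dump_env() {
    env | sort > "$1"
}

# Hypothetical helper: print the lines that differ between two environment dumps.
# Returns diff's exit status (0 = identical, 1 = different).
compare_env() {
    diff "$1" "$2"
}

# Usage (paths are assumptions):
#   dump_env /tmp/env_nrpe.txt    # run once from a script started via NRPE
#   dump_env /tmp/env_shell.txt   # run once from an interactive shell
#   compare_env /tmp/env_nrpe.txt /tmp/env_shell.txt
```

Sorting first means that a pure ordering difference between the two dumps does not show up as a difference.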

Re: jstat is failing to show a proper result

Posted: Tue Feb 18, 2020 2:20 am
by runzelpunzel
Thanks for your reply and for looking into this again!

I did exactly that right at the start, as it was my first thought (see my second post):

Code: Select all

command[check_sudo_test2]=sudo env

Code: Select all

/usr/local/nagios/libexec/check_nrpe -H x.x.x.x -c check_sudo_test2
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MAIL=/var/mail/root
LOGNAME=root
USER=root
HOME=/root
SHELL=/bin/bash
TERM=unknown
SUDO_COMMAND=/usr/bin/env
SUDO_USER=nagios
SUDO_UID=115
SUDO_GID=122

But that all looks good to me, especially USER=root as well as PATH=(...):/usr/bin:(...).

Does anybody have any more ideas on what to check next?

Re: jstat is failing to show a proper result

Posted: Tue Feb 18, 2020 4:29 pm
by Box293
What is the output of these commands when executed directly on the server and not via NRPE?

Perhaps you could try using another agent like NCPA and see if it works that way.

Re: jstat is failing to show a proper result

Posted: Wed Feb 19, 2020 5:36 am
by runzelpunzel
Except for TERM, which shouldn't make any difference to the best of my knowledge, and LANG, which is only set in the interactive shell, the output is identical.

Code: Select all

nagios@MYMACHINE[~]> sudo env
LANG=de_DE.UTF-8
TERM=screen
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MAIL=/var/mail/root
LOGNAME=root
USER=root
HOME=/root
SHELL=/bin/bash
SUDO_COMMAND=/usr/bin/env
SUDO_USER=nagios
SUDO_UID=115
SUDO_GID=122

Box293 wrote: Perhaps you could try using another agent like NCPA and see if it works that way.

Unfortunately this is not an option, since we have a global rollout of NRPE. Our entire monitoring infrastructure relies on NRPE.

Re: jstat is failing to show a proper result

Posted: Wed Feb 19, 2020 5:03 pm
by mbellerue
Alright, I'm jumping in here, so forgive me if this has been done already. But I'm thinking it's time for some good old-fashioned echo debugging.

Throw an echo "test.sh start" and echo "test.sh end" before and after your sudo command in check_test.sh, respectively.

In your command definition in nrpe.cfg, throw in an echo "nrpe command begin" || and || echo "nrpe command end" so that your actual check_test.sh script is smushed between them. I believe that should be valid NRPE configuration.

Run your NRPE check command, and check the debug log. See what you get. If the debug log doesn't show anything, try having the echos in check_test.sh write to a file, same way as you're doing with the sudo jstat command.

Also just to confirm, you're starting off using the check_jstat.sh plugin, and then when you move to check_test.sh, check_test.sh is using the straight up jstat command, not the check_jstat.sh plugin. Is that intended?

Edit:
Ack. I initially did & when I meant ||.

Re: jstat is failing to show a proper result

Posted: Fri Feb 21, 2020 2:25 am
by runzelpunzel
Hi @mbellerue,

thanks for jumping in. Any thoughts are greatly appreciated!

check_test.sh now looks as follows:

Code: Select all

#!/bin/bash
echo "test.sh start"
sudo jstat -gc 1314 | tail -1 | sed -e 's/[ ][ ]*/ /g' > /tmp/check_test.log 2>&1
echo "test.sh stop"
exit 5

The corresponding command in nrpe.cfg now reads:

Code: Select all

command[check_test]=echo "nrpe command begin" || sudo /usr/lib/nagios/plugins/check_test.sh || echo "nrpe command end"

When I call check_nrpe with the command check_test, Nagios receives the following result:

Code: Select all

nagios@nagios[/usr/local/nagios/libexec]> ./check_nrpe -H x.x.x.x -c check_test
nrpe command begin

So it does not continue after the OR ("||"), right?

In the debug log on the NRPE host I see the following, which matches what Nagios receives:

Code: Select all

Feb 21 08:07:59 p083 nrpe[23870]: Connection from 1.2.3.4 port 63192
Feb 21 08:07:59 p083 nrpe[23870]: Host address is in allowed_hosts
Feb 21 08:07:59 p083 nrpe[23870]: Host 1.2.3.4 is asking for command 'check_test' to be run...
Feb 21 08:07:59 p083 nrpe[23870]: Running command: echo "nrpe command begin" || sudo /usr/lib/nagios/plugins/check_test.sh || echo "nrpe command end"
Feb 21 08:07:59 p083 nrpe[23870]: Command completed with return code 0 and output: nrpe command begin
Feb 21 08:07:59 p083 nrpe[23870]: Return Code: 0, Output: nrpe command begin
Feb 21 08:07:59 p083 nrpe[23870]: Connection from 1.2.3.4 closed.

/tmp/check_test.log is not being written to.
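
The output above is what shell short-circuiting would produce, assuming NRPE hands the command line to a shell (which the log suggests). `||` runs its right-hand side only when the left-hand side fails, and `echo` succeeds, so nothing after the first `||` is started. That would also explain why the script never gets a chance to write its log. A minimal sketch with plain `echo` stand-ins for the real commands:

```shell
#!/bin/bash
# "||" runs the right-hand side only if the left-hand side FAILS.
# The first echo succeeds, so the rest of the chain is skipped.
out_or=$(echo "nrpe command begin" || echo "script would run here" || echo "nrpe command end")

# ";" runs every command in turn, regardless of exit status.
out_seq=$(echo "nrpe command begin"; echo "script would run here"; echo "nrpe command end")

printf 'with "||":\n%s\n\n' "$out_or"
printf 'with ";":\n%s\n' "$out_seq"
```

Note that with `;` the exit status of the whole line is that of the last command, so a trailing `echo` would hide the script's own return code.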

I then changed check_test.sh so that the output of the echo commands is written to /tmp/check_test.log as well:

Code: Select all

#!/bin/bash
echo "test.sh start" >> /tmp/check_test.log 2>&1
sudo jstat -gc 1314 | tail -1 | sed -e 's/[ ][ ]*/ /g' >> /tmp/check_test.log 2>&1
echo "test.sh stop" >> /tmp/check_test.log 2>&1
exit 5

However, the result is the same: no /tmp/check_test.log is created.
What am I doing wrong here?
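
A side note on the script above: the trailing `2>&1` applies only to the last command of the pipeline (`sed`), so anything `sudo` or `jstat` prints to stderr does not end up in the log. The sketch below groups the pipeline so the redirect covers every stage. `fake_jstat` is a hypothetical stand-in for `sudo jstat -gc 1314`, and the temporary log file replaces /tmp/check_test.log.

```shell
#!/bin/bash
log=$(mktemp)

# Hypothetical stand-in for "sudo jstat -gc 1314":
# one line on stdout and one error message on stderr.
fake_jstat() {
    echo "S0C  S1C   S0U"
    echo "simulated error" >&2
}

# Redirect on the last stage only: stderr of fake_jstat bypasses the log.
# The outer group captures what leaked, purely to make it visible here.
leaked=$( { fake_jstat | tail -1 | sed -e 's/[ ][ ]*/ /g' >> "$log" 2>&1; } 2>&1 )

# Grouped pipeline: the redirect now covers stderr of every stage.
{ fake_jstat | tail -1 | sed -e 's/[ ][ ]*/ /g'; } >> "$log" 2>&1

echo "leaked past the log: $leaked"
echo "log contents:"
cat "$log"
```

With the grouped form, an error from the first stage of the pipeline would at least be visible in the log file.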

mbellerue wrote: Also just to confirm, you're starting off using the check_jstat.sh plugin, and then when you move to check_test.sh, check_test.sh is using the straight up jstat command, not the check_jstat.sh plugin. Is that intended?

Yes, this is intended (for testing purposes). When I initially debugged the issue, I narrowed the problem down to the call of the jstat binary, so I wanted to focus solely on getting that sorted.

Thanks again for looking into this. Looking forward to hearing from you soon.

/Dennis

Re: jstat is failing to show a proper result

Posted: Tue Feb 25, 2020 12:59 pm
by scottwilkerson
Are you sure you added /usr/lib/nagios/plugins/check_test.sh to your sudoers file for the nagios user?

Code: Select all

grep nagios /etc/sudoers

Re: jstat is failing to show a proper result

Posted: Wed Feb 26, 2020 12:48 am
by runzelpunzel
Right from the beginning I enabled "ALL" for nagios for testing purposes:

Code: Select all

grep nagios /etc/sudoers

Defaults:nagios !requiretty
nagios ALL=(ALL) NOPASSWD: ALL

Re: jstat is failing to show a proper result

Posted: Wed Feb 26, 2020 7:55 am
by scottwilkerson
This is so bizarre. What are the permissions on check_test.sh?

Code: Select all

ls -l /usr/lib/nagios/plugins/check_test.sh
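
The same check can be scripted. This is only a sketch: `check_plugin_file` is a hypothetical helper, and `-x` tests executability for the user who runs the check, so it would need to be run as the user that actually executes the plugin.

```shell
#!/bin/bash
# Hypothetical helper: report whether the file given as $1 exists and is
# executable by the current user.
# Returns 0 if OK, 1 if not executable, 2 if the file is missing.
check_plugin_file() {
    local f="$1"
    if [ ! -e "$f" ]; then
        echo "missing: $f"
        return 2
    fi
    if [ ! -x "$f" ]; then
        echo "not executable: $f"
        return 1
    fi
    echo "ok: $f"
}

# Usage:
#   check_plugin_file /usr/lib/nagios/plugins/check_test.sh
```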