
check_nrpe doesn't execute (pcs commands) on remote server

Posted: Mon Feb 15, 2021 11:46 am
by fsodah
Hello,

I have a script that runs several commands to check cluster behaviour, such as pcs status. The script is installed on the remote servers and works fine locally. But when I run it via check_nrpe from the Nagios server, the script runs without executing the "pcs status" command.

I think it's a permissions issue.

Can you please help?

Re: check_nrpe doesn't execute (pcs commands) on remote server

Posted: Mon Feb 15, 2021 3:15 pm
by tgriep
Try this: edit the nrpe.cfg file on the remote host and change the following from
debug=0
to

Code: Select all

debug=1
Save the change and restart the NRPE agent.

Check the /var/log/messages file for any errors logged when the plugin is run on the remote system and post them here.
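The debug change and the log check above can be scripted. This is only a minimal sketch: the function names are made up for illustration, and the config path and service name in the usage note are assumptions that may differ on your host.

```shell
# Flip debug=0 to debug=1 in an NRPE config file (path given as $1).
# The file is rewritten in place so its owner and mode stay unchanged.
enable_nrpe_debug() {
  cfg=$1
  tmp=$(mktemp) || return 1
  sed 's/^debug=0[[:space:]]*$/debug=1/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
  rm -f "$tmp"
}

# Show only the NRPE and SELinux troubleshooter lines from a syslog file ($1).
nrpe_log_lines() {
  grep -E 'nrpe\[[0-9]+\]|setroubleshoot' "$1"
}
```

As root this might be used as `enable_nrpe_debug /etc/nagios/nrpe.cfg`, then a restart of the NRPE service (for example `systemctl restart nrpe`, if that is the unit name on your system), then `nrpe_log_lines /var/log/messages` after running the check.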

Also, post the command that you defined in the nrpe.cfg file on the remote server and the script you are trying to run so we can see what it is doing.

One more thing: the NRPE agent runs the commands as the nagios user, so make sure that user can run the applications needed to gather the data and that it can find the commands on the path.
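A quick way to test the path part is to ask the shell which of the plugin's commands it cannot resolve. The sketch below is illustrative only: `missing_commands` is a made-up helper name, and the commands you pass should be whatever your plugin actually calls.

```shell
# Print each command name that cannot be found on the current PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}
```

Run it from a shell started as the account NRPE uses, for example `missing_commands pcs sudo`. Empty output means every name was found. Keep in mind that the PATH the NRPE daemon sees may be shorter than the one in an interactive shell, so using full paths inside the plugin is the safer choice.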

Re: check_nrpe doesn't execute (pcs commands) on remote server

Posted: Tue Feb 16, 2021 1:19 am
by fsodah
The client's /var/log/messages output when executing the NRPE script from the Nagios server:

Feb 16 08:09:21 dascsdbo00001b nrpe[14016]: CONN_CHECK_PEER: checking if host is allowed: 10.1.23.222 port 20100
Feb 16 08:09:21 dascsdbo00001b nrpe[14016]: is_an_allowed_host (AF_INET): is host >10.1.23.222< an allowed host >10.1.23.222<
Feb 16 08:09:21 dascsdbo00001b nrpe[14016]: is_an_allowed_host (AF_INET): is host >10.1.23.222< an allowed host >10.1.23.222<
Feb 16 08:09:21 dascsdbo00001b nrpe[14016]: is_an_allowed_host (AF_INET): host is in allowed host list!
Feb 16 08:09:21 dascsdbo00001b nrpe[14017]: WARNING: my_system() seteuid(0): Operation not permitted
Feb 16 08:09:21 dascsdbo00001b dbus[1097]: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
Feb 16 08:09:21 dascsdbo00001b dbus[1097]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Feb 16 08:09:21 dascsdbo00001b setroubleshoot: SELinux is preventing /usr/bin/python2.7 from execute access on the file /usr/sbin/corosync. For complete SELinux messages run: sealert -l 147ebee6-e792-44a6-b763-7ec6c5992f0a
Feb 16 08:09:21 dascsdbo00001b python: SELinux is preventing /usr/bin/python2.7 from execute access on the file /usr/sbin/corosync.#012#012***** Plugin catchall (100. confidence) suggests **************************#012#012If you believe that python2.7 should be allowed execute access on the corosync file by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'pcs' --raw | audit2allow -M my-pcs#012# semodule -i my-pcs.pp#012
Feb 16 08:09:22 dascsdbo00001b setroubleshoot: SELinux is preventing check_cluster.s from getattr access on the file /usr/bin/sudo. For complete SELinux messages run: sealert -l eec466b1-71b7-4d18-b8a6-545868e07d17
Feb 16 08:09:22 dascsdbo00001b python: SELinux is preventing check_cluster.s from getattr access on the file /usr/bin/sudo.#012#012***** Plugin catchall_boolean (89.3 confidence) suggests ******************#012#012If you want to allow nagios to run sudo#012Then you must tell SELinux about this by enabling the 'nagios_run_sudo' boolean.#012#012Do#012setsebool -P nagios_run_sudo 1#012#012***** Plugin catchall (11.6 confidence) suggests **************************#012#012If you believe that check_cluster.s should be allowed getattr access on the sudo file by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'check_cluster.s' --raw | audit2allow -M my-checkclusters#012# semodule -i my-checkclusters.pp#012
Feb 16 08:09:22 dascsdbo00001b setroubleshoot: SELinux is preventing check_cluster.s from getattr access on the file /usr/bin/sudo. For complete SELinux messages run: sealert -l eec466b1-71b7-4d18-b8a6-545868e07d17
Feb 16 08:09:22 dascsdbo00001b python: SELinux is preventing check_cluster.s from getattr access on the file /usr/bin/sudo.#012#012***** Plugin catchall_boolean (89.3 confidence) suggests ******************#012#012If you want to allow nagios to run sudo#012Then you must tell SELinux about this by enabling the 'nagios_run_sudo' boolean.#012#012Do#012setsebool -P nagios_run_sudo 1#012#012***** Plugin catchall (11.6 confidence) suggests **************************#012#012If you believe that check_cluster.s should be allowed getattr access on the sudo file by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'check_cluster.s' --raw | audit2allow -M my-checkclusters#012# semodule -i my-checkclusters.pp#012



The command defined in nrpe.cfg:
command[check_cluster_env]=/usr/lib64/nagios/plugins/check_cluster.sh $ARG1$

Re: check_nrpe doesn't execute (pcs commands) on remote server

Posted: Tue Feb 16, 2021 10:50 am
by tgriep
SELinux is enabled on the remote server and is blocking the plugin from running.
The /var/log/messages output explains that the plugin is blocked and lists the commands that allow it through SELinux.
Here are those commands.

Run them as root on the remote server.

Code: Select all

ausearch -c 'pcs' --raw | audit2allow -M my-pcs
semodule -i my-pcs.pp
ausearch -c 'check_cluster.s' --raw | audit2allow -M my-checkclusters
semodule -i my-checkclusters.pp
Then see if you can run the NRPE command from the Nagios server.

If it does not run, check the /var/log/messages file for any new log entries for selinux blocking the plugin.
Run the commands to bypass it until the plugin runs.
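The #012 sequences in those log lines are newlines that the syslog daemon escaped (octal 012), so the suggested commands have to be unescaped before they can be read or pasted. Here is a rough sketch of pulling them out of a log line; the function names are made up for illustration. Note that the log also suggests `setsebool -P nagios_run_sudo 1`, which setroubleshoot rated with the highest confidence.

```shell
# Turn escaped newlines (#012) in a syslog message back into real line breaks.
decode_syslog_newlines() {
  sed 's/#012/\
/g'
}

# Keep only the lines setroubleshoot proposes running:
# "# command" lines (prefix stripped) and setsebool calls.
suggested_commands() {
  decode_syslog_newlines | grep -E '^(# |setsebool )' | sed 's/^# //'
}
```

For example, `grep 'SELinux is preventing' /var/log/messages | suggested_commands | sort -u` should list each distinct suggestion once. Read them before running anything as root, since each one loosens the SELinux policy.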

Re: check_nrpe doesn't execute (pcs commands) on remote server

Posted: Tue Feb 16, 2021 12:28 pm
by fsodah
Thanks for your reply.
After applying all the SELinux commands, the NRPE script still doesn't work.

The output of /var/log/messages on the client side is:
Feb 16 19:20:56 dascsdbo00001b nrpe[14227]: CONN_CHECK_PEER: checking if host is allowed: 10.1.23.222 port 43245
Feb 16 19:20:56 dascsdbo00001b nrpe[14227]: is_an_allowed_host (AF_INET): is host >10.1.23.222< an allowed host >10.1.23.222<
Feb 16 19:20:56 dascsdbo00001b nrpe[14227]: is_an_allowed_host (AF_INET): is host >10.1.23.222< an allowed host >10.1.23.222<
Feb 16 19:20:56 dascsdbo00001b nrpe[14227]: is_an_allowed_host (AF_INET): host is in allowed host list!
Feb 16 19:20:56 dascsdbo00001b nrpe[14228]: WARNING: my_system() seteuid(0): Operation not permitted
Feb 16 19:20:56 dascsdbo00001b systemd: Started Session c164372 of user root.
Feb 16 19:20:56 dascsdbo00001b dbus[1097]: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
Feb 16 19:20:57 dascsdbo00001b dbus[1097]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Feb 16 19:20:57 dascsdbo00001b setroubleshoot: Exception during AVC analysis: must be encoded string without NULL bytes, not str
Feb 16 19:21:00 dascsdbo00001b setroubleshoot: Exception during AVC analysis: must be encoded string without NULL bytes, not st

Re: check_nrpe doesn't execute (pcs commands) on remote server

Posted: Tue Feb 16, 2021 1:14 pm
by tgriep
Post the full nrpe.cfg file and the /usr/lib64/nagios/plugins/check_cluster.sh script so we can look at it.

Run this as root on the remote system and post the output.

Code: Select all

ps -ef --cols=300 |grep nrpe
If the plugin requires root permissions to run, try doing this.

Edit the /etc/sudoers file and add the following entry

Code: Select all

nrpe ALL=NOPASSWD: /usr/lib64/nagios/plugins/check_cluster.sh
You may need to add a line for the pcs command in the sudoers file as well.
Here is an example. Make sure you update the path to match where the pcs command is installed.

Code: Select all

nrpe ALL=NOPASSWD: /usr/bin/pcs
Next, edit the nrpe.cfg file and add sudo to the command.

Code: Select all

command[check_cluster_env]=sudo /usr/lib64/nagios/plugins/check_cluster.sh $ARG1$
Save the change and restart the nrpe service and see if that helps.
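If you would rather not edit /etc/sudoers directly, the two entries above can be generated into a separate file first and checked before they are installed. This is just a sketch: `write_nrpe_sudoers` is a made-up helper, and the pcs path is passed in as an argument because it differs between systems (`command -v pcs` shows the real one).

```shell
# Write the two NOPASSWD entries for the nrpe user to a file.
# $1 = output file, $2 = full path to the plugin, $3 = full path to pcs
write_nrpe_sudoers() {
  out=$1
  plugin=$2
  pcs=$3
  {
    printf 'nrpe ALL=NOPASSWD: %s\n' "$plugin"
    printf 'nrpe ALL=NOPASSWD: %s\n' "$pcs"
  } > "$out"
}
```

A possible use is `write_nrpe_sudoers /tmp/nrpe-sudoers /usr/lib64/nagios/plugins/check_cluster.sh "$(command -v pcs)"`, followed by `visudo -cf /tmp/nrpe-sudoers` to check the syntax. If your sudo setup reads drop-in files from /etc/sudoers.d (an assumption about your configuration), the checked file can be copied there as root with mode 0440.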

Re: check_nrpe doesn't execute (pcs commands) on remote server

Posted: Tue Feb 16, 2021 1:59 pm
by fsodah
The output of (ps -ef --cols=300 |grep nrpe) on the client side:
nrpe 3799 1 0 20:09 ? 00:00:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -f
root 57798 57036 0 20:51 pts/1 00:00:00 grep --color=auto nrpe

I added nrpe to sudoers as you suggested, but now when I call the script from the Nagios server using NRPE:

[root@nagios libexec]# ./check_nrpe -H XX.XX.XX.XX -c check_cluster_env -a 'Stonith'
NRPE: Unable to read output



Attached are the nrpe.cfg file and the check_cluster.sh script.

Re: check_nrpe doesn't execute (pcs commands) on remote server

Posted: Tue Feb 16, 2021 4:14 pm
by tgriep
Log in to the remote server and change to the nrpe user by running the following.

Code: Select all

su - nrpe
Then run this to see what the output of the plugin is when it runs.

Code: Select all

bash -x /usr/lib64/nagios/plugins/check_cluster.sh Stonith
echo $?
/usr/sbin/pcs stonith show
Post all of the output.

Re: check_nrpe doesn't execute (pcs commands) on remote server

Posted: Wed Feb 17, 2021 1:43 am
by fsodah
Please note that nrpe is a no-login user:
nrpe:x:996:992:NRPE user for the NRPE service:/var/run/nrpe:/sbin/nologin

so we cannot switch to that user with su.

- So I did the following:
sudo -s -u nrpe
- Now I am the nrpe user.

- The output of (bash -x /usr/lib64/nagios/plugins/check_cluster.sh Stonith) is:
bash-4.2$ bash -x /usr/lib64/nagios/plugins/check_cluster.sh Stonith
+ CRM=/usr/sbin/pcs
+ CRMV=/usr/sbin/crm_verify
+ STATE_OK=0
+ STATE_WARNING=1
+ STATE_CRITICAL=2
+ STATE_UNKNOWN=3
+ '[' 1 -lt 1 ']'
++ /usr/sbin/pcs config show
++ grep -A1 'Corosync Nodes:'
++ tail -n1
Error: error running crm_mon, is pacemaker running?
+ nodelist=
+ case $1 in
+ checkstonith
++ sudo /usr/sbin/pcs stonith show
++ grep -i Started
++ wc -l

We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

#1) Respect the privacy of others.
#2) Think before you type.
#3) With great power comes great responsibility.



- The output of (echo $?) is:
1

- The output of (/usr/sbin/pcs stonith show) is:

bash-4.2$ /usr/sbin/pcs stonith show
Error: unable to get cluster status from crm_mon

Error: cluster is not available on this node

Re: check_nrpe doesn't execute (pcs commands) on remote server

Posted: Wed Feb 17, 2021 10:09 am
by tgriep
Since the plugin is using a shell, the nrpe user has to be able to log in.

Change this from

Code: Select all

nrpe:x:996:992:NRPE user for the NRPE service:/var/run/nrpe:/sbin/nologin
to

Code: Select all

nrpe:x:996:992:NRPE user for the NRPE service:/var/run/nrpe:/bin/bash
Next, you need to figure out why you cannot get the cluster status when running this.
/usr/sbin/pcs stonith show
Error: unable to get cluster status from crm_mon
Error: cluster is not available on this node
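
Two quick checks may help narrow this down. The sketch below uses made-up helper names. It also assumes that non-root access to the cluster tools on this host depends on membership in the haclient group, which is common on Pacemaker systems but should be confirmed for your setup.

```shell
# Print the login shell (7th field) of a passwd-format line read on stdin.
login_shell() {
  awk -F: '{ print $7 }'
}

# Succeed if the space-separated group list in $1 contains the group $2.
in_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `getent passwd nrpe | login_shell` shows the shell currently set for the account, and `in_group "$(id -nG nrpe)" haclient && echo yes` reports whether nrpe is in that group. If the shell does need changing, `usermod -s /bin/bash nrpe` does it without editing /etc/passwd by hand.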