Communication error between Oracle and Nagios

pdelgado1989
Posts: 6
Joined: Thu Nov 11, 2021 2:52 pm

Communication error between Oracle and Nagios

Post by pdelgado1989 »

I'm trying to adapt some Perl scripts to Bash to run them in Nagios XI.

The complete script, so far, is this one:

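#!/bin/bash
# Load the Oracle RAC environment, then set the Grid home and base explicitly.
. /home/oracle/.profile_RAC
ORACLE_HOME=/oracle/app/grid/19300
ORACLE_BASE=/oracle/app/base
# Nagios exit codes keyed by status name (associative array, needs Bash 4+).
declare -A nagios_exit_codes=([UNKNOWN]=3 [OK]=0 [WARNING]=1 [CRITICAL]=2)
status='OK'
ok=1
action=$1

case $action in
        "votedisk")
                #command=`/oracle/app/grid/19300/bin/crsctl query css votedisk | grep asm`
                #command=$(/oracle/app/grid/19300/bin/crsctl query css votedisk)
                command=`/oracle/app/grid/19300/bin/crsctl query css votedisk`

                # Test $command, the variable assigned just above.
                case $command in
                        *"failed"*|*"OFFLINE"*|*"PROC"*)
                                status='CRITICAL'
                                output_msg="Voting disk status check failed!"
                        ;;

                        * )
                                output_msg="Voting disks status check succeeded"
                        ;;
                esac

                output="[$status] $output_msg - $command"

        ;;

        "clusterstatus")
                comando=`/oracle/app/grid/19300/bin/crsctl query crs releaseversion`
                output_msg="All clusterware services are up (clusterware version: $comando)"
                output="$output_msg"

        ;;
esac

# $output is left unquoted: word splitting collapses the multi-line crsctl
# output into a single line.
echo -e $output
# Exit with the code that matches the status instead of always returning 0.
exit "${nagios_exit_codes[$status]}"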
Running this script locally gives this result:

[root@bbddmachine plugins]# sh ./script_prueba.sh votedisk
[OK] Voting disks status check succeeded - ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 8dfc2a9528244f95bf87bb394e793995 (/dev/mapper/asm_ocr1) [OCR] Located 1 voting disk(s).
But from the Nagios machine, through check_nrpe, the result is wrong:

[nagios@ng1esp libexec]$ ./check_nrpe -2 -H 172.47.62.12 -t 60 -c check_crs_votedisk
[OK] Voting disks status check succeeded - Unable to communicate with the Cluster Synchronization Services daemon.
However, if I run the script's other option, clusterstatus, everything works fine:

[nagios@ng1esp libexec]$ ./check_nrpe -2 -H 172.47.62.12 -t 60 -c check_crs_clusterstatus
All clusterware services are up (clusterware version: Oracle High Availability Services release version on the local node is [19.0.0.0.0])
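
A minimal sketch of how the votedisk output could be mapped to a Nagios status and exit code, kept as plain functions so the logic can be tried without a cluster. The names classify_votedisk and exit_code_for are hypothetical, and treating unrecognized output as UNKNOWN is an assumption, not something taken from the original Perl plugin.

```shell
#!/bin/sh
# Sketch only: classify_votedisk and exit_code_for are hypothetical helpers,
# not part of crsctl, NRPE, or the original Perl plugin.

# Map the text printed by "crsctl query css votedisk" to a Nagios status name.
classify_votedisk() {
    case $1 in
        *failed*|*OFFLINE*|*PROC*) echo CRITICAL ;;
        *ONLINE*)                  echo OK ;;
        # Assumption: anything unrecognized (for example a CSS communication
        # error) means the check could not be performed.
        *)                         echo UNKNOWN ;;
    esac
}

# Translate a status name into the exit code Nagios expects.
exit_code_for() {
    case $1 in
        OK)       echo 0 ;;
        WARNING)  echo 1 ;;
        CRITICAL) echo 2 ;;
        *)        echo 3 ;;
    esac
}

# Example use inside the plugin (needs crsctl, so it is left commented out):
#   result=$(/oracle/app/grid/19300/bin/crsctl query css votedisk 2>&1)
#   status=$(classify_votedisk "$result")
#   echo "[$status] $result"
#   exit "$(exit_code_for "$status")"
```

With this mapping, the "Unable to communicate with the Cluster Synchronization Services daemon" message would be reported as UNKNOWN rather than OK.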
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Communication error between Oracle and Nagios

Post by ssax »

The proper way to test on bbddmachine is to do this:

su - nagios
# cd into the plugins directory
sh ./script_prueba.sh votedisk
If that doesn't work, add a -x to it and send the output:

sh -x ./script_prueba.sh votedisk
The assumption is that the nagios user lacks permissions for something; you may need to run the script through sudo if you're unable to adjust the permissions.
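
One thing that differs between a `su -` test and a check started by the NRPE daemon is the environment: `su -` gives a login shell, while the daemon runs the command without one. A rough way to approximate that locally is to run the plugin with an emptied environment. This is only a sketch; run_clean is a hypothetical helper, and whether the environment matters in this case is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: run a command with an emptied environment, keeping only
# a basic PATH, to approximate a process not started from a login shell.
run_clean() {
    env -i PATH=/usr/bin:/bin "$@"
}

# Example (needs the nagios account and sudo, so it is left commented out):
#   sudo -u nagios env -i PATH=/usr/bin:/bin sh -x ./script_prueba.sh votedisk
```

If the script behaves differently this way than under `su - nagios`, the difference is likely in what the login profile sets up.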
pdelgado1989
Posts: 6
Joined: Thu Nov 11, 2021 2:52 pm

Re: Communication error between Oracle and Nagios

Post by pdelgado1989 »

Hi @ssax.

Following your instructions, I ran the script on bbddmachine as the nrpe user (our equivalent of the nagios user). The output below shows a successful execution:

[root@bbddmachine plugins]# sudo su - nrpe
[nrpe@bbddmachine plugins]$ sh ./script_prueba.sh votedisk
[OK] Voting disks status check succeeded - ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 8dfc2a9528244f95bf87bb394e793995 (/dev/mapper/asm_ocr1) [OCR] Located 1 voting disk(s).
But running the check from the Nagios machine still does not work:

[nagios@ng1esp libexec]$ ./check_nrpe -2 -H 172.27.68.132 -t 60 -c check_crs_votedisk
[OK] Voting disks status check succeeded - Unable to communicate with the Cluster Synchronization Services daemon.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Communication error between Oracle and Nagios

Post by benjaminsmith »

Hi,

Try testing this once again as the nagios user account (instead of nrpe) and let us know if you get different results. Thanks.

su - nagios
# cd into the plugins directory
sh ./script_prueba.sh votedisk
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
pdelgado1989
Posts: 6
Joined: Thu Nov 11, 2021 2:52 pm

Re: Communication error between Oracle and Nagios

Post by pdelgado1989 »

Hi.

The user that we have defined in /etc/nagios/nrpe.cfg to execute the scripts is nrpe:

# NRPE USER
# This determines the effective user that the NRPE daemon should run as.
# You can either supply a username or a UID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_user=nrpe

# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.
# You can either supply a group name or a GID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_group=nrpe
That is why we ran the script locally as the nrpe user.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Communication error between Oracle and Nagios

Post by benjaminsmith »

Hi,

Do you recall if you installed this from source or using the Linux Agent installer?

Also, please post the output of the following command.

cat /etc/xinetd.d/nrpe
Thanks,
Benjamin
pdelgado1989
Posts: 6
Joined: Thu Nov 11, 2021 2:52 pm

Re: Communication error between Oracle and Nagios

Post by pdelgado1989 »

Hi.

It was installed from RPM packages. The packages installed are:

nagios-common-4.4.5-1.el8.x86_64.rpm
nrpe-4.0.3-1.el8.x86_64.rpm
nagios-plugins-2.3.3-4.el8.x86_64.rpm
nagios-plugins-disk-2.3.3-4.el8.x86_64.rpm
nagios-plugins-load-2.3.3-4.el8.x86_64.rpm
nagios-plugins-procs-2.3.3-4.el8.x86_64.rpm
nagios-plugins-swap-2.3.3-4.el8.x86_64.rpm
nagios-plugins-users-2.3.3-4.el8.x86_64.rpm

OS version is Red Hat Enterprise Linux release 8.0 (Ootpa)

As for the command

cat /etc/xinetd.d/nrpe

there is nothing to show, because we do not have the xinetd package installed.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Communication error between Oracle and Nagios

Post by benjaminsmith »

Hi,

The yum packages are not maintained by Nagios, so they set the agent up a little differently than our installer script (see: https://assets.nagios.com/downloads/nag ... _Agent.pdf).

Let's check the permissions on the plugins. I believe they are in the following directory, but you may have to adapt the command below to your system.

 ls -l /usr/lib64/nagios/plugins
My system looks like this:

[root@localhost plugins]# ls -l /usr/lib64/nagios/plugins
total 256
-rwxrwxr-x. 1 root root 110320 Apr  2  2021 check_http
-rwxrwxr-x. 1 root root  55328 Apr  2  2021 check_load
drwxr-xr-x. 2 root root      6 Mar  7  2021 eventhandlers
-rwxr-xr-x. 1 root root  42760 Apr  2  2021 negate
-rwxr-xr-x. 1 root root  42528 Apr  2  2021 urlize
-rwxr-xr-x. 1 root root   2791 Apr  2  2021 utils.sh
Also, can you PM me the nrpe.cfg file from that system? I'd like to check the command definitions as well.
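
Since the packaged agent runs as a systemd service rather than under xinetd, it may also be worth looking at the unit's sandboxing settings. This is only an assumption, not something confirmed for this system: if the service has PrivateTmp enabled, the plugin gets its own private /tmp and /var/tmp and would not see files that other processes keep there. A small sketch; has_private_tmp is a hypothetical helper and the unit name nrpe.service is assumed.

```shell
#!/bin/sh
# Hypothetical helper: reads "systemctl show" style KEY=VALUE lines on stdin
# and succeeds if PrivateTmp is enabled.
has_private_tmp() {
    grep -qx 'PrivateTmp=yes'
}

# Example on the monitored host (unit name nrpe.service is an assumption):
#   if systemctl show nrpe.service -p PrivateTmp | has_private_tmp; then
#       echo "NRPE runs with a private /tmp and /var/tmp"
#   fi
```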

Thanks,
Benjamin
pdelgado1989
Posts: 6
Joined: Thu Nov 11, 2021 2:52 pm

Re: Communication error between Oracle and Nagios

Post by pdelgado1989 »

Hi.

My system looks like this:

[root@bdx1edara plugins]# ls -la
total 444
drwxr-xr-x. 3 root root   267 nov 24 12:57 .
drwxr-xr-x. 3 root root    21 sep  2  2020 ..
-rw-r--r--  1 root root  8532 nov  9 17:04 1
-rwxrwxr-x  1 root root  9842 nov 12 13:20 check_crs
-rwxrwxr-x  1 root root  8730 nov 12 12:04 check_crs_bkp
-rwxrwxr-x. 1 root root 94104 jun 30  2020 check_disk
-rwxrwxr-x. 1 root root 55312 jun 30  2020 check_load
-rwxrwxr-x. 1 root root  3418 sep  2  2020 check_mem
-rwxrwxr-x. 1 root root 64112 jun 30  2020 check_procs
-rwxrwxr-x. 1 root root 47056 jun 30  2020 check_swap
-rwxrwxr-x. 1 root root 42840 jun 30  2020 check_users
drwxr-xr-x. 2 root root     6 ago 29  2019 eventhandlers
-rw-r--r--  1 root root   184 nov 16 13:09 fich.tmp
-rwxr-xr-x. 1 root root 42736 jun 30  2020 negate
-rwxrwxrwx  1 root root  2741 nov 17 14:39 script_prueba.sh
-rwxr-xr-x. 1 root root 42520 jun 30  2020 urlize
-rwxr-xr-x. 1 root root  2791 jun 30  2020 utils.sh
This is how I have the commands defined in the nrpe.cfg file:

command[check_crs_votedisk]=/usr/lib64/nagios/plugins/script_prueba.sh votedisk
command[check_crs_clusterstatus]=/usr/lib64/nagios/plugins/script_prueba.sh clusterstatus
command[check_crs_dbservicelocation]=/usr/lib64/nagios/plugins/script_prueba.sh dbservicelocation
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Communication error between Oracle and Nagios

Post by benjaminsmith »

Hi,

Thanks for checking that; it looks good. Can you share the script? I'd like to understand how it checks the Cluster Synchronization Services daemon and whether it might be taking too long when run from the Nagios server.
[OK] Voting disks status check succeeded - Unable to communicate with the Cluster Synchronization Services daemon
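
To see whether run time is a factor, the plugin can be capped with a timeout when testing locally. A minimal sketch, assuming GNU coreutils timeout is available; run_with_timeout is a hypothetical wrapper, and mapping a timeout to UNKNOWN (3) is a design choice rather than anything NRPE does itself.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with a time limit (in seconds) and turn
# a timeout into Nagios UNKNOWN (3). Other exit codes are passed through.
run_with_timeout() {
    limit=$1
    shift
    timeout "$limit" "$@"
    rc=$?
    # GNU timeout exits with 124 when the time limit was reached.
    if [ "$rc" -eq 124 ]; then
        echo "[UNKNOWN] command did not finish within ${limit}s"
        return 3
    fi
    return "$rc"
}

# Example (60 matches the -t 60 passed to check_nrpe earlier in the thread):
#   run_with_timeout 60 sh ./script_prueba.sh votedisk
```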