check_nrpe is showing incorrect output

Posted: Thu Aug 05, 2021 7:48 am
by Amit_Alone
Hi,

I wrote a bash script to check Oracle connectivity. The script runs as expected on the Linux Oracle server, but after configuring it in Nagios XI the output is different.

Below is the entry in common.cfg:

Code: Select all

command[DBconnection_status]=/usr/local/nagios/libexec/check_dbconnection.sh
I tried using sudo as well, but it still didn't work.
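
To rule out a typo or a stale definition, it can help to print the exact command line NRPE has on file for DBconnection_status. A minimal sketch; the helper name nrpe_command_line and the config path in the usage comment are assumptions, not taken from the post:

```shell
# Hypothetical helper: print the command line defined for an NRPE command name.
# $1 = command name, $2 = NRPE config file to search
nrpe_command_line() {
    # Strip the "command[name]=" prefix and print whatever follows it
    sed -n "s/^[[:space:]]*command\[$1\][[:space:]]*=[[:space:]]*//p" "$2"
}

# Usage (path is an assumption; adjust to wherever common.cfg lives):
# nrpe_command_line DBconnection_status /usr/local/nagios/etc/nrpe/common.cfg
```

If this prints nothing, the daemon is probably reading a different file than the one that was edited.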

Below is the output from the Linux Oracle server:

Code: Select all

[root@agasporap1855 nrpe]# /usr/local/nagios/libexec/check_dbconnection.sh
OK - Oracle DB connected Successfully
But on the Nagios XI server I see the opposite output:

Code: Select all

[[email protected] ~]$ /usr/local/nagios/libexec/check_nrpe -H XX.XX.XX.XX -t 30 -c DBconnection_status
CRITICAL - Oracle DB Connection Failed
NRPE version on the Oracle server:

Code: Select all

[root@agasporap1855 nrpe]# /usr/local/nagios/libexec/check_nrpe -V
NRPE Plugin for Nagios
Version: 3.2.1
I went through forum threads where others observed the same issue, but none of them solved the problem.

Re: check_nrpe is showing incorrect output

Posted: Thu Aug 05, 2021 10:31 am
by gsmith
Hi

Can you connect to the Oracle DB from any other remote computers?

Is the Oracle server configured (via firewall, etc) to limit connections from specific hosts?

Thanks

Re: check_nrpe is showing incorrect output

Posted: Thu Aug 05, 2021 11:00 am
by Amit_Alone
Can you connect to the Oracle DB from any other remote computers?
I tried the command below from other Nagios XI servers, and it still shows the incorrect output:

Code: Select all

[root@AVGDLNXVP106 ~]# /usr/local/nagios/libexec/check_nrpe -H XX.XX.XX.XX -c DBconnection_status
CRITICAL - Oracle DB Connection Failed
Below is the string used to connect to the DB:

Code: Select all

sqlplus  "USERNAME/PASSWORD@(DESCRIPTION_LIST=(LOAD_BALANCE=off)(FAILOVER=on)(DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS_LIST=(LOAD_BALANCE=on)(ADDRESS=(PROTOCOL=TCP)(HOST=XXXXXX-prod-scan.agasp.ad)(PORT=XXXX)))(CONNECT_DATA=(SERVICE_NAME=XXXXX)))(DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS_LIST=(LOAD_BALANCE=on)(ADDRESS=(PROTOCOL=TCP)(HOST=XXXX-dr-scan.agasp.ad)(PORT=XXX)))(CONNECT_DATA=(SERVICE_NAME=XXXXX))))"
Is the Oracle server configured (via firewall, etc) to limit connections from specific hosts?
I'm not sure about any limitation. However, I can see that the baseline services are working as expected. Attaching a screenshot.
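
The check script itself was not posted. As a rough illustration only, a connectivity check built around a connect string like the one above might look like the sketch below. The function name, the SQLPLUS_BIN variable, and the test query are hypothetical; the status messages mirror the ones shown earlier in the thread. It assumes sqlplus returns a non-zero exit code when the logon fails under -L.

```shell
#!/bin/bash
# Hypothetical sketch of a connectivity check; NOT the poster's actual script.
# SQLPLUS_BIN lets the sqlplus path be overridden (an illustrative name).

check_db_connection() {
    local sqlplus_bin="${SQLPLUS_BIN:-sqlplus}"
    local connect="$1"
    local out rc

    # -L: attempt the logon only once; -S: silent mode (no banner or prompts)
    out=$(printf 'WHENEVER SQLERROR EXIT FAILURE\nSELECT 1 FROM dual;\nEXIT;\n' \
          | "$sqlplus_bin" -L -S "$connect" 2>&1)
    rc=$?

    if [ "$rc" -eq 0 ]; then
        echo "OK - Oracle DB connected Successfully"
        return 0    # Nagios exit code for OK
    fi

    # Append the first line of output so the failure reason reaches Nagios
    echo "CRITICAL - Oracle DB Connection Failed: $(printf '%s\n' "$out" | head -n 1)"
    return 2        # Nagios exit code for CRITICAL
}

# Usage (connect string elided on purpose):
# check_db_connection "USERNAME/PASSWORD@(DESCRIPTION_LIST=...)"; exit $?
```

Carrying the first line of sqlplus output into the CRITICAL message means a run through check_nrpe shows why the connection failed (for example a missing binary or an ORA- error) instead of only that it failed.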

Re: check_nrpe is showing incorrect output

Posted: Thu Aug 05, 2021 11:06 am
by gsmith
Hi

Since your command works on the Oracle server but not from the remote servers, I suspect
a network or Oracle configuration issue is causing the command to fail remotely. Talk to
your friendly DBA, tell him you are trying to make a connection to the DB from the
Nagios server, and ask whether there is anything he (the DBA) has to do to allow the connection.

Thanks

Re: check_nrpe is showing incorrect output

Posted: Thu Aug 05, 2021 12:09 pm
by Amit_Alone
Sorry for the confusion, let me explain again. We have a Linux server on which the Oracle DB is installed. On that same server, the command for connecting to the Oracle DB using sqlplus works as expected.

However, when the same command is executed through the check_nrpe plugin from the remote server, it doesn't work. As I understand it, this is how it works: "the check_nrpe plugin contacts the NRPE daemon on the remote host, and the NRPE daemon then runs the appropriate Nagios plugin to check the service or resource".
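
Because the NRPE daemon runs the plugin itself, the script does not inherit the login environment of the root shell where it was tested. A frequent cause of this kind of mismatch, though not confirmed in this thread, is that variables such as ORACLE_HOME, LD_LIBRARY_PATH, or PATH are set in a shell profile but not for the daemon. A minimal sketch for reproducing that situation by hand; the helper name run_clean_env and the nagios user in the usage comment are assumptions:

```shell
# Run a command with an almost empty environment, roughly what a
# daemon-spawned plugin sees (only a basic PATH is kept).
run_clean_env() {
    env -i PATH=/usr/bin:/bin "$@"
}

# Usage (as root on the Oracle server; 'nagios' as the NRPE user is an assumption):
# sudo -u nagios env -i PATH=/usr/bin:/bin /usr/local/nagios/libexec/check_dbconnection.sh
# echo "exit code: $?"
```

If the script reports CRITICAL when run this way, exporting the required Oracle variables at the top of the script is one way to make it independent of the caller's environment.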

Also, I tried running check_nrpe on the remote host itself, and it gives this error:

Code: Select all

[root@agasporap1855 libexec]# /usr/local/nagios/libexec/check_nrpe -H localhost
CHECK_NRPE: Error - Could not connect to ::: Connection reset by peer
I tried the same on another test Linux remote server, and it gives the expected output:

Code: Select all

[root@AGASPLNXVT1713 ~]# /usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v3.2.1
So, is there an issue with NRPE on the remote server?
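
The two results above can be told apart mechanically. The sketch below maps check_nrpe output to a likely next step; the helper name nrpe_hint is hypothetical and the hints are heuristics, not a definitive diagnosis. The error text shows `::` where an address is expected, which may mean the connection went over IPv6, so the usage comment points check_nrpe at 127.0.0.1 to take that variable out.

```shell
# Hypothetical helper: suggest a next step based on check_nrpe output.
nrpe_hint() {
    case "$1" in
        "NRPE v"*)
            echo "daemon reachable; version handshake succeeded" ;;
        *"Connection reset by peer"*)
            echo "connection dropped by the remote side; check only_from/allowed_hosts" ;;
        *"Connection refused"*)
            echo "nothing listening on the port; check that nrpe or xinetd is running" ;;
        *)
            echo "unrecognised output; check the NRPE log on the remote host" ;;
    esac
}

# Usage (on the Oracle server, using the IPv4 loopback instead of 'localhost'):
# nrpe_hint "$(/usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 2>&1)"
```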

Re: check_nrpe is showing incorrect output

Posted: Thu Aug 05, 2021 2:26 pm
by gsmith
Hi,

Ah, OK, got it. On the remote machine, can you make sure port 5666/tcp is open, as this
is what NRPE uses? Here are two ways to check; run them on the XI server:

Code: Select all

nmap -p 5666 <ip of remote server>  
Many times nmap is not installed, so you can install it or try this:

Code: Select all

curl -v telnet://<ip of remote server>:5666
Let me know what you find out please.
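
If neither nmap nor curl is available, bash's built-in /dev/tcp redirection can do a similar reachability test. A minimal sketch, assuming bash was built with network redirections enabled and that the coreutils timeout command is present; the function name port_open is hypothetical:

```shell
# Succeeds (exit 0) if a TCP connection to host $1 on port $2 can be opened
# within 3 seconds; fails otherwise.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage (run on the XI server):
# port_open "<ip of remote server>" 5666 && echo "5666/tcp open" || echo "5666/tcp not reachable"
```

Like the nmap and curl checks, this only shows that something accepts connections on the port; it does not show whether NRPE will answer the request.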

Thanks

Re: check_nrpe is showing incorrect output

Posted: Fri Aug 06, 2021 5:15 am
by Amit_Alone
Hi,

Below is the output of the commands you shared. I ran them on the Nagios XI server.

Code: Select all

bash-4.2$ nmap -p 5666 XX.XX.XX.55

Starting Nmap 6.47 ( http://nmap.org ) at 2021-08-06 10:08 UTC
Nmap scan report for XX.XX.XX.55
Host is up (0.15s latency).
PORT     STATE SERVICE
5666/tcp open  nrpe

Nmap done: 1 IP address (1 host up) scanned in 0.29 seconds

Code: Select all

bash-4.2$ curl -v telnet://XX.XX.XX.55:5666
* About to connect() to XX.XX.XX.55 port 5666 (#0)
*   Trying XX.XX.XX.55...
* Connected to XX.XX.XX.55 (XX.XX.XX.55) port 5666 (#0)
^C
bash-4.2$
Thanks.

Re: check_nrpe is showing incorrect output

Posted: Fri Aug 06, 2021 10:48 am
by gsmith
Hey,

That looks good!

Can you run this on the remote machine (the one that has Oracle on it):

Code: Select all

cat /etc/xinetd.d/nrpe
And send the output to me. You can either paste the output in your reply, or if
security is a concern you can PM the output to me. If you decide to send it via
a PM, then please reply to this post once you have sent the PM.

Thanks

Re: check_nrpe is showing incorrect output

Posted: Fri Aug 06, 2021 11:02 am
by Amit_Alone
Hey Smith,

I have sent the requested details by PM.

Thanks.

Re: check_nrpe is showing incorrect output

Posted: Fri Aug 06, 2021 11:50 am
by gsmith
Hi

Can you edit /etc/xinetd.d/nrpe and in this line:

Code: Select all

only_from       = 127.0.0.1 XXX.XXX.XXX.0/26
replace the XXX.XXX.XXX.0/26 with the IP of the Nagios server.

Then run:

Code: Select all

systemctl restart xinetd.service
and test please.
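
Before restarting, the edit can be double-checked by reading the only_from line back out of the file. A minimal sketch; both function names are hypothetical, 192.0.2.10 in the usage comment is a placeholder for the Nagios server IP, and the second helper compares literal entries only (it does not evaluate CIDR ranges such as a /26):

```shell
# Print the addresses listed on the only_from line of an xinetd service file.
xinetd_only_from() {
    awk -F= '/^[ \t]*only_from[ \t]*=/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim surrounding whitespace
        print $2
    }' "$1"
}

# Succeed if address $2 appears literally in only_from of file $1.
xinetd_allows_exact() {
    for addr in $(xinetd_only_from "$1"); do
        [ "$addr" = "$2" ] && return 0
    done
    return 1
}

# Usage (192.0.2.10 stands in for the Nagios server IP):
# xinetd_only_from /etc/xinetd.d/nrpe
# xinetd_allows_exact /etc/xinetd.d/nrpe 192.0.2.10 && echo "listed"
```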

Thanks