Re: check_nrpe is showing incorrect output
Posted: Fri Aug 13, 2021 10:38 am
by gsmith
Hi,
Odd that it is complaining about ORACLE_HOME.
Update the script on the Linux Oracle box, adding the last line:
Code:
echo "sqlplus output:" > /tmp/dbout.log
sqlplus 2>&1 | tee -a /tmp/dbout.log
echo "/u01/app/grid/19.3.0/gridhome_1/bin/sqlplus output:" >> /tmp/dbout.log
/u01/app/grid/19.3.0/gridhome_1/bin/sqlplus 2>&1 | tee -a /tmp/dbout.log
echo "/u01/app/grid/19.3.0/gridhome_1/bin/sqlplus perms:" >> /tmp/dbout.log
ls -l u01/app/grid/19.3.0/gridhome_1/bin/sqlplus 2>&1 | tee -a /tmp/dbout.log
env 2>&1 | tee -a /tmp/dbout.log
Run it via check_nrpe from the Nagios server and post the dbout.log file so
we can see how its environment variables are set.
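The label-then-command pattern in the script above could also be wrapped in a small helper. This is only a sketch: log_step and LOG are made-up names, and unlike the tee version it writes to the log file only, so nothing is echoed back through check_nrpe.

```shell
#!/bin/sh
# Hypothetical helper: log_step and LOG are illustrative names, not part of NRPE.
LOG="${LOG:-/tmp/dbout.log}"

log_step() {
    label=$1
    shift
    echo "$label:" >> "$LOG"
    # Run the remaining arguments as the command. stdin comes from /dev/null so
    # nothing can sit waiting for input; stdout and stderr go to the log.
    status=0
    "$@" >> "$LOG" 2>&1 < /dev/null || status=$?
    echo "exit status: $status" >> "$LOG"
}

# Example use inside the check script:
#   : > "$LOG"                                  # start with an empty log
#   log_step "sqlplus on PATH" command -v sqlplus
#   log_step "environment" env
```

Recording the exit status next to the output makes it easier to tell "command not found" (127) apart from a command that ran and failed.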
Thanks
Re: check_nrpe is showing incorrect output
Posted: Fri Aug 13, 2021 11:04 am
by Amit_Alone
Hi Smith,
I have updated the script as you mentioned. Below is the output from dbout.log:
Code:
[root@agasporap1855 tmp]# cat /tmp/dbout.log
sqlplus output:
/usr/local/nagios/libexec/check_dbconnection.sh: line 2: sqlplus: command not found
/u01/app/grid/19.3.0/gridhome_1/bin/sqlplus output:
Error 6 initializing SQL*Plus
SP2-0667: Message file sp1<lang>.msb not found
SP2-0750: You may need to set ORACLE_HOME to your Oracle software directory
/u01/app/grid/19.3.0/gridhome_1/bin/sqlplus perms:
-rwxr-xr-x. 1 root oinstall 24800 Sep 18 2020 u01/app/grid/19.3.0/gridhome_1/bin/sqlplus
XINETD_LANG=en_US
NRPE_PROGRAMVERSION=3.2.1
REMOTE_HOST=::ffff:10.110.143.26
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
PWD=/
LANG=en_US.UTF-8
SHLVL=2
EXTRAOPTIONS=
NRPE_MULTILINESUPPORT=1
_=/usr/bin/env
Below is the output of the check_nrpe run:
Code:
bash-4.2$ /usr/local/nagios/libexec/check_nrpe -H 10.110.3.55 -t 30 -c DBconnection_status
/usr/local/nagios/libexec/check_dbconnection.sh: line 2: sqlplus: command not found
Error 6 initializing SQL*Plus
SP2-0667: Message file sp1<lang>.msb not found
SP2-0750: You may need to set ORACLE_HOME to your Oracle software directory
-rwxr-xr-x. 1 root oinstall 24800 Sep 18 2020 u01/app/grid/19.3.0/gridhome_1/bin/sqlplus
XINETD_LANG=en_US
NRPE_PROGRAMVERSION=3.2.1
REMOTE_HOST=::ffff:10.110.143.26
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
PWD=/
LANG=en_US.UTF-8
SHLVL=2
EXTRAOPTIONS=
NRPE_MULTILINESUPPORT=1
_=/usr/bin/env
bash-4.2$
Re: check_nrpe is showing incorrect output
Posted: Fri Aug 13, 2021 2:20 pm
by gsmith
Hi,
I think we got it this time. Please add the three new lines (the export lines at the top) to the script on the Linux Oracle box:
Code:
export ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1
export SQLPATH=/u01/app/grid/19.3.0/gridhome_1/bin
export PATH=$PATH:/u01/app/grid/19.3.0/gridhome_1/bin
echo "sqlplus output:" > /tmp/dbout.log
sqlplus 2>&1 | tee -a /tmp/dbout.log
echo "/u01/app/grid/19.3.0/gridhome_1/bin/sqlplus output:" >> /tmp/dbout.log
/u01/app/grid/19.3.0/gridhome_1/bin/sqlplus 2>&1 | tee -a /tmp/dbout.log
echo "/u01/app/grid/19.3.0/gridhome_1/bin/sqlplus perms:" >> /tmp/dbout.log
ls -l u01/app/grid/19.3.0/gridhome_1/bin/sqlplus 2>&1 | tee -a /tmp/dbout.log
env 2>&1 | tee -a /tmp/dbout.log
Run it via check_nrpe from the Nagios server and post the dbout.log.
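Since the env dump earlier in the thread shows that NRPE starts the script with only a short PATH, LANG and PWD=/ (no ORACLE_HOME), one way to reproduce that on the Oracle box without going through check_nrpe is to run the script under a stripped environment. A rough sketch, where run_like_nrpe is a made-up helper name and the PATH and LANG values are copied from that dump:

```shell
#!/bin/sh
# Hypothetical helper: run a command with roughly the environment NRPE gave
# the check script in this thread (values copied from the posted env dump).
run_like_nrpe() (
    # NRPE reported PWD=/, so start from the root directory as well.
    cd / || exit 1
    # env -i drops every inherited variable, including ORACLE_HOME.
    exec env -i \
        PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
        LANG=en_US.UTF-8 \
        "$@" < /dev/null
)

# Example:
#   run_like_nrpe /usr/local/nagios/libexec/check_dbconnection.sh
```

If the script works from a login shell but fails under this wrapper, the difference is most likely a variable set by the login profile, which is what the export lines are meant to supply.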
Thank you
Re: check_nrpe is showing incorrect output
Posted: Sat Aug 14, 2021 9:30 am
by Amit_Alone
I copied the shared script into the dbconnection.sh file and ran it as the nagios user on the Linux Oracle server, where it worked as expected. Below is the resulting dbout.log:
Code:
[root@agasporap1855 ~]# cat /tmp/dbout.log
sqlplus output:
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Aug 14 10:20:25 2021
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Enter user-name: /u01/app/grid/19.3.0/gridhome_1/bin/sqlplus output:
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Aug 14 10:20:55 2021
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Enter user-name:
Then, as suggested, I ran the same script through check_nrpe from the Nagios XI server and got a timeout:
Code:
bash-4.2$ /usr/local/nagios/libexec/check_nrpe -H 10.110.3.55 -t 30 -c DBconnection_status
CHECK_NRPE STATE CRITICAL: Socket timeout after 30 seconds.
Below is the dbout.log:
Code:
[root@agasporap1855 ~]# cat /tmp/dbout.log
sqlplus output:
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Aug 14 10:23:31 2021
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Enter user-name: /u01/app/grid/19.3.0/gridhome_1/bin/sqlplus output:
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Aug 14 10:23:52 2021
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Enter user-name: /u01/app/grid/19.3.0/gridhome_1/bin/sqlplus perms:
-rwxr-xr-x. 1 root oinstall 24800 Sep 18 2020 u01/app/grid/19.3.0/gridhome_1/bin/sqlplus
XINETD_LANG=en_US
NRPE_PROGRAMVERSION=3.2.1
REMOTE_HOST=::ffff:XX.XXX.XXX.26
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/u01/app/grid/19.3.0/gridhome_1/bin
PWD=/
LANG=en_US.UTF-8
SQLPATH=/u01/app/grid/19.3.0/gridhome_1/bin
SHLVL=2
EXTRAOPTIONS=
ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1
NRPE_MULTILINESUPPORT=1
_=/usr/bin/env
/u01/app/grid/19.3.0/gridhome_1/bin/sqlplus output:
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Aug 14 10:24:01 2021
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Enter user-name: /u01/app/grid/19.3.0/gridhome_1/bin/sqlplus perms:
-rwxr-xr-x. 1 root oinstall 24800 Sep 18 2020 u01/app/grid/19.3.0/gridhome_1/bin/sqlplus
XINETD_LANG=en_US
NRPE_PROGRAMVERSION=3.2.1
REMOTE_HOST=::ffff:XX.XXX.XXX.26
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/u01/app/grid/19.3.0/gridhome_1/bin
PWD=/
LANG=en_US.UTF-8
SQLPATH=/u01/app/grid/19.3.0/gridhome_1/bin
SHLVL=2
EXTRAOPTIONS=
ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1
NRPE_MULTILINESUPPORT=1
_=/usr/bin/env
Thanks
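The log above shows each bare sqlplus call stopping at the "Enter user-name:" prompt, which may be what used up the 30 seconds before check_nrpe gave up. A possible way to keep a diagnostic command from waiting on input is to give it an empty stdin and a time limit. This is only a sketch: run_bounded is a made-up name, and it assumes the coreutils timeout command is installed.

```shell
#!/bin/sh
# Hypothetical helper: run a command with no interactive input and a hard
# time limit. Relies on the coreutils `timeout` command.
run_bounded() {
    limit=$1
    shift
    # stdin from /dev/null means a prompt sees end-of-file instead of waiting;
    # timeout exits with status 124 if the limit is reached.
    timeout "$limit" "$@" < /dev/null
}

# Example: print the SQL*Plus version without ever reaching the login prompt.
#   run_bounded 10 sqlplus -V
```

Keeping the limit well below the -t value passed to check_nrpe leaves time for the script to report its own result instead of a socket timeout.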
Re: check_nrpe is showing incorrect output
Posted: Sat Aug 14, 2021 10:27 am
by Amit_Alone
I have just added the first three shared lines to the script and tried check_nrpe from the Nagios XI server again, and to my surprise it worked as expected.
Code:
[[email protected] ~]$ /usr/local/nagios/libexec/check_nrpe -H XX.XXX.X.55 -t 30 -c DBconnection_status
OK - Oracle DB connected Successfully
Thanks Smith. Frankly, I was losing hope that this would get resolved, but you made it work as expected.
One last question: if the DB connection fails, the alert will be raised as CRITICAL, right? I can't test this because all those servers are production servers.
Thanks.
Re: check_nrpe is showing incorrect output
Posted: Mon Aug 16, 2021 9:17 am
by gsmith
Hi,
I knew we could get it to work.
The return codes Nagios recognizes are:
0 for OK
1 for Warning
2 for Critical
3 for Unknown
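You can "echo" anything you want; Nagios takes the state from the exit code, not from the text.
Your script exits with 2 on failure, so a failed DB connection is reported as CRITICAL. If you would rather get only a warning, change exit 2 to exit 1 in your script:
Code: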
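if [ "$DBstatus" = 1 ]
then
    echo "OK - Oracle DB connected Successfully"
    exit 0
else
    # The echoed text is only a label; the exit code decides the state.
    echo "CRITICAL - Oracle DB Connection Failed"
    exit 1   # 1 = WARNING; keep exit 2 here if the failure should be CRITICAL
fi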
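To keep the label and the exit code from drifting apart, the mapping listed above could live in one small function. A sketch only: nagios_result is a made-up name, and any state it does not recognise is treated as UNKNOWN.

```shell
#!/bin/sh
# Hypothetical helper: print "<STATE> - <message>" and return the matching
# Nagios code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
nagios_result() {
    state=$1
    shift
    case $state in
        OK)       code=0 ;;
        WARNING)  code=1 ;;
        CRITICAL) code=2 ;;
        *)        state=UNKNOWN; code=3 ;;
    esac
    echo "$state - $*"
    return "$code"
}

# Example use at the end of the check script:
#   if [ "$DBstatus" = 1 ]; then
#       nagios_result OK "Oracle DB connected Successfully"
#   else
#       nagios_result CRITICAL "Oracle DB Connection Failed"
#   fi
#   exit $?
```

Because the function only looks at its arguments, the failure branch can be tried on any non-production machine by calling it directly and checking $?, without touching the Oracle servers.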
Let me know if I can lock this topic or if you have any more questions.
Thanks
Re: check_nrpe is showing incorrect output
Posted: Mon Aug 16, 2021 10:52 am
by Amit_Alone
Thanks Smith for your help. Yes, you can close this ticket.
Thanks & Regards,
Amit