Nagios eating up database resources on end point machine

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
mohan23
Posts: 118
Joined: Tue Oct 03, 2017 7:11 am

Nagios eating up database resources on end point machine

Post by mohan23 »

Hi Team,

We have run into a strange issue: Nagios is consuming database resources by opening a large number of sessions in our Oracle database, which has caused database performance problems.

Issue: We monitor an Oracle database with Nagios. Nagios polls the database server with the "check_oracle_health" plugin to check metrics such as tablespaces, and we also run a query that checks Archive Log space, along with a few other metrics. Sometimes Nagios ends up running the same SQL queries many times over (in our case Nagios has 256 open sessions in the database), which eats up database resources and hurts database performance. We have no idea why this is happening. Can someone help me get this fixed? Your help is much appreciated.

Post by mohan23 »

I have attached a screenshot showing that the Nagios server has over 250 open sessions in the database.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Post by tmcdonald »

I'd like to get some more information:
  • How many total checks do you have set up against this Oracle database?
  • Are these checks all OK? All Critical? Timing out? Or some mix?
  • How long are these sessions staying open?
  • How quickly does the number of open sessions increase?
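
The last two questions can be answered from the database side. The following is a minimal sketch, assuming an account that is allowed to read the v$session view and that sqlplus is installed. The function name, the user name, and the connect string in the usage comment are illustrative and do not come from this thread.

```shell
# Print a query that counts sessions per client machine and program and
# shows the oldest logon time in each group. v$session is a standard
# Oracle dynamic performance view; the function name is illustrative.
session_report_sql() {
  cat <<'SQL'
SELECT machine, program, status,
       COUNT(*)        AS sessions,
       MIN(logon_time) AS oldest_logon
FROM   v$session
WHERE  type = 'USER'
GROUP  BY machine, program, status
ORDER  BY sessions DESC;
SQL
}

# Assumed usage (hypothetical user and connect string):
#   session_report_sql | sqlplus -S monitor_user@//dbhost:1521/SERVICE
session_report_sql
```

Running the query a few minutes apart shows how fast the count for the Nagios host grows, and oldest_logon shows how long the oldest session has been open.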
Former Nagios employee

Post by mohan23 »

Below is the information you requested.

  • How many total checks do you have set up against this Oracle database? We have 15 checks set up against this Oracle database.
  • Are these checks all OK? All Critical? Timing out? Or some mix? All are OK.
  • How long are these sessions staying open? This happened for the first time last week, and the sessions stay open for more than 6 hours.
  • How quickly does the number of open sessions increase? After killing these 250+ sessions, we saw another 20+ sessions opened in the next 15 minutes.
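
Sessions that stay open for hours could mean that plugin processes are lingering on the Nagios server, although nothing in this thread confirms that. The following is a minimal sketch for ruling it out, assuming a Linux host whose ps supports the etimes column (elapsed seconds). The function name and the 60-second default are assumptions for illustration.

```shell
# Read "pid elapsed_seconds command..." lines, as produced by
# `ps -eo pid,etimes,args` on Linux, and print the pid and age of every
# check_oracle_health process older than the threshold (default 60 s,
# an assumed value; pick one above your check timeout).
stale_oracle_checks() {
  threshold="${1:-60}"
  awk -v max="$threshold" \
    '($2 + 0) > (max + 0) && /check_oracle_health/ { print $1, $2 }'
}

# Assumed usage on the Nagios server:
#   ps -eo pid,etimes,args | stale_oracle_checks 60
```

If this prints nothing while the database still shows hundreds of sessions from the Nagios host, the sessions are probably being left behind on the database side rather than held by live plugin processes.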
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Post by tgriep »

Can you run the following to get the version number of the plugin?

/usr/local/nagios/libexec/check_oracle_health -V

If you are not running version 3.1.2.2 of the plugin, can you update it on the server?

The latest version can be found at this link.
https://labs.consol.de/nagios/check_ora ... l#download

Let us know if this helps with the connections not being dropped.
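
If the plugin is installed on more than one server, the version check can be scripted. The following is a minimal sketch, assuming the -V output contains a line of the form "check_oracle_health (x.y.z)". The function name is illustrative.

```shell
# Extract the version number from `check_oracle_health -V` output read on
# stdin. Assumes a line shaped like "check_oracle_health (3.1.2.2)".
plugin_version() {
  sed -n 's/^check_oracle_health (\([0-9.]*\)).*$/\1/p' | head -n 1
}

# Assumed usage:
#   v=$(/usr/local/nagios/libexec/check_oracle_health -V | plugin_version)
#   [ "$v" = "3.1.2.2" ] || echo "plugin is at version '$v', expected 3.1.2.2"
```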
Be sure to check out our Knowledgebase for helpful articles and solutions!

Post by mohan23 »

I followed the steps in the README file to upgrade the check_oracle_health plugin to version 3.1.2.2.

$ /usr/local/nagios/libexec/check_oracle_health -V
check_oracle_health (3.1.2.2)
This nagios plugin comes with ABSOLUTELY NO WARRANTY. You may redistribute
copies of this plugin under the terms of the GNU General Public License.

I can run the command from the server's shell and get the correct output, but when I check it from the Nagios web console I see the error below.

CRITICAL - cannot connect to lxoraapexprf001:1521/APEXPRF1. install_driver(Oracle) failed: Can't load '/usr/local/lib64/perl5/auto/DBD/Oracle/Oracle.so' for module DBD::Oracle: libocci.so.12.1: cannot open shared object file: No such file or directory at
at (eval 13) line 3.
Compilation failed in require at (eval 13) line 3.
Perhaps a required shared library or dll isn't installed where expected
at /usr/local/nagios/libexec/check_oracle_health line 6151.

Can you let me know which package I need to install?
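
A check that works from an interactive shell but fails under Nagios often comes down to different environments, for example an LD_LIBRARY_PATH that is set only in the login shell. That is a guess here, not something the thread establishes. The following is a minimal sketch for listing the shared libraries the dynamic linker cannot resolve. The function name is illustrative, and the usage line assumes sudo is available and the service account is named nagios.

```shell
# Read `ldd` output on stdin and print the names of shared libraries the
# dynamic linker could not resolve (lines ending in "=> not found").
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Assumed usage, run with an empty environment as the nagios user so that
# shell-only settings such as LD_LIBRARY_PATH do not hide the problem:
#   sudo -u nagios env -i /usr/bin/ldd \
#     /usr/local/lib64/perl5/auto/DBD/Oracle/Oracle.so | missing_libs
```

If libocci.so.12.1 appears in the output, the library is either not installed or sits in a directory the linker does not search for that user.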

Post by tgriep »

Take a look at this document on installing the Oracle Instant Client on the Nagios server, which the check_oracle_health plugin requires.
https://assets.nagios.com/downloads/nag ... lation.pdf

It has the steps needed to install it and explains how to troubleshoot the error you are receiving.
Most of the time when you get that error, you need to adjust the path for the Oracle environment, and the steps to do that start on Page 4.
Try it out, and if you still have any errors, post the error message and how the command is set up in Nagios.
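
One common way to adjust the library path system-wide is to register the Instant Client library directory with the dynamic linker. The following is a minimal sketch of that approach. It is not necessarily the method the linked document describes. The function name, the default config file name, and the directory in the usage comment are assumptions and must be matched to the actual install location.

```shell
# Write a library directory into an ld.so.conf.d file so the dynamic
# linker searches it for every user, including the nagios account.
# $1 = library directory, $2 = config file (the default name is assumed).
register_oracle_libs() {
  lib_dir="$1"
  conf_file="${2:-/etc/ld.so.conf.d/oracle-instantclient.conf}"
  printf '%s\n' "$lib_dir" > "$conf_file"
}

# Assumed usage as root, followed by a linker cache refresh. The directory
# shown is only an example of where a 12.1 Instant Client might live:
#   register_oracle_libs /usr/lib/oracle/12.1/client64/lib && ldconfig
```

After ldconfig has run, the check can be rescheduled from the Nagios web console to see whether the libocci.so.12.1 error is gone.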