
Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Wed Aug 21, 2019 4:14 pm
by lrnnetops
Hi Team,

Nagios XI = 5.6.5
Python Version = 2.7.5
pymongo = 3.8.0
Mongo DB server Version = v3.6.5

We are monitoring our MongoDB servers, which run as a replica set (each server hosts its own DB instance). The "Replication Lag Percentage" check has issues on all the secondary nodes.
mongodb-lag-percentage.jpg
The image above shows the error we are getting on the secondary servers. For "drxmdbamlmog03" we are also getting a connectivity error, even though the connection is allowed.
prdmdbamlmog02-connectivity.jpg
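
For readers unfamiliar with the metric, here is a minimal sketch of how a replication lag percentage can be computed. It assumes the value is the secondary's lag behind the primary, expressed as a share of the primary's oplog window; that is an assumption about the check, not something confirmed in this thread. The member dictionaries imitate the "name", "stateStr" and "optimeDate" fields of replSetGetStatus output, the host names are made up, and "lag_percentages" is a hypothetical helper, not part of check_mongodb.py.

```python
from datetime import datetime


def lag_percentages(members, oplog_window_seconds):
    """Return {member name: lag as a percentage of the oplog window} for secondaries."""
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    result = {}
    for member in members:
        if member["stateStr"] != "SECONDARY":
            continue
        # Lag is how far the secondary's last applied operation trails the primary's.
        lag = (primary["optimeDate"] - member["optimeDate"]).total_seconds()
        lag = max(lag, 0.0)  # clock skew can produce a small negative value
        result[member["name"]] = 100.0 * lag / oplog_window_seconds
    return result


# Made-up sample data shaped like the "members" array of replSetGetStatus.
members = [
    {"name": "node1.example:27017", "stateStr": "PRIMARY",
     "optimeDate": datetime(2019, 8, 21, 16, 0, 0)},
    {"name": "node2.example:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2019, 8, 21, 15, 59, 30)},
]

# 30 seconds of lag against a one-hour oplog window.
print(lag_percentages(members, 3600))
```

With a percentage defined this way, thresholds such as -W 50 -C 75 would mean the secondary has used up half or three quarters of the window in which it can still catch up from the oplog.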
We found the link below, where other people report the same issue. Can you check the two issues above and provide a fix?

https://github.com/mzupan/nagios-plugin ... issues/232

Regards,
Rohan

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Thu Aug 22, 2019 12:05 pm
by mbellerue
It would be good to see the commands being run here. Maybe we should just go right to getting the system profile. In Nagios XI go to Admin -> System Profile -> click the Download button. Just PM that over to me, and I will check it out.

Other questions. Was this ever working, or is this a new service check? If it was working, were there any changes recently either to Nagios XI or the MongoDB servers that may have affected the service check?

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Thu Aug 22, 2019 2:59 pm
by lrnnetops
Hi mbellerue,

For security reasons we won't be able to share the complete "System Profile". Can you be more specific about which files from it you need?

Answers to your questions are below.

Q - Was this ever working, or is this a new service check?
A - No. This is the first time we have added MongoDB to monitoring on the Nagios XI server.

Regards,
Rohan

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Fri Aug 23, 2019 9:22 am
by mbellerue
The most important items I can think of right away will be the service and command definitions. That will be the best place to start troubleshooting this.

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Fri Aug 23, 2019 9:57 am
by lmiltchev
Just to add to what @mbellerue said - there is a pull request on GitHub that, according to kagahd, "solves this issue".

https://github.com/mzupan/nagios-plugin ... e274071b7b

This may be a bit different for the plugin we are using, but you could try modifying the plugin and testing the "patched" one to see if this is going to resolve your problem.

1. Make a copy of the "original" plugin:

Code: Select all

cd /usr/local/nagios/libexec
cp -p check_mongodb.py check_mongodb.py.orig
2. Open the check_mongodb.py in a text editor, e.g. vi, and change this line (around line 229):

Code: Select all

return check_rep_lag(con, host, warning, critical, True, perf_data, max_lag, user, passwd, ssl, insecure, cert_file)
to this:

Code: Select all

return check_rep_lag(con, host, warning, critical, True, perf_data, max_lag, user, passwd, authdb, ssl, insecure, cert_file)
this (around line 412):

Code: Select all

def check_rep_lag(con, host, warning, critical, percent, perf_data, max_lag, user, passwd, ssl=None, insecure=None, cert_file=None):
to this:

Code: Select all

def check_rep_lag(con, host, warning, critical, percent, perf_data, max_lag, user, passwd, authdb="admin", ssl=None, insecure=None, cert_file=None):
and this (around line 489):

Code: Select all

err, con = mongo_connect(primary_node['name'].split(':')[0], int(primary_node['name'].split(':')[1]), False, user, passwd)
to this:

Code: Select all

err, con = mongo_connect(primary_node['name'].split(':')[0], int(primary_node['name'].split(':')[1]), ssl, user, passwd, None, authdb, cert_file)
Save, exit and try your check from the command line.

Example:

Code: Select all

/usr/local/nagios/libexec/check_mongodb.py -H <ip address> -A replication_lag_percent -P 27017 -W 50 -C 75 -u <username> -p <password> -D --all-databases
If this doesn't work, you can always replace your modified plugin with the "original":

Code: Select all

cd /usr/local/nagios/libexec
mv check_mongodb.py.orig check_mongodb.py
Let us know if this helped.
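
A note on why the edits in step 2 need to be applied together: the call passes everything positionally, so the call site and the function signature must agree on where "authdb" sits. The sketch below uses two stub functions that only mirror the old and new signatures quoted above (the names "check_rep_lag_old" and "check_rep_lag_new", the argument values and the certificate path are made up for illustration) to show what happens when only one side is patched.

```python
def check_rep_lag_old(con, host, warning, critical, percent, perf_data, max_lag,
                      user, passwd, ssl=None, insecure=None, cert_file=None):
    """Stub with the unpatched signature; returns the trailing arguments it received."""
    return {"ssl": ssl, "insecure": insecure, "cert_file": cert_file}


def check_rep_lag_new(con, host, warning, critical, percent, perf_data, max_lag,
                      user, passwd, authdb="admin", ssl=None, insecure=None, cert_file=None):
    """Stub with the patched signature; returns the trailing arguments it received."""
    return {"authdb": authdb, "ssl": ssl, "insecure": insecure, "cert_file": cert_file}


# Placeholder values for the nine leading arguments shared by both calls.
common = ("con", "host", 50, 75, True, False, 0, "user", "secret")
old_call = common + (True, False, "/etc/ssl/client.pem")           # ssl, insecure, cert_file
new_call = common + ("admin", True, False, "/etc/ssl/client.pem")  # authdb added before ssl

# Patched signature, unpatched call: every trailing value shifts one slot,
# so authdb silently receives the ssl flag and cert_file is left empty.
print(check_rep_lag_new(*old_call))

# Unpatched signature, patched call: one positional argument too many.
try:
    check_rep_lag_old(*new_call)
except TypeError as exc:
    print("TypeError:", exc)

# Both sides patched: every value lands in the intended parameter.
print(check_rep_lag_new(*new_call))
```

The first case is the dangerous one, because it raises no error and the check simply runs with the wrong settings.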

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Fri Aug 23, 2019 5:10 pm
by lrnnetops
Hi lmiltchev / mbellerue,

We tested the suggested change by editing "check_mongodb.py", but the issue is still not resolved. The UI still shows the same error as before.

We also ran the check from the CLI and received the same output.
cli-output.jpg
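
When comparing a CLI run with what the UI shows, it can help to capture the exit status as well as the text, since Nagios derives the service state from the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Below is a minimal sketch of a wrapper that does this. "run_check" is a hypothetical helper, and the demo command is a stand-in that merely exits with status 2; to test the real plugin, pass its full command line as a list instead.

```python
import subprocess
import sys

# Standard Nagios plugin exit codes.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(argv):
    """Run a plugin command line and return (state name, combined output)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    # Any exit code outside 0-3 is treated as UNKNOWN here.
    state = NAGIOS_STATES.get(proc.returncode, "UNKNOWN")
    return state, output.decode("utf-8", "replace").strip()


if __name__ == "__main__":
    # Stand-in for the plugin: prints one line and exits with status 2.
    fake_plugin = [sys.executable, "-c",
                   "import sys; print('CRITICAL - demo output'); sys.exit(2)"]
    print(run_check(fake_plugin))
```

Because stderr is merged into stdout, a Python traceback from the plugin shows up in the returned text instead of being lost.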
Attaching the modified "check_mongodb.py". Please have a look and tell us if we missed anything.

Regards,
Rohan

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Mon Aug 26, 2019 9:53 am
by lmiltchev
You haven't missed anything. The patch should've worked if the problem was caused by authentication issues (which may not be the case). Try the following:

1. Grab the latest plugin from here:
https://github.com/mzupan/nagios-plugin-mongodb
if you haven't already, and see if it's going to fix the issue.

2. Make sure your pymongo package is up to date:

Code: Select all

pip install --upgrade pymongo
If this doesn't work, let us know what document/guide/tutorial you followed to set up the mongodb replication. Give us as many details as possible, e.g. number of replica set members, authentication mechanisms, etc. We will need to lab this in-house in order to further troubleshoot the issue, and test a possible patch to the plugin.

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Tue Aug 27, 2019 6:06 pm
by lrnnetops
Hi lmiltchev,

We tried MongoDB monitoring with the new plugin from the GitHub link you provided.
plugin-download-verification.jpg
We moved the existing (old) plugin to a backup directory and copied the new plugin to "/usr/local/nagios/libexec/". We then changed its ownership and checked the md5sum.
plugin-copied-to-libexec-verified.jpg
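
The same checksum comparison can be scripted. This is a minimal sketch using Python's hashlib; "md5sum" and "same_file" are hypothetical helper names, and the paths you pass in would be the downloaded plugin and the copy under libexec.

```python
import hashlib


def md5sum(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a, path_b):
    """True when both files have identical content according to MD5."""
    return md5sum(path_a) == md5sum(path_b)
```

Calling same_file() with the download location and "/usr/local/nagios/libexec/check_mongodb.py" returns True only if the copy is byte-for-byte identical. MD5 is adequate for spotting a bad copy, but it is not a defence against deliberate tampering.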
After copying the new plugin, some of the checks had host resolution issues, so we rolled back to the existing plugin and the host resolution issue went away.
issue-with-new-plugin.jpg
We are unable to confirm the version of the currently working plugin, as it does not report version details. We did not download "check_mongodb.py" from any outside link. All the plugins were updated to the latest versions when Nagios XI was upgraded from 5.5.x to 5.6.x.

We will share the requested details related to our mongo DB replication.

Regards,
Rohan

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Tue Aug 27, 2019 6:26 pm
by lrnnetops
Hi lmiltchev,

Adding pymongo version details.
pymongo-version.jpg
We also noticed that after upgrading pymongo to 3.9.0, the check below no longer works with either the new plugin or the old one.
checks-issue-after-pymongo-upgrade.jpg
Can you please suggest how to fix it?

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Wed Aug 28, 2019 9:20 am
by lmiltchev
"We will share the requested details related to our mongo DB replication."
We will be waiting for the info - whenever you have time.

"After copying new plugin some of the checks were had issue with host resolution. so we rollback to existing plugin & the host resolution issue resolved."
I am not sure why this would be but keep using our plugin for the time being. I will check with our developers on that.

"We also notice now that after upgrading pymongo to 3.9.0 our below check is not working with new plugin as well as old plugin. Please can you suggest how to fix it."
What was the "old" version of pymongo that you upgraded from? You can revert back to it by running:

Code: Select all

pip install --upgrade pymongo==<version>
Example:

Code: Select all

[root@main-nagios-xi libexec]# pip freeze | grep pymongo
You are using pip version 8.1.2, however version 19.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
pymongo==3.9.0
[root@main-nagios-xi libexec]# pip install --upgrade pymongo==3.8.0
Collecting pymongo==3.8.0
  Using cached https://files.pythonhosted.org/packages/6e/ea/353a9f8ced71b9e258438aa4d6aa048c1eaf5b774d3ef2273d686fde0947/pymongo-3.8.0-cp27-cp27mu-manylinux1_x86_64.whl
Installing collected packages: pymongo
  Found existing installation: pymongo 3.9.0
    Uninstalling pymongo-3.9.0:
      Successfully uninstalled pymongo-3.9.0
Successfully installed pymongo-3.8.0
You are using pip version 8.1.2, however version 19.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
[root@main-nagios-xi libexec]# pip freeze | grep pymongo
You are using pip version 8.1.2, however version 19.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
pymongo==3.8.0
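
After a downgrade like the one above, it may also be worth confirming which pymongo the plugin's interpreter actually imports, since pip and the interpreter running the plugin are not guaranteed to be the same installation. Here is a minimal sketch. It relies on "pymongo.version" being a version string; "version_tuple" and "matches_pin" are hypothetical helpers, and the 3.8.0 pin is taken from the example above.

```python
import re


def version_tuple(text):
    """Turn '3.10.1' into (3, 10, 1) using the first three numbers in the string."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def matches_pin(installed, pinned):
    """True when the installed version equals the pinned one."""
    return version_tuple(installed) == version_tuple(pinned)


# The import is guarded so the script still reports something useful
# when this interpreter cannot see pymongo at all.
try:
    import pymongo
    installed = pymongo.version
except ImportError:
    installed = None

if installed is None:
    print("pymongo is not importable by this interpreter")
else:
    print("pymongo %s, matches 3.8.0 pin: %s" % (installed, matches_pin(installed, "3.8.0")))
```

Comparing tuples instead of raw strings matters once a component reaches two digits: as plain strings "3.10.1" sorts before "3.9.0", while as tuples (3, 10, 1) is correctly the newer one. Run the script with the same python binary named in the plugin's first line to get a meaningful answer.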