Error in Mongo DB Replication_Lag_Percentage monitoring

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
lrnnetops
Posts: 102
Joined: Thu May 18, 2017 5:31 am

Error in Mongo DB Replication_Lag_Percentage monitoring

Post by lrnnetops »

Hi Team,

Nagios XI = 5.6.5
Python Version = 2.7.5
pymongo = 3.8.0
Mongo DB server Version = v3.6.5

We are monitoring our MongoDB servers, which are in a replica set (one DB instance per server). The "Replication Lag Percentage" check has issues on all the secondary nodes.
mongodb-lag-percentage.jpg
The image above shows the error we are getting on the secondary servers. Also, for "drxmdbamlmog03" we are getting a connectivity error, even though the connection is allowed.
prdmdbamlmog02-connectivity.jpg
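Because the connectivity error appears even though the port is reported as allowed, one quick way to rule out a network-level block is a plain TCP connection attempt from the Nagios XI server to the MongoDB port. The sketch below is a minimal, hypothetical helper (the name `can_connect` and the 5-second timeout are assumptions, not part of the plugin). It only shows whether the port accepts connections and says nothing about authentication or replica-set health.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Refused, filtered, unresolvable, or timed out.
        return False
    sock.close()
    return True


# Usage (hostname from this post; 27017 is the default MongoDB port):
# can_connect("drxmdbamlmog03", 27017)
```

If this returns False from the Nagios XI server, the problem is likely in name resolution or the network path rather than in the plugin.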
We found the link below describing the same issue faced by other people. Can you check the two issues above and provide a fix?

https://github.com/mzupan/nagios-plugin ... issues/232

Regards,
Rohan
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Post by mbellerue »

It would be good to see the commands being run here. Maybe we should just go right to getting the system profile. In Nagios XI go to Admin -> System Profile -> click the Download button. Just PM that over to me, and I will check it out.

Other questions. Was this ever working, or is this a new service check? If it was working, were there any changes recently either to Nagios XI or the MongoDB servers that may have affected the service check?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
lrnnetops
Posts: 102
Joined: Thu May 18, 2017 5:31 am

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Post by lrnnetops »

Hi mbellerue,

For security reasons we won't be able to share the complete "System Profile" you requested. Can you be more specific about which files from it you need?

Below are the answers to your questions.

Q - Was this ever working, or is this a new service check?
A - No, this is the first time we have added MongoDB to monitoring on the Nagios XI server.

Regards,
Rohan
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Post by mbellerue »

The most important items I can think of right away will be the service and command definitions. That will be the best place to start troubleshooting this.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Post by lmiltchev »

Just to add to what @mbellerue said - there is a pull request on GitHub that, according to kagahd, "solves this issue".

https://github.com/mzupan/nagios-plugin ... e274071b7b

This may be a bit different for the plugin we are using, but you could try modifying the plugin and testing the "patched" one to see if this is going to resolve your problem.

1. Make a copy of the "original" plugin:

Code: Select all

cd /usr/local/nagios/libexec
cp -p check_mongodb.py check_mongodb.py.orig
2. Open the check_mongodb.py in a text editor, e.g. vi, and change this line (around line 229):

Code: Select all

return check_rep_lag(con, host, warning, critical, True, perf_data, max_lag, user, passwd, ssl, insecure, cert_file)
to this:

Code: Select all

return check_rep_lag(con, host, warning, critical, True, perf_data, max_lag, user, passwd, authdb, ssl, insecure, cert_file)
this (around line 412):

Code: Select all

def check_rep_lag(con, host, warning, critical, percent, perf_data, max_lag, user, passwd, ssl=None, insecure=None, cert_file=None):
to this:

Code: Select all

def check_rep_lag(con, host, warning, critical, percent, perf_data, max_lag, user, passwd, authdb="admin", ssl=None, insecure=None, cert_file=None):
and this (around line 489):

Code: Select all

err, con = mongo_connect(primary_node['name'].split(':')[0], int(primary_node['name'].split(':')[1]), False, user, passwd)
to this:

Code: Select all

err, con = mongo_connect(primary_node['name'].split(':')[0], int(primary_node['name'].split(':')[1]), ssl, user, passwd, None, authdb, cert_file)
Save, exit and try your check from the command line.

Example:

Code: Select all

/usr/local/nagios/libexec/check_mongodb.py -H <ip address> -A replication_lag_percent -P 27017 -W 50 -C 75 -u <username> -p <password> -D --all-databases
If this doesn't work, you can always replace your modified plugin with the "original":

Code: Select all

cd /usr/local/nagios/libexec
mv check_mongodb.py.orig check_mongodb.py
Let us know if this helped.
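The three edits above need to be applied together because `check_rep_lag` receives these arguments by position: adding `authdb` before `ssl` in the definition without updating the call would shift every later argument by one slot. The toy function below is a hypothetical stand-in (it is not the plugin's code, and the sample values are made up) that illustrates the mismatch.

```python
def check_rep_lag_patched(user, passwd, authdb="admin", ssl=None,
                          insecure=None, cert_file=None):
    """Stand-in with the patched parameter order; returns what it received."""
    return {"authdb": authdb, "ssl": ssl,
            "insecure": insecure, "cert_file": cert_file}


# Unpatched call site: still passes (user, passwd, ssl, insecure, cert_file),
# so the ssl value lands in the authdb slot and cert_file is never set.
mismatched = check_rep_lag_patched("monitor", "secret", True, False, "/path/ca.pem")

# Patched call site: authdb is passed in its new position.
matched = check_rep_lag_patched("monitor", "secret", "admin", True, False, "/path/ca.pem")

print(mismatched)
print(matched)
```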
lrnnetops
Posts: 102
Joined: Thu May 18, 2017 5:31 am

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Post by lrnnetops »

Hi lmiltchev / mbellerue,

We tested the suggested fix by editing "check_mongodb.py", but the issue is still not resolved. The UI still shows the same error as before.

We also ran the check from the CLI and received the same output.
cli-output.jpg
We are attaching the modified "check_mongodb.py". Please have a look in case we missed anything.

Regards,
Rohan
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Post by lmiltchev »

You haven't missed anything. The patch should've worked if the problem was caused by authentication issues (which may not be the case). Try the following:

1. Grab the latest plugin from here:
https://github.com/mzupan/nagios-plugin-mongodb
if you haven't already, and see if it's going to fix the issue.

2. Make sure your pymongo package is up to date:

Code: Select all

pip install --upgrade pymongo
If this doesn't work, let us know what document/guide/tutorial you followed to set up the mongodb replication. Give us as many details as possible, e.g. number of replica set members, authentication mechanisms, etc. We will need to lab this in-house in order to further troubleshoot the issue, and test a possible patch to the plugin.
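Several of the details requested above (member count, each member's state, and how far each secondary trails the primary) can be read from the reply to MongoDB's `replSetGetStatus` command. The sketch below summarizes such a status document. The function name `summarize_members` and the sample data are illustrative assumptions, and the lag it shows is in plain seconds, not the percentage the plugin reports.

```python
from datetime import datetime


def summarize_members(status):
    """Return (name, stateStr, lag_seconds) for each replica-set member.

    `status` is a dict shaped like the reply to replSetGetStatus.
    lag_seconds is None when there is no primary to compare against.
    """
    members = status.get("members", [])
    primary_optime = None
    for member in members:
        if member.get("stateStr") == "PRIMARY":
            primary_optime = member.get("optimeDate")
    rows = []
    for member in members:
        lag = None
        if primary_optime is not None and member.get("optimeDate") is not None:
            lag = (primary_optime - member["optimeDate"]).total_seconds()
        rows.append((member.get("name"), member.get("stateStr"), lag))
    return rows


# Made-up sample document, for illustration only.
sample = {
    "set": "rs0",
    "members": [
        {"name": "db1:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2019, 9, 1, 12, 0, 30)},
        {"name": "db2:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2019, 9, 1, 12, 0, 20)},
    ],
}
for row in summarize_members(sample):
    print(row)
```

Against a live member, the status document could be fetched with pymongo via `MongoClient(host, port).admin.command("replSetGetStatus")`, using credentials that are permitted to run that command.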
lrnnetops
Posts: 102
Joined: Thu May 18, 2017 5:31 am

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Post by lrnnetops »

Hi lmiltchev,

We tried MongoDB monitoring with the new plugin from the GitHub link you provided.
plugin-download-verification.jpg
We moved the existing (old) plugin to a backup directory and copied the new plugin to "/usr/local/nagios/libexec/". We then changed the ownership and checked the md5sum.
plugin-copied-to-libexec-verified.jpg
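The checksum comparison mentioned above can also be done from Python when `md5sum` is not at hand. This is a minimal sketch; the helper name `file_md5` is hypothetical, and the path in the usage comment is simply the plugin location named in this thread.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage: compare the downloaded copy with the installed copy, e.g.
# file_md5("/usr/local/nagios/libexec/check_mongodb.py")
```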
After copying the new plugin, some of the checks had issues with host resolution, so we rolled back to the existing plugin and the host resolution issue was resolved.
issue-with-new-plugin.jpg
We are unable to confirm the version of the currently working plugin, as it does not report version details. We did not download "check_mongodb.py" from any outside link. When Nagios XI was upgraded from 5.5.x to 5.6.x, all the plugins were updated to the latest versions.

We will share the requested details related to our mongo DB replication.

Regards,
Rohan
lrnnetops
Posts: 102
Joined: Thu May 18, 2017 5:31 am

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Post by lrnnetops »

Hi lmiltchev,

Adding pymongo version details.
pymongo-version.jpg
We have also noticed that, after upgrading pymongo to 3.9.0, the check below is not working with either the new plugin or the old plugin.
checks-issue-after-pymongo-upgrade.jpg
Can you please suggest how to fix it?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Post by lmiltchev »

"We will share the requested details related to our mongo DB replication."
We will be waiting for the info, whenever you have time.
"After copying the new plugin, some of the checks had issues with host resolution, so we rolled back to the existing plugin and the host resolution issue was resolved."
I am not sure why this would be, but keep using our plugin for the time being. I will check with our developers on that.
"We have also noticed that, after upgrading pymongo to 3.9.0, the check below is not working with either the new plugin or the old plugin.

Can you please suggest how to fix it?"
What was the "old" version of pymongo that you upgraded from? You can revert to it by running:

Code: Select all

pip install --upgrade pymongo==<version>
Example:

Code: Select all

[root@main-nagios-xi libexec]# pip freeze | grep pymongo
You are using pip version 8.1.2, however version 19.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
pymongo==3.9.0
[root@main-nagios-xi libexec]# pip install --upgrade pymongo==3.8.0
Collecting pymongo==3.8.0
  Using cached https://files.pythonhosted.org/packages/6e/ea/353a9f8ced71b9e258438aa4d6aa048c1eaf5b774d3ef2273d686fde0947/pymongo-3.8.0-cp27-cp27mu-manylinux1_x86_64.whl
Installing collected packages: pymongo
  Found existing installation: pymongo 3.9.0
    Uninstalling pymongo-3.9.0:
      Successfully uninstalled pymongo-3.9.0
Successfully installed pymongo-3.8.0
You are using pip version 8.1.2, however version 19.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
[root@main-nagios-xi libexec]# pip freeze | grep pymongo
You are using pip version 8.1.2, however version 19.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
pymongo==3.8.0
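To confirm which pymongo version is installed relative to the one you want to pin, a simple comparison of dotted version strings is enough. A small sketch follows; the helpers `version_tuple` and `is_newer` are hypothetical and handle only plain numeric versions such as `3.8.0`.

```python
def version_tuple(text):
    """Turn a dotted numeric version such as '3.8.0' into (3, 8, 0)."""
    return tuple(int(part) for part in text.split("."))


def is_newer(installed, pinned):
    """Return True if `installed` is a later version than `pinned`."""
    return version_tuple(installed) > version_tuple(pinned)


# Usage, assuming pymongo is importable by the interpreter that runs the plugin:
# import pymongo
# print(pymongo.version, is_newer(pymongo.version, "3.8.0"))
```

Reading `pymongo.version` from the same interpreter that runs the plugin can differ from what `pip freeze` reports if more than one Python is installed.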
Locked