Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Thu Aug 29, 2019 4:47 pm
by lrnnetops
Hi lmiltchev,

The issues we found after upgrading pymongo were fixed by downgrading to an older version (3.8.0). The lag percentage check issue, however, is still not resolved.

Please find details you requested.

Q. What document/guide/tutorial did you follow to set up the MongoDB replication?
A. We used the replication setup documentation from https://docs.mongodb.com for the version we are using.

Q. Number of replica set members
A. Each replica set has 3 members.

Q. Authentication mechanisms
A. For mongo shell logins to the DB we use a username and password. Internal communication between the servers uses key-based authentication.

# Additional Setup Details.

1 - The infrastructure is hosted on AWS EC2 instances.
2 - We have three servers in replication.
3 - Each server hosts two different MongoDB instances.

Attaching the DB config file we are using for the hosted DBs. The hostnames and IPs change per server; the rest of the settings are the same.
db-config-file.txt

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Fri Aug 30, 2019 12:58 pm
by lmiltchev
Thank you for the detailed information! We will try to recreate the issue in-house, and will get back to you on the forum.

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Fri Aug 30, 2019 2:44 pm
by lmiltchev
Follow up:
We are trying to understand how your replica is set up.

You are binding to the same IP address in DB1 and DB2. Do you have these replicas running on the same server (different ports) with data being in different directories (/data/certman/ & /data/catcon/)?

Also, isn't the replSetName supposed to be the same? You have "cmxMongoReplica" and "ccaMongoReplica"... Can you elaborate?

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Tue Sep 10, 2019 1:04 pm
by lrnnetops
Hi,

Q - You are binding to the same IP address in DB1 and DB2.
A - Yes. All three servers run the same DB instances, which are members of a replica set.

Q - Do you have these replicas running on the same server (different ports) with data being in different directories (/data/certman/ & /data/catcon/)?
A - Yes.

Q - isn't the replSetName supposed to be the same? You have "cmxMongoReplica" and "ccaMongoReplica"... Can you elaborate?
A - As mentioned, we are running two different DB instances on each server, so we have two different replica set names.

Regards,
Rohan
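
With two replica sets on each server, a replication lag check has to be run once per host and per instance port. The following is a minimal sketch of how those command lines could be generated. The replica set names and the plugin path come from this thread; the host names, the port numbers, and the `build_checks` helper are hypothetical placeholders, and the credential flags (-u, -p) are left out on purpose.

```python
import shlex

PLUGIN = "/usr/local/nagios/libexec/check_mongodb.py"

# Replica set names are from the thread; the port numbers are hypothetical.
REPLICA_SETS = {"cmxMongoReplica": 27027, "ccaMongoReplica": 27028}
# Hypothetical host names standing in for the three servers.
HOSTS = ["db1.example.com", "db2.example.com", "db3.example.com"]


def build_checks(hosts, replica_sets, warning=50, critical=75):
    """Return (replica set name, command line) pairs, one per host and set."""
    commands = []
    for set_name, port in sorted(replica_sets.items()):
        for host in hosts:
            args = [PLUGIN, "-H", host, "-A", "replication_lag_percent",
                    "-P", str(port), "-W", str(warning), "-C", str(critical)]
            # shlex.join quotes arguments only when the shell would need it.
            commands.append((set_name, shlex.join(args)))
    return commands


if __name__ == "__main__":
    for set_name, command in build_checks(HOSTS, REPLICA_SETS):
        print(set_name, "->", command)
```

Three hosts and two replica sets give six commands, which makes it easier to see whether a failure follows one port (one replica set) or one host.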

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Wed Sep 11, 2019 10:28 am
by lmiltchev
Thank you for the info! I was able to recreate the issue in-house. Just wanted to give you a heads-up: our developers are looking into this. I will get back to you on the forum as soon as I hear back from them. Thanks!

Re: Error in Mongo DB Replication_Lag_Percentage monitoring

Posted: Wed Sep 11, 2019 10:44 am
by lmiltchev
Try the following:

1. Make a backup of your original plugin:

cp -p /usr/local/nagios/libexec/check_mongodb.py /usr/local/nagios/libexec/check_mongodb.py.backup

2. Download the zip file below:
check_mongodb.zip
Unzip it and copy the plugin to the "/usr/local/nagios/libexec/" directory, overwriting the original plugin.

3. Test the "patched" plugin:

/usr/local/nagios/libexec/check_mongodb.py -H <primary> -A replication_lag_percent -P 27027 -W 50 -C 75 -u username -p password -D --all-databases
/usr/local/nagios/libexec/check_mongodb.py -H <replica 1> -A replication_lag_percent -P 27027 -W 50 -C 75 -u username -p password -D --all-databases
/usr/local/nagios/libexec/check_mongodb.py -H <replica 2> -A replication_lag_percent -P 27027 -W 50 -C 75 -u username -p password -D --all-databases
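
For reading the output of these commands, it may help to think of the reported value as a secondary's lag expressed as a percentage of the primary's oplog time window, compared against the -W and -C thresholds. The sketch below illustrates that idea only. It is an assumed definition, not the plugin's actual code, and the function names and sample numbers are hypothetical.

```python
def replication_lag_percent(primary_optime, secondary_optime, oplog_window):
    """Return the secondary's lag as a percentage of the oplog window.

    All three arguments are in seconds. This is an assumed definition for
    illustration; check_mongodb.py may compute the value differently.
    """
    if oplog_window <= 0:
        raise ValueError("oplog_window must be positive")
    # Treat a secondary that appears to be ahead as having zero lag.
    lag = max(0, primary_optime - secondary_optime)
    return 100.0 * lag / oplog_window


def classify(percent, warning=50, critical=75):
    """Map a percentage to a Nagios-style state using -W/-C style thresholds."""
    if percent >= critical:
        return "CRITICAL"
    if percent >= warning:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # Hypothetical numbers: 600 s of lag against a 1000 s oplog window.
    pct = replication_lag_percent(1000, 400, 1000)
    print(pct, classify(pct))
```

Under this definition, a value near 100 would mean the secondary is close to falling off the end of the oplog, which is why a percentage can be more telling than raw seconds of lag.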