I am in the process of migrating from the local MySQL db to external MySQL dbs running on a Linux cluster. The migration previously went fine in my dev and preprod environments. In my prod environment, when I reconfigured both of my Nagios servers, primary and secondary, to use the external db, Nagios would not start running checks. Everything else looked fine--db connectivity was good and the XI System Component Status icons were all green--but the Process State icon was red.
We observed the same behavior on the primary and also on the secondary backup server after the failover to it. There is not much in the logs--messages and nagios.log are attached.
After pointing to the external db, the commands 'service nagios stop' and 'service nagios start' would give a lot of errors. See below:
Stopping ndo2db: done.
-bash-4.1$ service nagios stop
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cbRm8uh.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cbRm8uh': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c49TLGi.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c49TLGi': Operation not permitted
Stopping nagios:.. done.
-bash-4.1$ service ndo2db start
Starting ndo2db: done.
-bash-4.1$ service nagios start
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/clZfAsp.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/clZfAsp': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c1MzEw4.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c1MzEw4': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cQbZGCt.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cQbZGCt': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cw2VIrt.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cw2VIrt': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cJ2Zhg1.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cJ2Zhg1': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cml6N6J.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cml6N6J': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cRu0DCv.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cRu0DCv': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cof5T98.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cof5T98': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cop96q6.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cop96q6': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cTy8McM.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cTy8McM': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cnm9KtV.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cnm9KtV': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/co35GOb.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/co35GOb': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c04Mi7b.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c04Mi7b': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cWYXpZJ.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cWYXpZJ': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cTQFuVs.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cTQFuVs': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cx3wuze.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cx3wuze': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cXfnJqS.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cXfnJqS': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cEkO9QP.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cEkO9QP': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c3ALXKv.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c3ALXKv': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cdOKrsV.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cdOKrsV': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cRG4lQV.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cRG4lQV': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cadMVIt.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cadMVIt': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cDuhHHc.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cDuhHHc': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cTZ5OyY.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cTZ5OyY': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cEIv2tC.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cEIv2tC': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cXKWc9z.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cXKWc9z': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/chQdw4f.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/chQdw4f': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c30fl6E.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c30fl6E': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cKrHBRF.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cKrHBRF': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cI1l6vG.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cI1l6vG': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cCSqOve.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cCSqOve': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cWHqjHX.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cWHqjHX': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c2yioAJ.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c2yioAJ': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c2YIxNn.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c2YIxNn': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c46VjQl.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c46VjQl': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cR44TY1.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cR44TY1': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cu9MU8q.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cu9MU8q': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cDJJV6r.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cDJJV6r': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cjn1Oot.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cjn1Oot': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cyaiFG1.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cyaiFG1': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c7XXUcL.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c7XXUcL': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cERmLZb.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cERmLZb': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cxLPepb.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cxLPepb': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cG0p9TR.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cG0p9TR': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cbRm8uh.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/cbRm8uh': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c49TLGi.ok': Operation not permitted
chown: changing ownership of `/var/nagiosramdisk/spool/checkresults/c49TLGi': Operation not permitted
Starting nagios: done.
-bash-4.1$
After we failed back to the local db, everything worked fine and there were no errors from 'service nagios stop/start'.
How can we troubleshoot this? Not much to go by.
Nagios does not perform checks after migrating to external M
-
dwasswa
Re: Nagios does not perform checks after migrating to extern
Hi @corkyman,
After looking at your logs, I suggest that you check your permissions.
1. Check the permissions in /var/nagiosramdisk/spool/checkresults by running the following commands:
ls -al /var/nagiosramdisk/spool/checkresults/
ls -al /var/nagiosramdisk/spool/
2. Check whether your nagios account has expired by running the following command:
chage -l nagios
Please verify that information and let me know if you have any questions.
- tacolover101
- Posts: 432
- Joined: Mon Apr 10, 2017 11:55 am
Re: Nagios does not perform checks after migrating to extern
the above is specific to ramdisk which could still be related to this.
if databases are not communicating properly, generally it is due to time before off between the systems / sql.
Re: Nagios does not perform checks after migrating to extern
Here is the output of suggested commands:
[c601018@vhlgnngxi002 ~]$ sudo su - nagios
-bash-4.1$ ls -al /var/nagiosramdisk/spool/checkresults/
total 0
drwxrwxr-x 2 nagios nagios 40 Nov 12 22:50 .
drwxrwxr-x 5 nagios nagios 100 Sep 20 17:06 ..
-bash-4.1$ ls -al /var/nagiosramdisk/spool/
total 0
drwxrwxr-x 5 nagios nagios 100 Sep 20 17:06 .
drwxrwxrwt 6 nagios nagios 200 Nov 13 08:19 ..
drwxrwxr-x 2 nagios nagios 40 Nov 12 22:50 checkresults
drwxrwxr-x 2 nagios nagios 40 Sep 20 17:06 perfdata
drwxrwxr-x 2 nagios nagios 40 Sep 20 17:06 xidpe
-bash-4.1$ chage -l nagios
Last password change : Feb 23, 2016
Password expires : never
Password inactive : never
Account expires : never
Minimum number of days between password change : 0
Maximum number of days between password change : 99999
Number of days of warning before password expires : 7
-bash-4.1$
I do not understand this statement "if databases are not communicating properly, generally it is due to time before off between the systems / sql.". Could you elaborate and let me know how would I check the "proper" communication to the db?
I need to pinpoint/isolate the problem before actually trying to reconfigure again--this is a production system and requires scheduling and change control with a lot of eyes on unsuccessful activities that would give Nagios a bad name.
I am verifying connectivity with these commands
mysqladmin --host=mynagiosxidb.pp.tvlport.com --user=nagiosxi --password=n@gweb --socket=/mynagiosxi/data/mysql/mysql_nagiosxi.sock --port=3382 ping
mysqladmin --host=mynagiosxidb.pp.tvlport.com --user=nagiosql --password=n@gweb --socket=/mynagiosxi/data/mysql/mysql_nagiosxi.sock --port=3382 ping
mysqladmin --host=mynagiosxidb.pp.tvlport.com --user=ndoutils --password=n@gweb --socket=/mynagiosxi/data/mysql/mysql_nagiosxi.sock --port=3382 ping
echo 'use nagios; SHOW TABLES;' | mysql -undoutils -pn@gweb -hmynagiosxidb.pp.tvlport.com -P3382
echo 'use nagiosql; SHOW TABLES;' | mysql -unagiosql -pn@gweb -hmynagiosxidb.pp.tvlport.com -P3382
echo 'use nagiosxi; select * from `xi_sysstat` limit 10;' | mysql -unagiosxi -pn@gweb -hmynagiosxidb.pp.tvlport.com -P3382
-
dwasswa
Re: Nagios does not perform checks after migrating to extern
Hi @corkyman,
It seems that you are trying to start nagios as a non root user instead of the root account.
Nagios needs to be started by root.
Please do that and let me know if it solves your issue.
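The transcript earlier in the thread shows the service commands being run from a `-bash-4.1$` prompt, the same prompt that appears after `sudo su - nagios`, and `chown` to another owner generally fails for a non-root user. A minimal sketch of a guard that refuses to restart the services unless it is running as root follows; the function names `is_root_uid` and `restart_nagios_as_root` are hypothetical, and the service order simply mirrors the one used in the transcript.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the given numeric UID is 0 (root).
is_root_uid() {
    [ "$1" -eq 0 ]
}

# Hypothetical wrapper: restart the services only when running as root.
restart_nagios_as_root() {
    if ! is_root_uid "$(id -u)"; then
        echo "Run this as root (current user: $(id -un))" >&2
        return 1
    fi
    service nagios stop
    service ndo2db stop
    service ndo2db start
    service nagios start
}
```

Calling `restart_nagios_as_root` from a root shell (or via sudo) would then run the same stop/start sequence, while calling it as the nagios user prints a message and returns without touching the services.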
Re: Nagios does not perform checks after migrating to extern
This does not make much sense to me.
1. The behavior should not depend on what db is used--local or external. It is working fine now with a local one.
2. I talked to the person who installed Nagios and he said "installed by root via a script - if I recall correctly - but it created the nagios user as part of that script and gave the nagios user all permissions to run"
I need a definitive way to test if db is useable before my next attempt. It would be a scheduled activity and I can't have it fail again.
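As a starting point for such a pre-check, here is a rough sketch that logs in with each of the three accounts used in this thread and runs two harmless statements (`SELECT 1` and `SHOW GRANTS FOR CURRENT_USER()`). The host, port, account names, and database names are taken from the commands above; the function names and the `MYSQL_BIN` override are hypothetical, and the password is assumed to be supplied through the `MYSQL_PWD` environment variable that the mysql client reads. Passing this check only shows that login and basic queries work for each account. It does not by itself prove that Nagios will schedule checks.

```shell
#!/bin/sh
# Sketch only: host and port come from the thread; function names are hypothetical.
DB_HOST="mynagiosxidb.pp.tvlport.com"
DB_PORT=3382
MYSQL_BIN="${MYSQL_BIN:-mysql}"   # overridable so the logic can be tried without a server

# db_check USER DATABASE: run two harmless statements and report the result.
# The client picks the password up from the MYSQL_PWD environment variable.
db_check() {
    user=$1
    db=$2
    if "$MYSQL_BIN" -h "$DB_HOST" -P "$DB_PORT" -u "$user" -D "$db" \
         -e 'SELECT 1; SHOW GRANTS FOR CURRENT_USER();' >/dev/null 2>&1; then
        echo "OK   $user@$db"
    else
        echo "FAIL $user@$db"
        return 1
    fi
}

# Check the three account/database pairs used in the thread.
db_check_all() {
    rc=0
    for pair in ndoutils:nagios nagiosql:nagiosql nagiosxi:nagiosxi; do
        db_check "${pair%%:*}" "${pair#*:}" || rc=1
    done
    return $rc
}
```

It could be run as `MYSQL_PWD='...' db_check_all` from each Nagios server before the scheduled change; a non-zero exit status means at least one account could not log in or run the statements.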
-
dwhitfield
- Former Nagios Staff
- Posts: 4583
- Joined: Wed Sep 21, 2016 10:29 am
- Location: NoLo, Minneapolis, MN
- Contact:
Re: Nagios does not perform checks after migrating to extern
How large is your system? On very large systems (defined by number of checks) we would suggest NOT offloading the db. I wouldn't expect to see the errors you are seeing though.
Is https://assets.nagios.com/downloads/nag ... Server.pdf the documentation you are following while moving the db remote?
Can you PM me your Profile? You can download it by going to Admin > System Config > System Profile and click the ***Download Profile*** button towards the top. If for whatever reason you *cannot* download the profile, please put the output of View System Info (5.3.4+, Show Profile if older) in the thread (that will at least get us some info). This will give us access to many of the logs we would otherwise ask for individually. If security is a concern, you can unzip the profile take out what you like, and then zip it up again. We may end up needing something you remove, but we can ask for that specifically.
You can also generate a profile manually using the script at /usr/local/nagiosxi/html/includes/components/profile/getprofile.sh
That should generate a profile in /usr/local/nagiosxi/var/components/ which you can get off the server with an application such as FileZilla.
After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.
If you get an error that PROFILE BUILD FAILED, please see https://support.nagios.com/kb/article.p ... ategory=44
Re: Nagios does not perform checks after migrating to extern
I sent the profile. Our system is under 10K checks but growing. Actually, I ran a nagios GUI performance comparison before and after migration by driving the common GUI navigational tasks and the performance on the external db was 20% faster. There is also an issue of internal DBAs support so we'd like to stick to the external db. What is the system size when I should be concerned about external db?
Yes, the document you referred to is what I used to make changes to nagios. This worked well in dev and preprod environments.
-
dwhitfield
- Former Nagios Staff
- Posts: 4583
- Joined: Wed Sep 21, 2016 10:29 am
- Location: NoLo, Minneapolis, MN
- Contact:
Re: Nagios does not perform checks after migrating to extern
corkyman wrote: What is the system size when I should be concerned about external db?
I already PMed you about the lack of a profile. We generally use 14K checks as an estimate, but you could potentially keep seeing improvements for higher numbers of checks. If you're benchmarking, I'm not going to argue with your environment.
Are you running SELinux? Do you have a modified sudoers? It really doesn't make any sense why an offloaded db would cause permissions issues.
I do think the timing issue might still be worth looking into.
Can you run each of these on both servers, changing the mysql command as needed for each one? No php check is needed on the remote server.
# date
# hwclock
# grep date.timezone /etc/php.ini
# mysql -unagiosxi -pn@gweb -e "SELECT NOW();"
# service ntpd status
Also, can you send a screenshot of http://YOURSERVER/nagiosxi/admin/globalconfig.php ?
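The `date` and `SELECT NOW();` comparison above can also be reduced to a single number. The following is a rough sketch, assuming GNU `date` (for `+%s`), the host and port used earlier in the thread, and a password supplied through the `MYSQL_PWD` environment variable; the function names `abs_skew` and `db_clock_skew` are hypothetical.

```shell
#!/bin/sh
# Sketch only: host and port come from the thread; function names are hypothetical.
DB_HOST="${DB_HOST:-mynagiosxidb.pp.tvlport.com}"
DB_PORT="${DB_PORT:-3382}"

# abs_skew A B: print the absolute difference between two epoch timestamps.
abs_skew() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}

# db_clock_skew: compare the local clock with the MySQL server's clock, in seconds.
db_clock_skew() {
    local_ts=$(date +%s)
    # -N drops the column header, -B gives plain batch output.
    db_ts=$(mysql -h "$DB_HOST" -P "$DB_PORT" -u nagiosxi -N -B \
              -e 'SELECT UNIX_TIMESTAMP();') || return 1
    abs_skew "$local_ts" "$db_ts"
}
```

Running `MYSQL_PWD='...' db_clock_skew` on each Nagios server prints the difference in seconds between that server and the database host; a large value would point toward the timing issue mentioned above.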
Re: Nagios does not perform checks after migrating to extern
I have now attached the profile here instead of PMing.
I am running Red Hat Enterprise Linux Server release 6.8 (Santiago)
I believe I do have modified sudoers.
I am now willing to reconfigure my prod failover server temporarily for a test if you can do a shared session and look around.
Please let me know.