Support forum for Nagios Core, Nagios Plugins, NCPA, NRPE, NSCA, NDOUtils and more. Engage with the community of users including those using the open source solutions.
Note: this service was not running when I first ran the status command. After I executed the "service ndo2db start" command, the service started up; however, Nagios still returned the same error as before.
-rw-r--r-- 1 nagios nagios 0 Nov 20 15:10 /usr/local/nagios/var/ndomod.tmp
So for some reason ndomod.tmp is owned by root:root, but otherwise the permissions are fine. I believe this gets created on nagios start with ndomod loaded, so let's stop nagios and remove that file.
service nagios stop
rm -f /usr/local/nagios/var/ndomod.tmp
service nagios start
ls -la /usr/local/nagios/var/ndomod.tmp
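If you would rather not read the `ls` output by eye, here is a rough sketch of an ownership check. The `check_owner` helper is a made-up name (not something shipped with Nagios), and `nagios:nagios` is assumed to be the user and group your Nagios runs as.

```shell
# check_owner FILE USER:GROUP  (hypothetical helper, not part of Nagios)
# Returns 0 if FILE is owned by USER:GROUP, 1 if it is not, 2 if FILE is missing.
check_owner() {
    file=$1
    want=$2
    [ -e "$file" ] || { echo "$file: not found" >&2; return 2; }
    # Fields 3 and 4 of `ls -ld` are the owner and the group.
    have=$(ls -ld "$file" | awk '{print $3 ":" $4}')
    if [ "$have" = "$want" ]; then
        return 0
    fi
    echo "$file is owned by $have, expected $want" >&2
    return 1
}
```

Example: `check_owner /usr/local/nagios/var/ndomod.tmp nagios:nagios && echo "ownership OK"`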
var/ permissions look great! Your ndo2db process also looks fine, considering it was started at the same time the associated files were created. I am going to have to assume you started it on the 28th, since everything points that way. ndo.sock also looks great, as it was created with the right time and permissions. I think ndomod.tmp is probably our culprit here causing the crash, especially after looking at all of this.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Let's get another strace just like before, and if you could, send a tar of your configs. I think it's easiest if you just PM them over so I can attempt the verification myself and see what's happening directly.
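For the tar, something along these lines should do. This is only a sketch: `bundle_configs` is a made-up helper name, and the paths in the example assume a default source install under /usr/local/nagios.

```shell
# bundle_configs SRC_DIR OUT_TARBALL  (hypothetical helper)
# Packs SRC_DIR into a gzip-compressed tarball with relative paths inside.
bundle_configs() {
    src=$1
    out=$2
    # -C switches to the parent directory first, so the archive contains
    # "etc/..." rather than the full absolute path.
    tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
}
```

Example: `bundle_configs /usr/local/nagios/etc /tmp/nagios-etc.tar.gz`. Files such as resource.cfg and ndo2db.cfg can hold credentials, which is a good reason to send the archive by PM rather than attach it to the thread.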
Got them, have not had a chance to look at them yet today, it might be a tomorrow morning thing at this point. Just wanted to let you know that I have them, and will post back after looking at them!
Spenser, thank you for your assistance in this matter. We troubleshot this problem with a Linux dev yesterday and found the culprit. The objects.cache file located in /usr/local/nagios/var/ was last timestamped October 13th. Looking at the object configuration files in /usr/local/nagios/etc/objects/linux and /usr/local/nagios/etc/objects/windows, we noticed a slew of hosts that had been added after this date. After appending .old to the filenames and running the pre-flight verification again, the error disappeared. Renaming the files back one by one and running the verification check each time, we identified the culprit config file. The file did not have any hostgroups specified, so after we modified the configuration, the pre-flight check passed.
Thanks again for taking the time to troubleshoot this issue guys!
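For anyone who lands here with the same problem, the steps described above can be roughly sketched as a script. Treat it as a starting point only: both function names are made up, the default `VERIFY_CMD` assumes a source install under /usr/local/nagios, and the renaming trick assumes the directories are loaded with `cfg_dir` (so files that do not end in .cfg are skipped) and that no *.cfg.old files exist there beforehand.

```shell
# Pre-flight check command; the default paths are an assumption, override as needed.
VERIFY_CMD=${VERIFY_CMD:-"/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg"}

# newer_than_cache CFG_DIR CACHE_FILE  (hypothetical helper)
# Lists *.cfg files under CFG_DIR that were modified after CACHE_FILE.
newer_than_cache() {
    find "$1" -name '*.cfg' -newer "$2"
}

# find_bad_cfg DIR  (hypothetical helper)
# Renames every *.cfg in DIR to *.cfg.old, then restores them one at a time,
# running the pre-flight check after each restore. Prints every file whose
# return makes the check fail and leaves that file disabled as *.cfg.old.
find_bad_cfg() {
    dir=$1
    for f in "$dir"/*.cfg; do
        [ -e "$f" ] || continue
        mv "$f" "$f.old"
    done
    for old in "$dir"/*.cfg.old; do
        [ -e "$old" ] || continue
        f=${old%.old}
        mv "$old" "$f"
        # VERIFY_CMD is left unquoted on purpose so it splits into command + args.
        if ! $VERIFY_CMD >/dev/null 2>&1; then
            echo "$f"
            mv "$f" "$old"   # keep the suspect out so later checks stay meaningful
        fi
    done
}
```

Example: `newer_than_cache /usr/local/nagios/etc/objects /usr/local/nagios/var/objects.cache` shows what changed since the cache was last written, and `find_bad_cfg /usr/local/nagios/etc/objects/linux` narrows it down to a file. Back up the directory first, since the script moves files around.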