Core 3.5 generic pre-flight verification error

v3-alex
Posts: 9
Joined: Fri Nov 21, 2014 9:58 am

Re: Core 3.5 generic pre-flight verification error

Post by v3-alex »

Code:

ls -la /usr/local/nagios/var/ndomod.tmp
returns the following output:

Code:

-rw-r--r-- 1 root root 0 Aug 16  2011 /usr/local/nagios/var/ndomod.tmp

Code:

ls -lad /usr/local/nagios/var/
returns the following output:

Code:

drwxrwxr-x 5 nagios nagios 4096 Dec  2 07:57 /usr/local/nagios/var/

Code:

ps -ef | grep bin/ndo
returns:

Code:

nagios   22706     1  0 Nov28 ?        00:00:00 /usr/local/nagios/bin/ndo2db -c                                              /usr/local/nagios/etc/ndo2db.cfg

Code:

service ndo2db status
returns:

Code:

ndo2db (pid 22706) is running...
Note: this service was not running when I first ran the status command. After I executed the "service ndo2db start" command, the service started up; however, Nagios still returned the same error as before.

Code:

ls -la /usr/local/nagios/var/ndo.sock
returns:

Code:

srwxr-xr-x 1 nagios nagios 0 Nov 28 10:36 /usr/local/nagios/var/ndo.sock
Let me know what you think.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Post by sreinhardt »

OK let's take it from the top:

ls -la /usr/local/nagios/var/ndomod.tmp should return:

Code:

-rw-r--r-- 1 nagios nagios 0 Nov 20 15:10 /usr/local/nagios/var/ndomod.tmp
So for some reason ndomod.tmp is owned by root:root, but otherwise its permissions are fine. I believe this file gets created when nagios starts with ndomod loaded, so let's stop nagios, remove that file, and start nagios again.

Code:

service nagios stop
rm -f /usr/local/nagios/var/ndomod.tmp
service nagios start
ls -la /usr/local/nagios/var/ndomod.tmp
The var/ permissions look great! Your ndo2db process also looks fine, considering it was started at the same time its associated files were created. I am going to assume you started it on the 28th, since everything points that way. ndo.sock also looks good: it was created at the right time and with the right permissions. After looking at all of this, I think ndomod.tmp is probably our culprit causing the crash.
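
The ownership comparison above can also be scripted. This is only a minimal sketch: `check_owner` is a hypothetical helper name (it is not shipped with Nagios), and it assumes `ls -ld` prints the owner and group as its third and fourth fields, as in the listings earlier in this thread.

```shell
#!/bin/sh
# check_owner FILE USER GROUP  (hypothetical helper, not part of Nagios)
# Prints a one-line verdict; returns 0 on match, 1 on mismatch, 2 if missing.
check_owner() {
    file=$1
    want="$2:$3"
    if [ ! -e "$file" ]; then
        echo "missing: $file"
        return 2
    fi
    # Fields 3 and 4 of `ls -ld` are the owner and the group.
    have=$(ls -ld "$file" | awk '{ print $3 ":" $4 }')
    if [ "$have" = "$want" ]; then
        echo "ok: $file is owned by $have"
        return 0
    fi
    echo "mismatch: $file is owned by $have, expected $want"
    return 1
}

# Example call for the file discussed above:
# check_owner /usr/local/nagios/var/ndomod.tmp nagios nagios
```

For the root:root listing in the first post, this call would take the mismatch branch.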
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
v3-alex
Posts: 9
Joined: Fri Nov 21, 2014 9:58 am

Post by v3-alex »

Code:

service nagios stop
rm -f /usr/local/nagios/var/ndomod.tmp

This part went fine; I was able to delete the file. However, I am still unable to start the nagios service due to a configuration error.

service nagios start returns the following error:

Code:

Starting nagios:CONFIG ERROR!  Start aborted.  Check your Nagios configuration.
Running the Nagios configuration check results in exactly the same error as before, only now I am unable to start the service at all.

Code:

ls -la /usr/local/nagios/var/ndomod.tmp
returns:

Code:

ls: /usr/local/nagios/var/ndomod.tmp: No such file or directory
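
The service start only reports "CONFIG ERROR!", so running the verification by hand is the way to see the underlying messages. A minimal sketch follows, with several assumptions: `run_preflight` is a hypothetical wrapper, the binary and main config paths in the example call are guesses based on the /usr/local/nagios prefix used in this thread, and the filter assumes the interesting lines contain the word "Error" or "Warning". The `-v` option is the Nagios Core verify switch.

```shell
#!/bin/sh
# run_preflight NAGIOS_BIN MAIN_CFG  (hypothetical wrapper)
# Runs the config verification, prints only the lines that mention
# Error or Warning, and returns the verifier's own exit status.
run_preflight() {
    bin=$1
    cfg=$2
    rc=0
    out=$("$bin" -v "$cfg" 2>&1) || rc=$?
    # `|| true` keeps a no-match grep from replacing the verifier's status.
    printf '%s\n' "$out" | grep -E 'Error|Warning' || true
    return "$rc"
}

# Example call (assumed paths for a source install):
# run_preflight /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
```

If the filter hides too much, dropping the `grep` and printing `$out` as-is shows the full verification output.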
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Post by sreinhardt »

Let's get another strace just like before, and if you could, send a tar of your configs. I think it's easiest if you just PM them over so I can attempt the verify myself and see what's happening directly.
v3-alex
Posts: 9
Joined: Fri Nov 21, 2014 9:58 am

Post by v3-alex »

Sent both archives, let me know what you think.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Post by sreinhardt »

Got them. I have not had a chance to look at them yet today, so it might be a tomorrow-morning thing at this point. Just wanted to let you know that I have them and will post back after looking at them!
v3-alex
Posts: 9
Joined: Fri Nov 21, 2014 9:58 am

Post by v3-alex »

Spenser, thank you for your assistance in this matter. We troubleshot this problem with a Linux dev yesterday and found the culprit. Basically, the objects.cache file located in /usr/local/nagios/var/ was last timestamped October 13th. Looking at the object configuration files in /usr/local/nagios/etc/objects/linux and /usr/local/nagios/etc/objects/windows, we noticed that a slew of hosts had been added after this date. After appending .old to those filenames and running the pre-flight verification again, the error disappeared. By renaming the files back one by one and running the verification check each time, we identified the culprit config file. That file did not have any hostgroups specified; after we modified the configuration, the pre-flight check passed.
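
For anyone repeating this, the two steps (listing configs newer than objects.cache, then parking files as .old and renaming them back one at a time) can be sketched as below. Both function names are hypothetical and this is not a Nagios tool. It assumes the directory holds no pre-existing *.cfg.old files and, as in the post above, that files renamed to .old are ignored by the verification.

```shell
#!/bin/sh
# newer_cfgs DIR REF: list *.cfg files under DIR modified after REF
# (for example REF=/usr/local/nagios/var/objects.cache).
newer_cfgs() {
    find "$1" -type f -name '*.cfg' -newer "$2" | sort
}

# find_bad_cfg DIR VERIFY_CMD [ARGS...]
# Parks every DIR/*.cfg as *.cfg.old, then renames them back one at a
# time, running VERIFY_CMD after each. Prints the first file whose return
# makes verification fail. All files end up under their original names.
find_bad_cfg() {
    dir=$1
    shift
    # Refuse to run if parked files already exist, to avoid reviving them.
    for old in "$dir"/*.cfg.old; do
        if [ -e "$old" ]; then
            echo "refusing: $old already exists" >&2
            return 2
        fi
    done
    for f in "$dir"/*.cfg; do
        [ -e "$f" ] || continue
        mv "$f" "$f.old"
    done
    # If verification fails with everything parked, the problem is elsewhere.
    baseline=0
    "$@" >/dev/null 2>&1 || baseline=$?
    culprit=
    for old in "$dir"/*.cfg.old; do
        [ -e "$old" ] || continue
        f=${old%.old}
        mv "$old" "$f"
        # Stop verifying once a culprit is known, but keep restoring files.
        if [ "$baseline" -eq 0 ] && [ -z "$culprit" ] && ! "$@" >/dev/null 2>&1; then
            culprit=$f
        fi
    done
    if [ "$baseline" -ne 0 ]; then
        echo "verification fails even with all files parked" >&2
        return 2
    fi
    if [ -n "$culprit" ]; then
        echo "$culprit"
        return 0
    fi
    return 1
}
```

A call might look like `find_bad_cfg /usr/local/nagios/etc/objects/linux /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg`, where the binary and main config paths are assumptions. Note that parking a file can itself break verification when other files reference its objects, so treat the result as a hint rather than a verdict.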

Thanks again for taking the time to troubleshoot this issue, guys!