can't get the nagios service to run

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Post by benhank »

I upgraded from version 5.4.12 to version 5.7.2, and now the nagios service doesn't run:

Code: Select all

[root@lkennagiost01 nagiosxi]# service nagios start
Starting nagios: done.
[root@lkennagiost01 nagiosxi]# service nagios status
nagios is not running
[root@lkennagiost01 nagiosxi]#
I had a similar issue a while ago, and I modified the following files to point to the correct location of the nagios.lock file:

Code: Select all

/usr/local/nagiosxi/scripts/nom_restore_nagioscore_checkpoint.sh
/usr/local/nagiosxi/scripts/nom_restore_nagioscore_checkpoint_specific.sh
/usr/local/nagios/etc/nagios.cfg
/etc/rc.d/init.d/nagios
They are all set to:

Code: Select all

/var/run/nagios.lock
The file exists in that directory and contains a PID.
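
A lock file that contains a PID does not by itself show that the process is still alive. The following is a minimal sketch of one way to check, assuming the lock file holds a single numeric PID and the check is run as root. The function name `lock_pid_alive` is made up for illustration; the default path is the one mentioned above.

```shell
#!/bin/sh
# Report whether the PID stored in a lock file belongs to a running process.
# lock_pid_alive is a hypothetical helper name, not part of Nagios.
lock_pid_alive() {
    lock="${1:-/var/run/nagios.lock}"
    [ -r "$lock" ] || { echo "no readable lock file at $lock"; return 2; }
    # Strip whitespace and newlines so only the number remains.
    pid=$(tr -d '[:space:]' < "$lock")
    case "$pid" in
        ''|*[!0-9]*) echo "lock file does not contain a PID"; return 2 ;;
    esac
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        echo "PID $pid is running"
    else
        echo "PID $pid is not running (stale lock file)"
        return 1
    fi
}

# Usage (run as root so kill -0 is permitted for the nagios user's process):
#   lock_pid_alive /var/run/nagios.lock
```

If the function reports a stale lock file, the daemon most likely exited right after writing its PID, which fits a start that prints "done" followed by a status of "not running".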

Code: Select all

[root@lkennagiost01 ~]# tail -20 /usr/local/nagios/var/nagios.log
[1599236832] wproc: Registry request: name=Core Worker 19147;pid=19147
[1599236832] wproc: Registry request: name=Core Worker 19148;pid=19148
[1599236832] wproc: Registry request: name=Core Worker 19149;pid=19149
[1599236832] wproc: Registry request: name=Core Worker 19150;pid=19150
[1599236832] wproc: Registry request: name=Core Worker 19151;pid=19151
[1599236832] wproc: Registry request: name=Core Worker 19152;pid=19152
[1599236832] wproc: Registry request: name=Core Worker 19153;pid=19153
[1599236832] wproc: Registry request: name=Core Worker 19154;pid=19154
[1599236832] wproc: Registry request: name=Core Worker 19155;pid=19155
[1599236832] wproc: Registry request: name=Core Worker 19156;pid=19156
[1599236832] wproc: Registry request: name=Core Worker 19157;pid=19157
[1599236832] wproc: Registry request: name=Core Worker 19158;pid=19158
[1599236832] wproc: Registry request: name=Core Worker 19159;pid=19159
[1599236832] wproc: Registry request: name=Core Worker 19160;pid=19160
[1599236832] wproc: Registry request: name=Core Worker 19161;pid=19161
[1599236832] wproc: Registry request: name=Core Worker 19162;pid=19162
[1599236832] wproc: Registry request: name=Core Worker 19163;pid=19163
[1599236832] Error: Could not load module '/usr/local/nagios/bin/ndomod.o' -> /usr/local/nagios/bin/ndomod.o: cannot open shared object file: No such file or directory
[1599236832] Error: Failed to load module '/usr/local/nagios/bin/ndomod.o'.
[1599236832] Error: Module loading failed. Aborting.
What am I missing?
Proudly running:
NagiosXI 5.4.12 2 node Prod Env 2500 hosts, 13,000 services
Nagiosxi 5.5.7(test env) 2500 hosts, 13,000 services
Nagios Logserver 2 node Prod Env 500 objects sending
Nagios Network Analyser
Nagios Fusion
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: can't get the nagios service to run

Post by benjaminsmith »

Hi Ben,

In 5.7.x, Nagios XI uses a new backend database application called ndo3, and it looks like your system is still loading the older broker module.

Code: Select all

[1599236832] Error: Could not load module '/usr/local/nagios/bin/ndomod.o' -> /usr/local/nagios/bin/ndomod.o: cannot open shared object file: No such file or directory
[1599236832] Error: Failed to load module '/usr/local/nagios/bin/ndomod.o'.
[1599236832] Error: Module loading failed. Aborting.
Edit your /usr/local/nagios/etc/nagios.cfg and make sure this line is commented out:

Code: Select all

#broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
And make sure this line is uncommented:

Code: Select all

broker_module=/usr/local/nagios/bin/ndo.so /usr/local/nagios/etc/ndo.cfg
Then start the nagios service:

Code: Select all

systemctl start nagios
If you're still not able to get it started, please send us the profile. Thanks, Benjamin
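
The "cannot open shared object file" error means an active broker_module line points at a file that is not on disk. As a rough sketch, the check below lists every uncommented broker_module entry in the config and reports whether the referenced module exists. The function name `check_broker_modules` is hypothetical, and the sketch assumes module paths contain no spaces.

```shell
#!/bin/sh
# For each active (uncommented) broker_module line, report whether the module file exists.
# check_broker_modules is a hypothetical helper name, not part of Nagios.
check_broker_modules() {
    cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    [ -r "$cfg" ] || { echo "cannot read $cfg"; return 2; }
    rc=0
    # Lines beginning with '#' do not match, so commented-out modules are ignored.
    # The capture group keeps only the module path (the first word after '=').
    for module in $(sed -n -E 's/^[[:space:]]*broker_module=([^[:space:]]+).*/\1/p' "$cfg"); do
        if [ -f "$module" ]; then
            echo "OK      $module"
        else
            echo "MISSING $module"
            rc=1
        fi
    done
    return $rc
}

# Usage:
#   check_broker_modules /usr/local/nagios/etc/nagios.cfg
```

With the change described above applied, the only entry printed should be the ndo.so module, marked OK.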

To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and share it in a private message or upload it to the post/ticket, and then reply to this post to bring it up in the queue.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: can't get the nagios service to run

Post by benhank »

That seems to have done the trick. The line that you said to uncomment wasn't there, so I added it, and now the nagios service seems to be running.


However, I have this issue:

Code: Select all

[1599247086] WARNING: RLIMIT_NPROC is 63414, total max estimated processes is 160244! You should increase your limits (ulimit -u, or limits.conf)
[1599247087] NDO-3: Database initialized
[1599247089] Successfully launched command file worker with pid 1215
And nothing shows up in the monitoring events queue:
Capture.PNG
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: can't get the nagios service to run

Post by benhank »

My profile is too big; it's 42 MB.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: can't get the nagios service to run

Post by benjaminsmith »

Hi,
"That seems to have done the trick, the line that you said to uncomment wasn't there , I added it and now the nagios service seems to be running."
If everything is working, there is no need to send the profile. To raise the process limit reported in the RLIMIT_NPROC warning, edit the /etc/security/limits.conf file and add the following lines to the bottom of the file.

Code: Select all

*          soft     nproc          262144
*          hard     nproc          262144
Save the change and reboot the server for the change to take effect.
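
After the reboot it can help to confirm which nproc entries are actually in effect. The sketch below lists every active nproc line from limits.conf and from any drop-in files under /etc/security/limits.d, which pam_limits also reads. Whether this server has such a drop-in is an assumption, and `show_nproc_limits` is a hypothetical helper name.

```shell
#!/bin/sh
# List every active nproc entry from a limits file and an optional drop-in directory.
# show_nproc_limits is a hypothetical helper name used for illustration.
show_nproc_limits() {
    conf="${1:-/etc/security/limits.conf}"
    dropin_dir="${2:-/etc/security/limits.d}"
    for f in "$conf" "$dropin_dir"/*.conf; do
        # Skip missing files, including an unexpanded glob when the directory is empty.
        [ -f "$f" ] || continue
        # Ignore comment lines; keep lines whose third field (the item) is nproc.
        awk -v file="$f" '$1 !~ /^#/ && $3 == "nproc" { print file ": " $1, $2, $3, $4 }' "$f"
    done
}

# Usage:
#   show_nproc_limits
#   su - nagios -s /bin/sh -c 'ulimit -u'   # limit seen by a fresh nagios login session
```

If a drop-in file prints a lower soft value for the same domain, that entry may be the one in effect, which would explain a warning that persists after limits.conf was edited.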

Reference:
https://www.thegeekdiary.com/how-to-set ... -rhel-567/
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: can't get the nagios service to run

Post by benhank »

It's not working. I ran the commands and rebooted, but here is the result:

Code: Select all

[root@lkennagiost01 ~]# service nagios status
nagios (pid 2834) is running...
[root@lkennagiost01 ~]# killall  -9 nagios
[root@lkennagiost01 ~]# service nagios start
Starting nagios: done.
[root@lkennagiost01 ~]#
Capture.PNG
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: can't get the nagios service to run

Post by benhank »

So the service is running, but I don't think the checks are being executed properly:
Capture.PNG
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: can't get the nagios service to run

Post by benhank »

OK, I've uploaded the profile.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: can't get the nagios service to run

Post by benjaminsmith »

Hi Ben,

The nagios service is running, so checks are being processed, but the results are not being written to the database. Please take a backup or snapshot first.

https://assets.nagios.com/downloads/nag ... ios-XI.pdf

Then run the following command to truncate the status tables in the nagios database, restart the software stack, and let me know if the check results start coming in.

Truncate tables:

Code: Select all

echo "truncate table nagios_objects; truncate table nagios_hosts; truncate table nagios_hoststatus; truncate table nagios_services; truncate table nagios_servicestatus;" | mysql -u root -pnagiosxi nagios
Restart Nagios XI (CentOS 6):

Code: Select all

service crond stop
service npcd stop
service nagios stop
pkill -9 -u nagios
for i in $(ipcs -q | grep nagios |awk '{print $2}'); do ipcrm -q $i; done
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
service nagios start
service npcd start
service crond start
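
The restart sequence above removes three lock files before starting the services again. As a small sketch, the check below confirms that none of them is still present, or has reappeared, once the stack is back up. The function name `leftover_xi_locks` is hypothetical; the paths are the ones used in the commands above.

```shell
#!/bin/sh
# Report any of the Nagios XI lock files from the restart sequence that still exist.
# leftover_xi_locks is a hypothetical helper name used for illustration.
leftover_xi_locks() {
    base="${1:-/usr/local/nagiosxi}"
    found=0
    for lock in var/dbmaint.lock var/event_handler.lock scripts/reconfigure_nagios.lock; do
        if [ -e "$base/$lock" ]; then
            echo "still present: $base/$lock"
            found=1
        fi
    done
    # Returns 0 when no lock file is left, 1 otherwise.
    return $found
}

# Usage:
#   leftover_xi_locks && echo "no leftover lock files"
```

A lock file that shows up here shortly after the restart may simply belong to a job that is running at that moment, so it is worth checking again after a minute before treating it as stale.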
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: can't get the nagios service to run

Post by benhank »

I ran the commands, and now it has been stuck on pending for about an hour:
Capture.PNG
Locked