Nagios XI monitoring Engine dies after apply config
Posted: Tue Nov 20, 2018 12:30 pm
by aeckland1
When we apply the config in CCM, the monitoring engine fails to start. It also fails to start from the command line. I can get it to start if I stop ndo2db and then start the monitoring engine. The preflight check runs through with no errors. This system is running 3000 service checks and 300 hosts, and the DB is on a separate host. We are running Nagios Core 4.4.2 and XI 5.5.3. I have looked at this for about a day and also rebuilt the server, and I'm still seeing the same issue. Any help would be appreciated.
Thanks,
August
Re: Nagios XI monitoring Engine dies after apply config
Posted: Tue Nov 20, 2018 12:57 pm
by benjaminsmith
Hi August,
It's possible you have some crashed database tables. Depending on which database, please check the log files for error messages:
Code: Select all
/var/log/mariadb/mariadb.log
/var/log/mysqld.log
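A quick way to scan those logs is sketched below. It assumes crashed tables are reported with MySQL's usual "is marked as crashed and should be repaired" wording; the function name find_crashed_tables is only illustrative and is not part of Nagios XI.

```shell
#!/usr/bin/env bash
# Print every "marked as crashed" line found in the given log files.
# Assumption: crashed tables are logged with MySQL's usual wording,
# "... is marked as crashed and should be repaired".
# Files that are missing or unreadable are skipped silently.
find_crashed_tables() {
    local log
    for log in "$@"; do
        if [ -r "$log" ]; then
            # -H prefixes each match with the file it came from
            grep -H 'marked as crashed' "$log"
        fi
    done
    return 0
}

# Example (run on the database host, since the DB is on a separate server):
#   find_crashed_tables /var/log/mariadb/mariadb.log /var/log/mysqld.log
```

No output means no crashed-table messages were found in the logs that could be read.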
If you have any error messages related to crashed tables, run the repair script:
Code: Select all
service mysqld stop
/usr/local/nagiosxi/scripts/repairmysql.sh nagios
service mysqld start
Repairing the Nagios XI Database
https://assets.nagios.com/downloads/nag ... tabase.pdf
If this is not the issue, then please send us a system profile so we can take a closer look at your logs.
To send us your system profile, log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" Menu
Click the "Download Profile" button
Save the profile.zip file and upload it to a cloud storage service of your choice. You can share a link with me in a personal message.
After you upload the profile, please post something in this thread to bring it up in the support queue.
Re: Nagios XI monitoring Engine dies after apply config
Posted: Tue Nov 20, 2018 3:35 pm
by aeckland1
Hi,
There were no errors in the DB logs, so I will send over the profile link in a personal message.
- August
Re: Nagios XI monitoring Engine dies after apply config
Posted: Tue Nov 20, 2018 6:05 pm
by benjaminsmith
Hi August,
We took a look at your profile. Let's try to log in as nagios and run the re-configure script. Also, did you recently upgrade to a new version? If so, what version did you upgrade from?
1. Log in as the nagios user
2. Run the re-configure script
Code: Select all
/usr/local/nagiosxi/scripts/reconfigure_nagios.sh
If you're still having trouble, run the pre-flight check again and let us know if you get any errors. Also, it looks like the database is on another server. Can you post the log file for us to review? Thanks.
Pre-flight Configuration Check:
Code: Select all
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Re: Nagios XI monitoring Engine dies after apply config
Posted: Tue Nov 20, 2018 6:21 pm
by aeckland1
Still no luck, see below.
Also, this is a fresh install.
Code: Select all
bash-4.2$ service nagios status
Redirecting to /bin/systemctl status nagios.service
● nagios.service - Nagios Core 4.4.2
Loaded: loaded (/usr/lib/systemd/system/nagios.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2018-11-20 15:11:54 PST; 8min ago
Docs: https://www.nagios.org/documentation
Process: 11916 ExecStopPost=/bin/rm -f /usr/local/nagios/var/rw/nagios.cmd (code=exited, status=0/SUCCESS)
Process: 11873 ExecStop=/bin/kill -s TERM ${MAINPID} (code=exited, status=1/FAILURE)
Process: 11841 ExecStart=/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Process: 11839 ExecStartPre=/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Main PID: 11843 (code=killed, signal=ABRT)
bash-4.2$ /usr/local/nagiosxi/scripts/reconfigure_nagios.sh
--- reset_config_perms.sh ------------
> Setting CCM script permissions
> Setting script permissions
> Setting special component script permissions
> Setting configuration file/directory permissions
> Setting perfdata directory and RRD permissions
> Setting NOM checkpoint user:group permissions
> + Setting CCM configuration file user:group permissions
> + Setting Recurring Downtime file user:group permissions
> + Setting BPI configuration file user:group permissions
--------------------------------------
--- ccm_import.php -------------------
> Setting import directory: /usr/local/nagios/etc/import/
> Importing config files into the CCM
No files to import
--------------------------------------
--- ccm_export.php -------------------
> Writing CCM configuration to Nagios files
Finished writing out configuraton
--------------------------------------
--------------------------------------
> Verifying configuration with Nagios Core
> Output:
Nagios Core 4.4.2
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2018-08-16
License: GPL
Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Checked 2864 services.
Checked 360 hosts.
Checked 30 host groups.
Checked 4 service groups.
Checked 14 contacts.
Checked 12 contact groups.
Checked 171 commands.
Checked 10 time periods.
Checked 145 host escalations.
Checked 1388 service escalations.
Checking for circular paths...
Checked 360 hosts
Checked 3432 service dependencies
Checked 0 host dependencies
Checked 10 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
> Return Code: 0
--------------------------------------
bash-4.2$ service nagios status
Redirecting to /bin/systemctl status nagios.service
● nagios.service - Nagios Core 4.4.2
Loaded: loaded (/usr/lib/systemd/system/nagios.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2018-11-20 15:20:08 PST; 974ms ago
Docs: https://www.nagios.org/documentation
Process: 14355 ExecStopPost=/bin/rm -f /usr/local/nagios/var/rw/nagios.cmd (code=exited, status=0/SUCCESS)
Process: 14320 ExecStop=/bin/kill -s TERM ${MAINPID} (code=exited, status=1/FAILURE)
Process: 14277 ExecStart=/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Process: 14275 ExecStartPre=/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Main PID: 14279 (code=killed, signal=ABRT)
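In both status outputs above, the pre-flight check and ExecStart exit cleanly, and the daemon is then killed with signal=ABRT. SIGABRT generally means the process aborted itself, for example through abort() or a failed assertion. The sketch below pulls that signal out of systemctl status output so it can be checked after each start attempt; main_pid_signal is a hypothetical helper name, not a Nagios or systemd tool.

```shell
#!/usr/bin/env bash
# Read `systemctl status` output on stdin and print the signal that killed
# the main process, if there is one (for example ABRT).
# Prints nothing when the "Main PID" line does not report code=killed.
main_pid_signal() {
    sed -n 's/^[[:space:]]*Main PID: [0-9][0-9]* (code=killed, signal=\([A-Z0-9+]*\)).*/\1/p'
}

# Example:
#   systemctl status nagios.service | main_pid_signal
```

If a signal is reported, the messages logged just before the abort can be reviewed with journalctl -u nagios.service.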
Re: Nagios XI monitoring Engine dies after apply config
Posted: Wed Nov 21, 2018 9:53 am
by lmiltchev
Can you PM me your profile (profile.zip)?
Admin > System Config > System Profile > Download Profile
Also, PM me the /etc/init.d/nagios or /lib/systemd/system/nagios.service file, whichever exists on your system. Thanks!
Re: Nagios XI monitoring Engine dies after apply config
Posted: Wed Nov 21, 2018 11:35 am
by aeckland1
Sent the requested info over. Thanks.
Re: Nagios XI monitoring Engine dies after apply config
Posted: Wed Nov 21, 2018 2:29 pm
by benjaminsmith
Hi August,
I believe we have an issue with the nagios.lock file path. Please make a backup, and if you are using a VM take a snapshot.
1. Please verify that you do not have the older nagios startup script at:
/etc/init.d/nagios
If that file does not exist, stop Nagios and remove the lock file:
Code: Select all
service nagios stop
service ndo2db stop
rm -f /var/run/nagios.lock
Change the lock file path in
/usr/local/nagios/etc/nagios.cfg
Code: Select all
lock_file=/var/run/nagios.lock
CHANGE TO:
lock_file = /usr/local/nagios/var/nagios.lock
Restart Nagios
Code: Select all
service ndo2db start
service nagios start
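The lock file edit described above can also be scripted. This is a minimal sketch, assuming the config has a single lock_file line and that the new path contains no "|" or "&" characters; set_lock_file is a hypothetical helper, not something shipped with Nagios XI. It keeps a copy of the original file next to it with a .bak suffix.

```shell
#!/usr/bin/env bash
# Rewrite the lock_file directive in a Nagios main config file.
# Usage: set_lock_file <config path> <new lock file path>
# A copy of the original config is kept as <config path>.bak.
set_lock_file() {
    local cfg="$1" new_path="$2"
    # Back up first; stop if the backup cannot be written.
    cp -p "$cfg" "$cfg.bak" || return 1
    # Read from the backup and write back into the original file, so the
    # original keeps its owner and permissions.
    sed "s|^[[:space:]]*lock_file[[:space:]]*=.*|lock_file=${new_path}|" "$cfg.bak" > "$cfg"
}

# Example (with nagios and ndo2db stopped, as in the steps above):
#   set_lock_file /usr/local/nagios/etc/nagios.cfg /usr/local/nagios/var/nagios.lock
```

Lines other than lock_file are copied through unchanged, so the result can be compared against the .bak copy before the services are started again.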
If this does not correct the issue, please open a support ticket and we may schedule a remote session to correct the problem.
Backing Up Nagios XI
https://assets.nagios.com/downloads/nag ... ios-XI.pdf
Re: Nagios XI monitoring Engine dies after apply config
Posted: Mon Nov 26, 2018 11:29 am
by aeckland1
This did not work, opening a support ticket
Re: Nagios XI monitoring Engine dies after apply config
Posted: Mon Nov 26, 2018 11:34 am
by benjaminsmith
This did not work, opening a support ticket
OK. I will go ahead and close this topic.