Nagios Monitoring Engine Stops On Its Own
Posted: Fri Dec 06, 2019 4:45 pm
by nicholashadaway
I have now logged in to my monitoring instance a couple of times and found the "Monitoring Engine Process" stopped, even though I am not the one stopping the engine.
Does Nagios auto-shutdown if load is too high?
What are the possible scenarios?
Re: Nagios Monitoring Engine Stops On Its Own
Posted: Fri Dec 06, 2019 5:23 pm
by ssax
There is no auto-shutdown built in. Are you seeing any segfaults or anything similar in your
/usr/local/nagios/var/nagios.log?
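A quick way to check for this is sketched below. It assumes the engine writes a line containing "Caught SIG" when it is stopped by a signal, which may not match every version, so adjust the pattern if your log reads differently. The function name find_signal_lines is made up for illustration; the default path is the log mentioned above.

```shell
#!/bin/bash
# Hypothetical helper: print log lines that suggest the engine was stopped
# by a signal or crashed. ASSUMPTION: signal shutdowns are logged with the
# text "Caught SIG"; change the pattern if your log uses other wording.
find_signal_lines() {
    local log="${1:-/usr/local/nagios/var/nagios.log}"
    # -E: extended regex, -i: case-insensitive
    grep -Ei 'caught sig|segfault' "$log"
}

# Example (run as root): show the most recent matches
# find_signal_lines | tail -n 20
```

grep exits non-zero when nothing matches, so an empty result with a failing exit status just means no such lines were found.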
Please PM a copy of your profile, you can download it from
Admin > System Profile > Download Profile.
As root, please send the output of these commands (ideally captured when it has already stopped on its own and you've taken no corrective action):
Please include the output of these commands as well (run as root):
Code:
sysctl -p
ulimit -a
chage -l nagios
su -s /bin/bash -c 'ulimit -a' nagios
su -s /bin/bash -c 'ulimit -a' mysql
If you have these files, please attach:
Code:
/etc/init.d/npcd
/etc/init.d/nagios
Additionally, please send the output of these commands (as root):
- NOTE: You may need to adjust the -h 127.0.0.1, -uroot, and -pnagiosxi options in the first command if your DB is offloaded to another server and/or you've changed the root mysql password.
Code:
echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
Then run this command:
Code:
grep mysql /usr/local/nagiosxi/html/config.inc.php | wc -l
If it outputs the number 2, run the command below as well and include the output. If it outputs anything other than 2, don't run the command. (Some XI systems use both mysql and postgresql if they were installed prior to XI 5.0 and then upgraded from there.)
Code:
echo "SELECT relname as Table, pg_size_pretty(pg_total_relation_size(relid)) As Size, pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as ExternalSize FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;" | psql nagiosxi nagiosxi
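The "only if it outputs 2" check can also be scripted so the psql query is never run by mistake. This is only a sketch: config_mentions_mysql_twice is a made-up name, and it uses grep -c, which counts matching lines the same way grep ... | wc -l does.

```shell
#!/bin/bash
# Hypothetical helper: succeed only when exactly two lines of the XI config
# mention "mysql", mirroring the `grep mysql ... | wc -l` check.
config_mentions_mysql_twice() {
    local config="${1:-/usr/local/nagiosxi/html/config.inc.php}"
    local count
    # grep -c prints the number of matching lines; fall back to 0 if the
    # file is missing or unreadable
    count=$(grep -c mysql "$config" 2>/dev/null) || count=${count:-0}
    [ "$count" -eq 2 ]
}

# Example: only run the psql size query when the check passes
# if config_mentions_mysql_twice; then
#     echo "psql query goes here" | psql nagiosxi nagiosxi
# fi
```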
Re: Nagios Monitoring Engine Stops On Its Own
Posted: Mon Dec 09, 2019 11:04 am
by ssax
Profile received. Please send all the rest of the information requested as well. The only things I'd like to see while the issue is occurring are these:
Code:
ps aux
ipcs -q
top -n3
df -h
df -i
tail /var/log/mariadb/mariadb.log
The rest of the information shouldn't change and could point us in a direction sooner so please send it/attach the files.
Looking at your profile, I'm wondering if it was your ramdisk that filled up. Grab the output above once it occurs and BEFORE you do any remediation.
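Since the output has to be captured before anything is restarted, it may help to have a small script ready ahead of time. The sketch below is one way to do it, not an official tool: the function names and the /tmp/nagios-snapshot-... directory are made up for illustration. It runs the same commands listed above, except that top gets -b (batch mode) so it can write to a file instead of a terminal.

```shell
#!/bin/bash
# Hypothetical helper: run one command and save its stdout and stderr to
# <outdir>/<name>.txt. A failing or missing command does not stop the run.
save_output() {
    local outdir="$1" name="$2"
    shift 2
    "$@" > "$outdir/$name.txt" 2>&1 || true
}

# Collect the requested diagnostics into a directory and print its path.
# Run as root, BEFORE restarting anything. The default directory name is
# an assumption; pass a different path as the first argument if you like.
collect_snapshot() {
    local outdir="${1:-/tmp/nagios-snapshot-$(date +%Y%m%d-%H%M%S)}"
    mkdir -p "$outdir" || return 1
    save_output "$outdir" ps_aux ps aux
    save_output "$outdir" ipcs_q ipcs -q
    # -b added so top runs non-interactively; -n3 is kept from the list above
    save_output "$outdir" top top -b -n3
    save_output "$outdir" df_h df -h
    save_output "$outdir" df_i df -i
    save_output "$outdir" mariadb_log tail /var/log/mariadb/mariadb.log
    echo "$outdir"
}

# Example (as root): collect_snapshot
```

Each command's output lands in its own file, so the whole directory can be zipped and attached.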
Additionally, do you see anything in
/var/log/mariadb/mariadb.log now that could indicate an issue?
Thank you!
Re: Nagios Monitoring Engine Stops On Its Own
Posted: Tue Dec 10, 2019 11:23 am
by nicholashadaway
I don't have it in an error state currently.
As soon as I catch it again, I will be glad to do this.
I did find that my test instance of Nagios failed this morning, though. Attached is a screenshot showing the event log catching a "SIGTERM".
I will send you the profile for that machine, as well as the output of the commands you specified in PM.
Thank you.
Re: Nagios Monitoring Engine Stops On Its Own
Posted: Tue Dec 10, 2019 4:57 pm
by mbellerue
It will probably be best to keep the two machines separate, because we don't know that the two instances are failing for the same reason. We'll keep this thread open and wait to hear back on the original instance of Nagios.
Re: Nagios Monitoring Engine Stops On Its Own
Posted: Thu Dec 12, 2019 5:06 pm
by ssax
Please send the output of these commands now, though:
Code:
echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
sysctl -p
ulimit -a
chage -l nagios
su -s /bin/bash -c 'ulimit -a' nagios
su -s /bin/bash -c 'ulimit -a' mysql
Only provide these if it occurs again (before any remediation):
Code:
ps aux
ipcs -q
top -n3
df -h
df -i
tail /var/log/mariadb/mariadb.log
Re: Nagios Monitoring Engine Stops On Its Own
Posted: Fri Dec 20, 2019 5:20 pm
by nicholashadaway
I need both instances to get attention. The 2nd instance exhibits the problem more often than the primary instance (the primary instance has not shut down since this ticket was opened). Can you please still respond to the data I provided relating to the 2nd instance?
Re: Nagios Monitoring Engine Stops On Its Own
Posted: Mon Dec 23, 2019 12:49 pm
by mbellerue
Absolutely. If the 2nd instance is the larger offender, let's tackle that one. ssax posted some information-gathering commands above. Can you check out his post and let us know the results?
Re: Nagios Monitoring Engine Stops On Its Own
Posted: Thu Jan 02, 2020 10:11 am
by nicholashadaway
I believe I figured out what the problem was. After performing OS updates, my Linux team would reboot the server.
After reboot, the Nagios Monitoring Engine would be in a stopped state.
It appears that the default setting is to NOT start the Nagios engine at boot.
After running "systemctl enable nagios", things work as expected after a reboot.
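For anyone who wants to check this on their own system before the next reboot, here is a minimal sketch. It assumes a systemd-based system where the engine's unit is named nagios, as on mine; starts_at_boot is just a name I made up.

```shell
#!/bin/bash
# Hypothetical helper: report whether a unit is set to start at boot.
# `systemctl is-enabled` exits 0 when the unit is enabled.
# ASSUMPTION: the unit is called "nagios"; pass another name if yours differs.
starts_at_boot() {
    local unit="${1:-nagios}"
    systemctl is-enabled "$unit" >/dev/null 2>&1
}

# Example (as root): enable the engine at boot only if it is not already
# starts_at_boot nagios || systemctl enable nagios
```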
Re: Nagios Monitoring Engine Stops On Its Own
Posted: Thu Jan 02, 2020 10:35 am
by mbellerue
Excellent, I'm glad you were able to get this resolved! Also, thank you for posting the solution back to the forums! I will go ahead and close this thread.