Nagios Monitoring Engine Stops On Its Own

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
nicholashadaway
Posts: 31
Joined: Thu Sep 05, 2019 1:03 pm

Nagios Monitoring Engine Stops On Its Own

Post by nicholashadaway »

I have now logged in to my monitoring instance a couple of times and found that the "Monitoring Engine Process" is stopped, even though I am not the one stopping the engine.

Does Nagios shut down automatically if load is too high?
What are the possible scenarios?
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Nagios Monitoring Engine Stops On Its Own

Post by ssax »

There is no auto-shutdown built in. Are you seeing any segfaults or anything else unusual in your /usr/local/nagios/var/nagios.log?
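One way to run that check is sketched below. It is only a sketch: the function name `scan_engine_log` is made up for illustration, and the search patterns are assumptions about how the engine words its shutdown and crash messages, so adjust them to whatever your log actually contains.

```shell
#!/usr/bin/env bash
# Print log lines that suggest the engine crashed or was told to stop.
# The default path is the one mentioned in the thread; the patterns are
# guesses at typical wording, not an exhaustive or official list.
scan_engine_log() {
    local log="${1:-/usr/local/nagios/var/nagios.log}"
    grep -Ei 'segfault|caught sig|shutting down|successfully shutdown' "$log"
}

# Example (run as root, or as a user that can read the log):
#   scan_engine_log
#   scan_engine_log /path/to/an/archived/nagios.log
```

If this prints nothing, the stop may not have been logged by the engine itself, and the system journal or /var/log/messages would be the next place to look.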

Please PM a copy of your profile; you can download it from Admin > System Profile > Download Profile.

As root, please send the output of these commands (ideally when the engine has already stopped on its own and you've taken no corrective action):

Code: Select all

ps aux
ipcs -q
Please include the output of these commands as well (run as root):

Code: Select all

sysctl -p
ulimit -a
chage -l nagios
su -s /bin/bash -c 'ulimit -a' nagios
su -s /bin/bash -c 'ulimit -a' mysql
If you have these files, please attach:

Code: Select all

/etc/init.d/npcd
/etc/init.d/nagios
Additionally, please send the output of these commands (as root):
- NOTE: You may need to adjust the -h 127.0.0.1, the -uroot, and -pnagiosxi in the first command if your DB is offloaded to another server and/or you've changed the root mysql password

Code: Select all

echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
Then run this command:

Code: Select all

grep mysql /usr/local/nagiosxi/html/config.inc.php | wc -l
If it outputs the number 2, run the command below as well and include the output. If it outputs anything other than 2, don't run the command. (Some XI systems use both MySQL and PostgreSQL if they were installed prior to XI 5.0 and then upgraded from there.)

Code: Select all

echo "SELECT relname as Table, pg_size_pretty(pg_total_relation_size(relid)) As Size, pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as ExternalSize FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;" | psql nagiosxi nagiosxi
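The "only if it outputs 2" rule above can be wrapped in a small guard so the PostgreSQL query is not run by accident. This is a sketch under assumptions: the function name `should_query_postgres` is hypothetical, and the default path is simply the one used in the grep command above.

```shell
#!/usr/bin/env bash
# Succeed only when the config file has exactly two lines mentioning
# "mysql", which is the condition given above for running the psql query.
should_query_postgres() {
    local config="${1:-/usr/local/nagiosxi/html/config.inc.php}"
    local count
    # grep -c counts matching lines, the same number that
    # `grep mysql FILE | wc -l` reports.
    count="$(grep -c mysql "$config" 2>/dev/null)"
    [ "${count:-0}" -eq 2 ]
}

# Example:
#   if should_query_postgres; then
#       echo "run the psql command from the post above"
#   else
#       echo "skip the psql command"
#   fi
```

A missing or unreadable config file counts as "do not run", which seemed the safer default for a diagnostic step.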
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Nagios Monitoring Engine Stops On Its Own

Post by ssax »

Profile received. Please send all the rest of the requested information as well. The only things I need captured while the issue is occurring are these:

Code: Select all

ps aux
ipcs -q
top -n3
df -h
df -i
tail /var/log/mariadb/mariadb.log
The rest of the information shouldn't change and could point us in a direction sooner, so please send it and attach the files.

Looking at your profile, I'm wondering if your ramdisk filled up. Grab the output above once it occurs and BEFORE you do any remediation.

Additionally, do you see anything in /var/log/mariadb/mariadb.log now that could indicate an issue?
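To spot a full ramdisk (or any other full filesystem) quickly in the `df` output, a filter along these lines may help. It is a sketch: `flag_full_filesystems` is a made-up name, the 90% default is an arbitrary threshold, and it assumes mount points contain no spaces.

```shell
#!/usr/bin/env bash
# Read `df -P` (or `df -P -i`) output on stdin and print every mount
# point whose usage is at or above the given percentage (default 90).
flag_full_filesystems() {
    local threshold="${1:-90}"
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "97%" -> "97"
            if (use + 0 >= limit) print $6 " " $5
        }'
}

# Example:
#   df -P    | flag_full_filesystems 90   # disk space
#   df -P -i | flag_full_filesystems 90   # inodes
```

The `-P` flag keeps each filesystem on a single line, which is what makes the column positions reliable here.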

Thank you!
nicholashadaway
Posts: 31
Joined: Thu Sep 05, 2019 1:03 pm

Re: Nagios Monitoring Engine Stops On Its Own

Post by nicholashadaway »

I don't have it in an error state currently.
As soon as I catch it again, I will be glad to do this.

I did find that my test instance of Nagios failed this morning, though. Attached is a screenshot showing the event log catching a "SIGTERM".

I will send you the profile for that machine, as well as the output of the commands you specified in PM.

Thank you.
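A SIGTERM means something asked the engine to stop, so the time of the event matters. Assuming the same entry also appears in nagios.log, and that log lines start with an epoch timestamp in square brackets, a filter like the following could make those times readable for comparison with other system logs. The function name `humanize_nagios_log` is hypothetical, and `date -d @SECONDS` assumes GNU date.

```shell
#!/usr/bin/env bash
# Rewrite "[EPOCH] message" lines on stdin as "[YYYY-MM-DD HH:MM:SS UTC] message".
# Lines that do not start with a bracketed number are passed through unchanged.
humanize_nagios_log() {
    local line stamp
    local re='^\[([0-9]+)\] (.*)$'
    while IFS= read -r line || [ -n "$line" ]; do
        if [[ $line =~ $re ]]; then
            # GNU date: "@N" means N seconds since the Unix epoch.
            stamp="$(date -u -d "@${BASH_REMATCH[1]}" '+%Y-%m-%d %H:%M:%S UTC')"
            printf '[%s] %s\n' "$stamp" "${BASH_REMATCH[2]}"
        else
            printf '%s\n' "$line"
        fi
    done
}

# Example:
#   grep SIGTERM /usr/local/nagios/var/nagios.log | humanize_nagios_log
```

Times are printed in UTC to keep the output unambiguous; drop the `-u` and the literal "UTC" if local time is easier to line up with your other logs.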
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Nagios Monitoring Engine Stops On Its Own

Post by mbellerue »

It will probably be best to keep the two machines separate, because we don't know that the two instances are failing for the same reason. We'll keep this thread open and wait to hear back on the original instance of Nagios.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Nagios Monitoring Engine Stops On Its Own

Post by ssax »

Please send the output of these commands now though:

Code: Select all

echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
sysctl -p
ulimit -a
chage -l nagios
su -s /bin/bash -c 'ulimit -a' nagios
su -s /bin/bash -c 'ulimit -a' mysql
Only provide these if it occurs again (before any remediation):

Code: Select all

ps aux
ipcs -q
top -n3
df -h
df -i
tail /var/log/mariadb/mariadb.log
nicholashadaway
Posts: 31
Joined: Thu Sep 05, 2019 1:03 pm

Re: Nagios Monitoring Engine Stops On Its Own

Post by nicholashadaway »

I need both instances to get attention. The 2nd instance exhibits the problem more often than the primary instance, which has not shut down since this ticket was opened. Can you please still respond to the data I provided relating to the 2nd instance?
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Nagios Monitoring Engine Stops On Its Own

Post by mbellerue »

Absolutely. If the 2nd instance is the larger offender, let's tackle that one. ssax posted some information gathering commands above. Can you check out his post and let us know the results?
nicholashadaway
Posts: 31
Joined: Thu Sep 05, 2019 1:03 pm

Re: Nagios Monitoring Engine Stops On Its Own

Post by nicholashadaway »

I believe I figured out what the problem was. After performing OS updates, my Linux team would reboot the server.
After reboot, the Nagios Monitoring Engine would be in a stopped state.

It appears that the default setting is to NOT start up the nagios engine at boot.
After running "systemctl enable nagios" things work as expected after a reboot.
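For anyone hitting the same symptom, the boot setting can be checked before the next reboot rather than after it. The sketch below wraps `systemctl is-enabled`; the function name `boot_state` is made up, and it assumes a systemd-based host like the one in this thread.

```shell
#!/usr/bin/env bash
# Print "<unit>: <state>" using `systemctl is-enabled`, and return
# success only when the state is exactly "enabled".
boot_state() {
    local unit="${1:-nagios}" state
    state="$(systemctl is-enabled "$unit" 2>/dev/null)"
    printf '%s: %s\n' "$unit" "${state:-unknown}"
    [ "$state" = "enabled" ]
}

# Example (npcd is the other service whose init script was requested above):
#   for unit in nagios npcd; do
#       boot_state "$unit" || echo "  -> consider: systemctl enable $unit"
#   done
```

Note that `systemctl enable` only changes what happens at boot; it does not start a service that is currently stopped.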
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Nagios Monitoring Engine Stops On Its Own

Post by mbellerue »

Excellent, I'm glad you were able to get this resolved! Also thank you for posting the solution back to the forums! I will go ahead and close this thread.