Nagios Monitoring Engine Stops On Its Own
-
nicholashadaway
- Posts: 31
- Joined: Thu Sep 05, 2019 1:03 pm
Nagios Monitoring Engine Stops On Its Own
I have now logged in to my monitoring instance a couple of times and found that the "Monitoring Engine Process" is stopped, even though I am not the one stopping the engine.
Does Nagios auto-shutdown if load is too high?
What are the possible scenarios?
Re: Nagios Monitoring Engine Stops On Its Own
There is no auto-shutdown built in. Are you seeing any segfaults or other errors in your /usr/local/nagios/var/nagios.log?
Please PM a copy of your profile; you can download it from Admin > System Profile > Download Profile.
As root, please send the output of these commands (ideally while the engine is already stopped on its own and before you've taken any corrective action):
Code: Select all
ps aux
ipcs -q
Please include the output of these commands as well (run as root):
Code: Select all
sysctl -p
ulimit -a
chage -l nagios
su -s /bin/bash -c 'ulimit -a' nagios
su -s /bin/bash -c 'ulimit -a' mysql
If you have these files, please attach:
Code: Select all
/etc/init.d/npcd
/etc/init.d/nagios
Additionally, please send the output of these commands (as root):
- NOTE: You may need to adjust the -h 127.0.0.1, the -uroot, and the -pnagiosxi in the first command if your DB is offloaded to another server and/or you've changed the root mysql password.
Code: Select all
echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
Then run this command:
Code: Select all
grep mysql /usr/local/nagiosxi/html/config.inc.php | wc -l
If it outputs the number 2, run the command below as well and include the output. If it outputs anything other than 2, don't run it. (Some XI systems use both mysql and postgresql if they were installed prior to XI 5.0 and then upgraded from there.)
Code: Select all
echo "SELECT relname as Table, pg_size_pretty(pg_total_relation_size(relid)) As Size, pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as ExternalSize FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;" | psql nagiosxi nagiosxi
Re: Nagios Monitoring Engine Stops On Its Own
Profile received; please send the rest of the requested information as well. The only output I need captured while the issue is occurring is this:
Code: Select all
ps aux
ipcs -q
top -n3
df -h
df -i
tail /var/log/mariadb/mariadb.log
The rest of the information shouldn't change and could point us in a direction sooner, so please send it and attach the files.
Looking at your profile, I'm wondering whether your ramdisk filled up. Grab the output above once the issue occurs and BEFORE you do any remediation.
Additionally, do you see anything in /var/log/mariadb/mariadb.log now that could indicate an issue?
Thank you!
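Since the output has to be captured before any remediation, it may help to have the commands ready in a small script. The following is only a sketch: the function name `collect_snapshot`, the output file names, and the snapshot directory are made up for illustration, and `top` is run with `-b` (batch mode) so that its output can be redirected to a file.

```shell
# Hypothetical helper: save each requested command's output to its own file
# so the state at failure time survives any later remediation.
collect_snapshot() {
    outdir="$1"
    mkdir -p "$outdir" || return 1
    # Each line below is "file name|command". stdin is redirected from
    # /dev/null so the commands cannot consume the list being read.
    while IFS='|' read -r name cmd; do
        [ -n "$name" ] || continue
        # Record failures (missing tool, missing log) instead of aborting.
        if ! sh -c "$cmd" >"$outdir/$name.txt" 2>&1 </dev/null; then
            echo "command failed: $cmd" >>"$outdir/$name.txt"
        fi
    done <<'EOF'
ps_aux|ps aux
ipcs_q|ipcs -q
top|top -b -n3
df_h|df -h
df_i|df -i
mariadb_log|tail /var/log/mariadb/mariadb.log
EOF
}
```

Run as root while the engine is still stopped, for example `collect_snapshot "/root/nagios-snapshot-$(date +%Y%m%d-%H%M%S)"`, then attach the resulting files.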
nicholashadaway
Re: Nagios Monitoring Engine Stops On Its Own
I don't have it in an error state currently.
As soon as I catch it again, I will be glad to do this.
I did find that my test instance of Nagios failed this morning, though. Attached is a screenshot showing the event log catching a "SIGTERM".
I will send you the profile for that machine, as well as the output of the commands you specified in PM.
Thank you.
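The SIGTERM seen in the event log can also be looked up directly in nagios.log (the path mentioned earlier in this thread). Nagios log lines begin with an epoch timestamp in square brackets, so a rough sketch like the one below converts those to readable UTC times. The function name `sigterm_events` is hypothetical, and `date -d @SECONDS` assumes GNU date.

```shell
# List nagios.log lines mentioning SIGTERM with readable UTC timestamps.
sigterm_events() {
    logfile="${1:-/usr/local/nagios/var/nagios.log}"
    grep 'SIGTERM' "$logfile" | while IFS= read -r line; do
        # Extract the epoch seconds between the leading brackets.
        ts=$(printf '%s\n' "$line" | sed -n 's/^\[\([0-9][0-9]*\)\].*/\1/p')
        if [ -n "$ts" ]; then
            # date -d @SECONDS is GNU date syntax (an assumption about the host).
            printf '%s %s\n' "$(date -u -d "@$ts" '+%Y-%m-%d %H:%M:%S')" "${line#*] }"
        else
            # No leading timestamp found: print the line unchanged.
            printf '%s\n' "$line"
        fi
    done
}
```

Comparing those times with anything else that happened on the server around then (for example the shutdown and reboot entries shown by `last -x`) may indicate what sent the signal.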
Re: Nagios Monitoring Engine Stops On Its Own
It will probably be best to keep the two machines separate, because we don't know that the two instances are failing for the same reason. We'll keep this thread open and wait to hear back on the original instance of Nagios.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Nagios Monitoring Engine Stops On Its Own
Please send the output of these commands now though:
Code: Select all
echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
sysctl -p
ulimit -a
chage -l nagios
su -s /bin/bash -c 'ulimit -a' nagios
su -s /bin/bash -c 'ulimit -a' mysql
Only provide these if it occurs again (before any remediation):
Code: Select all
ps aux
ipcs -q
top -n3
df -h
df -i
tail /var/log/mariadb/mariadb.log
nicholashadaway
Re: Nagios Monitoring Engine Stops On Its Own
I need both instances to get attention. The 2nd instance exhibits the problem more often than the primary instance (the primary instance has not shut down since this ticket was opened). Can you please still respond to the data I provided relating to the 2nd instance?
Re: Nagios Monitoring Engine Stops On Its Own
Absolutely. If the 2nd instance is the larger offender, let's tackle that one. ssax posted some information gathering commands above. Can you check out his post and let us know the results?
nicholashadaway
Re: Nagios Monitoring Engine Stops On Its Own
I believe I figured out what the problem was. After performing OS updates, my Linux team would reboot the server.
After reboot, the Nagios Monitoring Engine would be in a stopped state.
It appears that the default setting is to NOT start up the nagios engine at boot.
After running "systemctl enable nagios" things work as expected after a reboot.
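For anyone who wants to check this before the next reboot, here is a minimal sketch built around the same `systemctl enable nagios` fix. The function name `ensure_enabled_at_boot` and the `SYSTEMCTL` override variable are made up for illustration; `systemctl is-enabled` is the standard way to ask whether a unit starts at boot.

```shell
# Enable a unit at boot only if it is not already enabled.
# SYSTEMCTL is a hypothetical override (for dry runs); it defaults to systemctl.
ensure_enabled_at_boot() {
    unit="${1:-nagios}"
    # is-enabled exits non-zero for disabled units, so ignore its exit status
    # and look at the printed state instead.
    state="$("${SYSTEMCTL:-systemctl}" is-enabled "$unit" 2>/dev/null || true)"
    if [ "$state" = "enabled" ]; then
        echo "$unit is already enabled at boot"
    else
        echo "$unit is '${state:-unknown}'; enabling"
        "${SYSTEMCTL:-systemctl}" enable "$unit"
    fi
}
```

Run as root with `ensure_enabled_at_boot nagios`. Other services on the same machine may deserve the same check, but which ones those are depends on the install.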
Re: Nagios Monitoring Engine Stops On Its Own
Excellent, I'm glad you were able to get this resolved! Also thank you for posting the solution back to the forums! I will go ahead and close this thread.