Hello Team,
Our Nagios XI server is frequently getting overloaded. I have attached the system profile; please also find the screenshot below.
Please suggest a fix. I suspect MRTG is the issue.
TOP command output:
top
top - 15:54:40 up 38 days, 4:51, 2 users, load average: 160.69, 134.33, 135.83
Tasks: 1116 total, 180 running, 936 sleeping, 0 stopped, 0 zombie
Cpu(s): 70.8%us, 28.4%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.8%si, 0.0%st
Mem: 65978484k total, 13197912k used, 52780572k free, 297036k buffers
Swap: 33554428k total, 2916k used, 33551512k free, 10255616k cached
Nagios server getting over load
-
manimurugesan
- Posts: 145
- Joined: Wed Oct 03, 2018 9:15 am
Nagios server getting over load
Last edited by benjaminsmith on Tue Jul 30, 2019 4:50 pm, edited 1 time in total.
Reason: saved profile
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Nagios server getting over load
Hi @manimurugesan,
Have you made any changes to the server or Apache configurations recently? After looking over the system profile, I see a number of issues.
1. The error you're seeing in the uploaded image, specifically "Error: Could not parse XML output from https://server", could be caused by an SSL setting on the server. Follow the document below to make sure all the settings are correct.
How to Configure SSL/TLS
2. Run the database repair script. The profile contains this error:
./nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed
Log in as root and run the following command:
Code: Select all
/usr/local/nagiosxi/scripts/repair_databases.sh
3. Post the output of the following command to check the size of your database tables.
Code: Select all
echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -uroot -pnagiosxi --table
4. The server can't resolve the license server, so you may have a DNS issue. The profile contains this error:
Couldn't resolve host 'api.nagios.com'
What is the output of the following command?
Code: Select all
nslookup api.nagios.com
5. Lastly, run the following to restart Nagios and clear the message queue.
Code: Select all
service crond stop
service npcd stop
service nagios stop
service ndo2db stop
pkill -9 -u nagios
for i in $(ipcs -q | grep nagios |awk '{print $2}'); do ipcrm -q $i; done
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
service mysqld restart
service ndo2db start
service nagios start
service npcd start
service crond start
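If you want to see which message queues the loop above would remove before running it, something like the following may help. It is only a sketch: queue_ids_for_owner is a hypothetical helper name, and it assumes the usual Linux `ipcs -q` column layout (key, msqid, owner, perms, used-bytes, messages). It matches the owner column exactly rather than grepping the whole line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the msqid of every message queue whose
# owner column equals the given user. Reads `ipcs -q` style output on
# stdin, assumed layout: key msqid owner perms used-bytes messages
queue_ids_for_owner() {
    local owner="$1"
    awk -v owner="$owner" '$3 == owner { print $2 }'
}

# Usage on a live system (not executed here):
#   ipcs -q | queue_ids_for_owner nagios
```

After the restart, the same pipeline should print few or no IDs if the queues were cleared.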
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
manimurugesan
- Posts: 145
- Joined: Wed Oct 03, 2018 9:15 am
Re: Nagios server getting over load
Hello benjamin,
Please find the output below. I have also attached the output of the command that checks the size of the database tables.
nslookup api.nagios.com
Server: Name server (the server listed in /etc/resolv.conf)
Address: Name server
** server can't find api.nagios.com: NXDOMAIN
I ran the database repair, but the server load is still high.
Could you please suggest what action needs to be taken?
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Nagios server getting over load
Hello @manimurugesan,
After further review, you have SELinux enabled on the server and this is preventing mrtg from functioning properly.
Aug 9 10:20:27 MPHSCRLS0739 setroubleshoot: failed to retrieve rpm info for /var/lib/mrtg
Aug 9 10:20:27 MPHSCRLS0739 setroubleshoot: SELinux is preventing /usr/bin/perl from write access on the directory /var/lib/mrtg. For complete SELinux messages. run sealert -l 64b1ed28-d309-4034-aac4-eaa7bff2a4dd
See: Disabling SELinux
We also noticed that you have GNOME installed on the server, and this may result in decreased performance. We recommend a clean, minimal installation for Nagios XI.
Also, you have DNS set up internally and should configure an external DNS server. Nagios XI cannot call out to the licensing server (api.nagios.com).
If you continue to experience high load, please run the following top command and post the full output so we can review the processes.
Code: Select all
top -n 1
Thanks.
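If it is easier to attach a file, top's batch mode (-b) is meant for redirecting output. As a rough complement, the sketch below tallies processes by command name so a pile-up of one program stands out. count_by_command is a hypothetical helper name, not part of Nagios XI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count identical lines on stdin and print
# "<count> <name>", largest count first. Intended input is one
# command name per line, e.g. from `ps -eo comm=`.
count_by_command() {
    awk '{ n[$0]++ } END { for (c in n) print n[c], c }' | sort -k1,1nr -k2
}

# Usage on a live system (not executed here):
#   top -b -n 1 > /tmp/top.txt              # batch mode, one iteration
#   ps -eo comm= | count_by_command | head  # most numerous commands
```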
-
manimurugesan
- Posts: 145
- Joined: Wed Oct 03, 2018 9:15 am
Re: Nagios server getting over load
Hello benjamin,
We have tried all the commands you gave us, but the issue persists. I also checked the SELinux status, and it is disabled. Please find the output below.
# sestatus
SELinux status: disabled
I have attached the output of top -n 1. Please let us know what action we need to take.
Re: Nagios server getting over load
The output of the top command shows that the top four processes at that time were all MRTG processes, so we need to look at that.
Can you run the following commands as root and post the /tmp/mrtg.txt file here?
Code: Select all
LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg -debug=cfg,base,log &> /tmp/mrtg.txt
LANG=C LC_ALL=C /usr/bin/mrtg &>> /tmp/mrtg.txt
LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg.lock --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios &>> /tmp/mrtg.txt
{ time LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg 2>&1 ; } 2>> /tmp/mrtg.txt
The following entries in the /var/log/messages file
Code: Select all
Aug 9 10:20:27 MPHSCRLS0739 setroubleshoot: failed to retrieve rpm info for /var/lib/mrtg
Aug 9 10:20:27 MPHSCRLS0739 setroubleshoot: SELinux is preventing /usr/bin/perl from write access on the directory /var/lib/mrtg. For complete SELinux messages. run sealert -l 64b1ed28-d309-4034-aac4-eaa7bff2a4dd
Aug 9 10:20:27 MPHSCRLS0739 python: SELinux is preventing /usr/bin/perl from write access on the directory /var/lib/mrtg.#012#012***** Plugin restorecon (94.8 confidence) suggests ************************#012#012If you want to fix the label. #012/var/lib/mrtg default label should be mrtg_var_lib_t.#012Then you can run restorecon.#012Do#012# /sbin/restorecon -v /var/lib/mrtg#012#012***** Plugin catchall_labels (5.21 confidence) suggests *******************#012#012If you want to allow perl to have write access on the mrtg directory#012Then you need to change the label on /var/lib/mrtg#012Do#012# semanage fcontext -a -t FILE_TYPE '/var/lib/mrtg'#012where FILE_TYPE is one of the following: httpd_sys_content_t, mrtg_lock_t, mrtg_log_t, mrtg_var_lib_t, var_lock_t, var_log_t, var_run_t.#012Then execute:#012restorecon -v '/var/lib/mrtg'#012#012#012***** Plugin catchall (1.44 confidence) suggests **************************#012#012If you believe that perl should be allowed write access on the mrtg directory by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'mrtg' --raw | audit2allow -M my-mrtg#012# semodule -i my-mrtg.pp#012
are coming from the setroubleshootd daemon that is running on your server.
Code: Select all
setroub+ 20792 1 13 10:24 ? 00:00:01 /usr/bin/python -Es /usr/sbin/setroubleshootd -f
You should configure SELinux to allow the MRTG process to access the /var/lib/mrtg folder. The denied access could be the cause of the high load from the MRTG processes.
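The long setroubleshoot entry quoted above is hard to read because syslog has escaped each newline as #012 (the octal code for a line feed). A small sketch like this may make the suggested commands easier to pick out. decode_syslog_newlines is a hypothetical helper name, and the sketch assumes the literal text #012 only appears as an escaped newline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn every "#012" escape on stdin back into a
# real newline so multi-line messages become readable again.
decode_syslog_newlines() {
    awk '{ gsub(/#012/, "\n"); print }'
}

# Usage on a live system (not executed here):
#   grep 'SELinux is preventing' /var/log/messages | decode_syslog_newlines
```

In the decoded text, each suggested command appears on its own line after a line that reads "Do".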