Nagios is not processing
Posted: Tue Jul 28, 2020 3:16 am
by FCC_Nagios_Support
Hi,
Nagios is responding with a delay, and I observed that nagios.cmd is not being processed.
I got the error messages below:
service nagios status
Jul 23 19:36:19 a2nagio001p.fcc.intfcc.local nagios[580]: WARNING: RLIMIT_NPROC is 95696, total max estimated processes is 360458! You should increase your l...ts.conf)
Hint: Some lines were ellipsized, use -l to show in full.
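The warning above compares the process limit Nagios runs under with its own estimate of how many processes it may need. As a rough sketch (the function name nproc_shortfall is made up for illustration), the two numbers can be pulled out of the log line to see how far apart they are:

```shell
# Hypothetical helper: read a Nagios log line on stdin and print by how many
# processes the estimate exceeds the current RLIMIT_NPROC.
nproc_shortfall() {
  sed -n 's/.*RLIMIT_NPROC is \([0-9][0-9]*\), total max estimated processes is \([0-9][0-9]*\).*/\1 \2/p' |
    awk '{ print $2 - $1 }'
}

# Example with the numbers from the status output above.
echo 'WARNING: RLIMIT_NPROC is 95696, total max estimated processes is 360458!' | nproc_shortfall
```

The limit a running process actually has is listed as "Max processes" in /proc/<pid>/limits, which is a quick way to confirm whether a changed limit reached the Nagios daemon.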
[root@a2nagio001p ~]# service ndo2db status -l
.
..
...
Jul 24 23:50:44 a2nagio001p.fcc.intfcc.local ndo2db[20858]: Message sent to queue.
Jul 24 23:50:44 a2nagio001p.fcc.intfcc.local ndo2db[20858]: Warning: queue send error, retrying...
The message queue shown by ipcs -q keeps increasing.
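To put a number on how fast the queue is growing, the backlog can be sampled periodically. The sketch below assumes the usual Linux `ipcs -q` column layout (key, msqid, owner, perms, used-bytes, messages) and that the queue is owned by the nagios user; queue_backlog is a name made up here.

```shell
# Hypothetical helper: sum used bytes and queued messages for one owner.
# Reads `ipcs -q` output on stdin; $1 is the queue owner.
queue_backlog() {
  awk -v owner="$1" '
    $3 == owner { bytes += $5; msgs += $6 }
    END { printf "%d %d\n", bytes, msgs }
  '
}

# Sample input in the assumed layout (made-up numbers, not from this server).
queue_backlog nagios <<'EOF'
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0x4e010001 32768      nagios     600        2048         16
0x4e010002 32769      nagios     600        1024         8
0x00000000 65536      root       600        512          4
EOF
```

On the live system this would be `ipcs -q | queue_backlog nagios`, run a few times about a minute apart. If both figures keep climbing, ndo2db is not draining the queue as fast as Nagios fills it.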
Thanks and Regards
Re: Nagios is not processing
Posted: Tue Jul 28, 2020 2:29 pm
by benjaminsmith
Hi,
Those error messages are an indication that the server is not able to keep up with processing incoming check results. What is the current check load (hosts + services) on this system?
Let's increase the kernel message queue settings and the max connections for the database, then restart and let me know what kind of improvement you see.
1. To increase the kernel message queue settings, follow the steps in the kb article below:
https://support.nagios.com/kb/article.php?id=139
2. To increase the max db connections, follow this guide:
https://support.nagios.com/kb/article/n ... s-513.html
3. Then run through the commands below to do a full restart (commands are for RHEL/CentOS):
Code:
systemctl stop crond
systemctl stop npcd
systemctl stop nagios
systemctl stop ndo2db
pkill -9 -u nagios
for i in $(ipcs -q | grep nagios |awk '{print $2}'); do ipcrm -q $i; done
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
systemctl restart mariadb
systemctl start ndo2db
systemctl start nagios
systemctl start npcd
systemctl start crond
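Before letting the server run, it may be worth confirming that everything in the sequence actually came back up. A minimal sketch using `systemctl is-active --quiet` (the function name inactive_services is made up for illustration):

```shell
# Hypothetical helper: print the name of every given unit that is not active.
inactive_services() {
  for svc in "$@"; do
    systemctl is-active --quiet "$svc" || echo "$svc"
  done
}
```

Calling `inactive_services mariadb ndo2db nagios npcd crond` should print nothing if all five services are running; any name it does print is a unit to inspect with `systemctl status`.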
Let the server run for a while, then send us a fresh system profile and we'll check it out. Thanks, Benjamin
To send us your system profile:
1. Log in to the Nagios XI GUI using a web browser.
2. Click the "Admin" > "System Profile" menu.
3. Click the "Download Profile" button.
4. Save the profile.zip file and share it in a private message or upload it to the post/ticket, then reply to this post to bring it up in the queue.
Re: Nagios is not processing
Posted: Wed Jul 29, 2020 2:08 am
by FCC_Nagios_Support
Hello,
We currently have the following settings:
[root@a2nagio001p ~]# cat /etc/sysctl.conf
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
kernel.msgmnb = 131072000
kernel.msgmax = 262144000
kernel.msgmni = 512000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
[root@a2nagio001p ~]#
[root@a2nagio001p ~]# mysql -uroot -pnagiosxi -e "show variables like 'max_connections';"
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 151   |
+-----------------+-------+
[root@a2nagio001p ~]#
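Values in /etc/sysctl.conf only take effect once they are loaded (for example with `sysctl -p` or at boot), so it can help to compare the file against what the kernel is currently using. A rough sketch; conf_value and report_sysctl are names made up here, not part of any Nagios tooling:

```shell
# Hypothetical helper: print the last value assigned to key $2 in file $1.
conf_value() {
  awk -F= -v key="$2" '
    /^[ \t]*#/ { next }                    # skip comment lines
    {
      k = $1; v = $2
      gsub(/[ \t]/, "", k)                 # strip blanks around the key
      gsub(/^[ \t]+|[ \t]+$/, "", v)       # trim the value
      if (k == key) val = v
    }
    END { if (val != "") print val }
  ' "$1"
}

# Hypothetical helper: show the file value next to the running value per key.
report_sysctl() {
  file=$1; shift
  for key in "$@"; do
    printf '%s file=%s running=%s\n' "$key" "$(conf_value "$file" "$key")" "$(sysctl -n "$key")"
  done
}
```

For the settings above, `report_sysctl /etc/sysctl.conf kernel.msgmnb kernel.msgmax kernel.msgmni` would print one line per key. A mismatch between the two columns suggests the file was edited but not yet applied.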
Re: Nagios is not processing
Posted: Wed Jul 29, 2020 2:13 am
by FCC_Nagios_Support
Here is the profile.
Moderator's Note: The profile has been shared with the support team but has been removed from the public forum.
Re: Nagios is not processing
Posted: Wed Jul 29, 2020 3:08 pm
by benjaminsmith
Hi,
Thanks for the system profile. The load is higher than normal, and the database log was empty. Can you post the output of the following command to the ticket?
Code:
tail -n 100 /var/log/mariadb/mariadb.log
Also, I noticed the following process running. It's a fork of Nagios Core; what is it for?
Code:
monitor+ 7447 1 0 Jul21 ? 00:03:24 /omd/sites/monitoring/bin/naemon -ud /omd/sites/monitoring/tmp/naemon/naemon.cfg
monitor+ 7449 7447 0 Jul21 ? 00:01:26 /omd/sites/monitoring/bin/naemon --worker /omd/sites/monitoring/var/naemon/naemon.qh
monitor+ 7450 7447 0 Jul21 ? 00:01:26 /omd/sites/monitoring/bin/naemon --worker /omd/sites/monitoring/var/naemon/naemon.qh
monitor+ 7451 7447 0 Jul21 ? 00:01:26 /omd/sites/monitoring/bin/naemon --worker /omd/sites/monitoring/var/naemon/naemon.qh
The overall check load is high.
Code:
Total Hosts: 1400
Total Services: 38978
At around 20,000+ checks, we usually recommend setting up an additional XI instance. To run a check load this high successfully, some performance tuning will be necessary. A few suggestions:
1. RAMDisk
Generally, at around 10,000 combined hosts and services, consider adding a RAMDisk to the XI server. This is because much of the activity generated by the Nagios server is disk I/O, so improving performance means speeding up data transfers. You can use the top command to check I/O wait, the share of time the CPU spends waiting for I/O to complete.
For directions on installing a RAMDisk in Nagios XI, see the following guide:
Utilizing a RAM Disk in Nagios XI
2. Performance and Database Settings
Go to Admin > Performance Settings > Databases and adjust the retention settings to the smallest values you can accept. Also, check the sizes of the tables in the databases. It's not unusual for the log_entries table to become very large, resulting in high CPU usage and/or corrupted tables.
See the following guides for directions on repairing crashed tables and truncating the nagios_logentries and nagios_notifications tables.
Repairing the Nagios XI Database
Maximizing Performance in Nagios XI
3. Offload the Processing of Checks with Mod-Gearman
If the server has more than 20,000 combined hosts and services, consider integrating Mod-Gearman to reduce the impact on the Nagios XI server.
Integrating Mod-Gearman with Nagios XI
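For the I/O wait check mentioned under the RAMDisk suggestion, top shows the figure as `wa` in its CPU line. The same number can be approximated from two samples of the first line of /proc/stat. A minimal sketch, assuming the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal); iowait_pct is a name made up here:

```shell
# Hypothetical helper: percentage of CPU time spent in iowait between two
# samples. $1 and $2 are "cpu ..." lines from /proc/stat, oldest first.
iowait_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      total = 0
      for (i = 2; i <= 9 && i <= NF; i++) total += $i   # user..steal
      t[NR] = total
      w[NR] = $6                                        # iowait column
    }
    END {
      dt = t[2] - t[1]
      if (dt > 0) printf "%.1f\n", 100 * (w[2] - w[1]) / dt
      else print "0.0"
    }
  '
}
```

On the server this could be used as `a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat); iowait_pct "$a" "$b"`. A persistently high value would support moving the check result files to a RAMDisk.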
Re: Nagios is not processing
Posted: Thu Jul 30, 2020 1:57 am
by FCC_Nagios_Support
Many Thanks,
I suspect the problem is related to the system's performance resources.
Please see the attached:
log mariadb
top
nproc
cpu
mem
Re: Nagios is not processing
Posted: Thu Jul 30, 2020 1:59 am
by FCC_Nagios_Support
Is a RAMDisk equivalent to increasing the RAM of the OS?
Re: Nagios is not processing
Posted: Thu Jul 30, 2020 5:18 am
by FCC_Nagios_Support
Thank you, sir, for your help.
I understand that a RAMDisk uses part of the RAM for disk I/O operations.
We installed it.
I am pasting the output, and I hope the problem is over.
Thanks Again
Re: Nagios is not processing
Posted: Thu Jul 30, 2020 9:29 am
by FCC_Nagios_Support
The impact is heavy: Performance no piainitin a yet yes!
Many Thanks
Re: Nagios is not processing
Posted: Thu Jul 30, 2020 10:34 am
by benjaminsmith
Hi,
Sounds like it's better. Did you have any other questions?