Re: nagios_logentries causing problems
Posted: Thu Sep 08, 2016 10:43 am
by Box293
chicjo01 wrote:
So my guess would be, after all is said and done with both Windows and Linux:
Hosts: 2192
Services: 40000
Total: 42192 ballpark
Generally speaking, Nagios XI has bottleneck issues once you go past 20,000 objects. If your system is running out of RAM or CPU then it can cause lots of issues.
We are currently in the process of developing some KB articles to explain exactly where the specific bottlenecks can occur and how to address them. These might not be available for a couple of months yet.
Honestly, I would be looking at implementing 3 x Nagios XI servers to break things up; it will give you a more stable and reliable monitoring solution.
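As a rough illustration of the sizing reasoning above, here is a minimal sketch of the arithmetic. The 20,000 figure is only the ballpark threshold mentioned in this post, not a hard limit, and the variable names are made up for the example. Real capacity also depends on RAM, CPU, and check intervals.

```shell
#!/bin/sh
# Rough sizing sketch using the numbers quoted in this thread.
hosts=2192
services=40000
per_server=20000   # assumed ballpark ceiling per Nagios XI server

total=$((hosts + services))

# Integer ceiling division: round up so no server exceeds the ceiling.
servers=$(( (total + per_server - 1) / per_server ))

echo "total objects: $total"
echo "servers needed: $servers"
```

With these inputs the total is 42192 objects, which rounds up to 3 servers at 20,000 objects each.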
Re: nagios_logentries causing problems
Posted: Thu Sep 08, 2016 11:55 am
by chicjo01
Is this bottleneck related only to active checks, or does it also apply to passive checks, or to service checks offloaded to remote servers using DNX or something similar? Since you are currently writing up the KB articles, you may already have solutions for these bottlenecks. If you do, can you let me know what they are?
Going back to the original problem: do you have any suggestions on what can be done to correct it, or do you know why it is happening?
Re: nagios_logentries causing problems
Posted: Thu Sep 08, 2016 1:03 pm
by ssax
You are hitting a Core/NDOUtils bug that has been fixed in later versions. I talked with our C developer and he recommends that you upgrade to the just-released versions:
Nagios Core - 4.2.1:
https://www.nagios.org/downloads/nagios-core/
NDOUtils - 2.1.1:
https://sourceforge.net/projects/nagios ... ils-2.1.1/
They BOTH need to be upgraded to work properly. If you need help with this, it would best be handled in a ticket. Please send an email to [email protected] with a descriptive subject and a detailed body, including a link back to this thread, and we can go from there.
Thank you
Re: nagios_logentries causing problems
Posted: Thu Sep 08, 2016 1:31 pm
by chicjo01
Thank you for the information. Going back to the original problem, what additional information do you need from me? Or are you saying the problem I am having with "Unable to run check for service" is because of this bottleneck?
Re: nagios_logentries causing problems
Posted: Thu Sep 08, 2016 4:46 pm
by ssax
It's the fix for the multiple kernel message queues issue. If running the previous commands fixed the problem (did they?), the upgrade would fix it permanently.
Re: nagios_logentries causing problems
Posted: Fri Sep 09, 2016 12:25 pm
by chicjo01
The commands given before did not fix the issue. If I upgrade the specific components (Nagios Core / NDOUtils), will that cause problems when I later need to update Nagios XI itself?
Code:
service nagios stop
killall -9 nagios
service ndo2db stop
service mysqld restart
rm -rf /usr/local/nagios/var/rw/nagios.cmd
for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done
service ndo2db start
service nagios start
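Before and after running the cleanup commands above, it can help to see how many kernel message queues the nagios user actually holds. A minimal sketch follows, assuming the usual Linux `ipcs -q` layout where the owner is the third column. `count_queues` is a hypothetical helper name, and the sample rows are invented for illustration.

```shell
#!/bin/sh
# Count message queues owned by a given user.
# Reads `ipcs -q` style output on stdin; assumes the owner is column 3
# (key, msqid, owner, perms, used-bytes, messages).
count_queues() {
    awk -v user="$1" '$3 == user { n++ } END { print n + 0 }'
}

# Illustrative sample only, not output from the poster's system.
count_queues nagios <<'EOF'
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0x00000001 32768      nagios     600        0            0
0x00000002 65537      nagios     600        0            0
0x00000003 98306      root       600        0            0
EOF
```

On a live server the same helper would be fed from the real command, for example `ipcs -q | count_queues nagios`. A count above one may point at the multiple-queue issue discussed in this thread, though that reading is an assumption and not something the helper can confirm.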
Re: nagios_logentries causing problems
Posted: Fri Sep 09, 2016 1:56 pm
by tgriep
Upgrading Core and NDOUtils should help fix the kernel message queue issue you are having.
If you later upgrade the XI server software, it will downgrade Core and NDOUtils.
There are other fixes you can try and if you post the following file, I can give you some suggestions.
Re: nagios_logentries causing problems
Posted: Fri Sep 09, 2016 1:57 pm
by ssax
tgriep is correct. You would need to upgrade Core/NDOUtils again after every XI upgrade, otherwise they will be reverted.
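Since an XI upgrade reverts both components, a quick version comparison after each upgrade can flag when they need to be upgraded again. This is only a sketch: `version_at_least` is a hypothetical helper, it relies on `sort -V` (available in GNU coreutils), and the installed version strings are passed in by hand here instead of being read from the daemons.

```shell
#!/bin/sh
# Succeeds if version $1 is greater than or equal to version $2.
# Uses sort -V (version sort) and checks which string sorts first.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Required versions are the ones recommended in this thread.
required_core="4.2.1"
required_ndo="2.1.1"

# Hypothetical "installed" values, for illustration only.
installed_core="4.1.1"
installed_ndo="2.1.1"

version_at_least "$installed_core" "$required_core" \
    || echo "Nagios Core $installed_core is older than $required_core"
version_at_least "$installed_ndo" "$required_ndo" \
    || echo "NDOUtils $installed_ndo is older than $required_ndo"
```

With the example values above, only the Nagios Core line is printed.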
Re: nagios_logentries causing problems
Posted: Fri Sep 09, 2016 2:57 pm
by chicjo01
I will work to get the two components updated. I will have to submit a change control for it. Are there any plans to integrate the updated components into Nagios XI?
Code:
cat /etc/sysctl.conf
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv6.conf.default.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.ip_forward = 0
kernel.exec-shield=1
kernel.randomize_va_space = 2
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.secure_redirects = 0
kernel.msgmnb = 131072000
kernel.msgmax = 131072000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.log_martians = 1
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv6.conf.all.accept_ra = 0
net.ipv6.conf.default.accept_ra = 0
net.ipv6.conf.all.accept_redirects = 0
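The message queue limits in the file above (`kernel.msgmnb`, `kernel.msgmax`) only matter if the running kernel has actually picked them up. Below is a minimal sketch for pulling one key out of a sysctl.conf-style file so it can be compared against the live value. `sysctl_conf_value` is a hypothetical helper name, and the two sample lines are copied from the posted file.

```shell
#!/bin/sh
# Print the value assigned to a key in sysctl.conf-style input (stdin).
# Accepts both "key = value" and "key=value"; the last assignment wins.
sysctl_conf_value() {
    awk -F= -v key="$1" '
        /^[ \t]*#/ { next }                  # skip comment lines
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim whitespace around key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim whitespace around value
            if (k == key) found = v
        }
        END { if (found != "") print found }
    '
}

# Two lines taken from the posted /etc/sysctl.conf.
sysctl_conf_value kernel.msgmnb <<'EOF'
kernel.msgmnb = 131072000
kernel.msgmax = 131072000
EOF
```

On the server itself, the configured value would come from `sysctl_conf_value kernel.msgmnb < /etc/sysctl.conf` and the live value from `sysctl -n kernel.msgmnb`. If the two differ, the file has probably not been loaded since it was edited.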
Re: nagios_logentries causing problems
Posted: Mon Sep 12, 2016 9:07 am
by ssax
I assume so, after they've been thoroughly tested. It's entirely up to the developers when they are updated, though.