nagios_logentries causing problems

Post by **Box293** » Thu Sep 08, 2016 10:43 am

chicjo01 wrote: So my guess would be, after all is said and done with both Windows and Linux:

Hosts: 2192
Services: 40000
Total: 42192 ballpark

Generally speaking, Nagios XI has bottleneck issues once you go past 20,000 objects. If your system is running out of RAM or CPU, it can cause lots of issues.

We are currently in the process of developing some KB articles to explain exactly where the specific bottlenecks can occur and how to help. These might not be available for a couple of months yet.

Honestly, I would be looking at implementing 3 x Nagios XI servers to break things up. It will give you a more stable and reliable monitoring solution.
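
To see where an installation stands against the 20,000-object figure, the host and service definitions can be counted from the object cache that Nagios Core writes at startup. This is a minimal sketch, assuming the cache is at `/usr/local/nagios/var/objects.cache` (the usual location on a source install; yours may differ). The function name `count_objects` is made up for illustration.

```shell
#!/bin/sh
# Count host and service definitions in a Nagios object cache file.
# Assumption: the cache is at /usr/local/nagios/var/objects.cache;
# pass a different path as the first argument if yours lives elsewhere.
count_objects() {
    cache="${1:-/usr/local/nagios/var/objects.cache}"
    # Match "define host {" but not "define hostgroup {", and likewise for services.
    hosts=$(grep -c '^define host[[:space:]]*{' "$cache")
    services=$(grep -c '^define service[[:space:]]*{' "$cache")
    echo "hosts=$hosts services=$services total=$((hosts + services))"
}
```

Calling `count_objects` with no argument reads the assumed default path and prints one summary line.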

Post by **chicjo01** » Thu Sep 08, 2016 11:55 am

Is this bottleneck related only to active checks, or does it also apply to passive checks, or to service checks offloaded to remote servers using DNX or something similar? Since you are currently writing up the KB articles, you may already have solutions for these bottlenecks. If you do, can you let me know what they are?

Going back to the original problem: do you have any suggestions on what can be done to correct it, or do you know why it is happening?

Post by **ssax** » Thu Sep 08, 2016 1:03 pm

You are hitting a Core/NDOUtils bug that has been fixed in later versions. I talked with our C developer and he recommends that you upgrade to the just-released versions:

Nagios Core - 4.2.1:

https://www.nagios.org/downloads/nagios-core/

NDOUtils - 2.1.1:

https://sourceforge.net/projects/nagios ... ils-2.1.1/

They BOTH need to be upgraded to work properly. If you need help with this, it would best be handled in a ticket. Please send an email to [email protected] with a descriptive subject and a detailed body that links back to this thread, and we can go from there.

Thank you

Post by **chicjo01** » Thu Sep 08, 2016 1:31 pm

Thank you for the information. Going back to the original problem, what additional information do you need from me? Or are you saying the problem I am having with "Unable to run check for service" is because of this bottleneck?

Post by **ssax** » Thu Sep 08, 2016 4:46 pm

It's the fix for the multiple kernel message queues issue. If running the previous commands fixed the problem (did they?), the upgrade would fix it permanently.

Post by **chicjo01** » Fri Sep 09, 2016 12:25 pm

The commands given before did not fix the issue. If I upgrade the specific components (Nagios Core / NDOUtils), will that cause problems for my Nagios XI when I need to update it?

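```
# Stop the monitoring stack, then clear stale state before restarting it
service nagios stop
killall -9 nagios
service ndo2db stop
service mysqld restart
# Remove the stale external command pipe
rm -f /usr/local/nagios/var/rw/nagios.cmd
# Remove every kernel message queue that ipcs lists for nagios
for i in $(ipcs -q | grep nagios | awk '{print $2}'); do ipcrm -q "$i"; done
service ndo2db start
service nagios start
```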
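
Before and after clearing the queues, it can help to see how many message queues the `nagios` user actually owns and how many messages are sitting in them. The following is a rough sketch, assuming the util-linux `ipcs -q` column layout (key, msqid, owner, perms, used-bytes, messages). The helper name `summarise_queues` is hypothetical. Unlike the `grep nagios` in the loop above, it matches on the owner column only.

```shell
#!/bin/sh
# Summarise message queues owned by a user, reading `ipcs -q` output on stdin.
# Assumption: columns are key, msqid, owner, perms, used-bytes, messages
# (the util-linux layout); other ipcs implementations may differ.
summarise_queues() {
    owner="${1:-nagios}"
    awk -v owner="$owner" '
        $3 == owner { queues++; msgs += $6 }
        END { printf "queues=%d messages=%d\n", queues, msgs }
    '
}
# Usage: ipcs -q | summarise_queues nagios
```

If the queue count stays above one after the restart sequence, that would be consistent with the multiple message queues issue described earlier in the thread.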

Post by **tgriep** » Fri Sep 09, 2016 1:56 pm

Upgrading Core and NDOUtils should help fix the kernel message queue issue you are having.
Note that if you later upgrade the XI server software, it will downgrade Core and NDOUtils.
There are other fixes you can try. If you post the following file, I can give you some suggestions.

```
/etc/sysctl.conf
```

Post by **ssax** » Fri Sep 09, 2016 1:57 pm

tgriep is correct. You would need to upgrade Core/NDOUtils again after every XI upgrade, otherwise they will be reverted.
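
Since an XI upgrade can silently revert these components, a small version check after each upgrade may be worth scripting. This is only a sketch: `version_ge` and `check_component` are illustrative names, and it relies on `sort -V` from GNU coreutils. How you obtain the installed version string is up to you.

```shell
#!/bin/sh
# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (GNU coreutils version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether an installed component meets the minimum version.
# Arguments: component name, installed version, minimum version.
check_component() {
    name="$1"; installed="$2"; minimum="$3"
    if version_ge "$installed" "$minimum"; then
        echo "$name $installed OK"
    else
        echo "$name $installed is older than $minimum, re-upgrade needed"
    fi
}
```

For example, `check_component "Nagios Core" 4.1.1 4.2.1` reports that a re-upgrade is needed, while passing `4.2.1` as the installed version reports OK. The minimums from this thread are 4.2.1 for Core and 2.1.1 for NDOUtils.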

Post by **chicjo01** » Fri Sep 09, 2016 2:57 pm

I will work on getting the two components updated, though I will have to submit a change control for it. Are there any plans to integrate the updated components into Nagios XI?

```
cat /etc/sysctl.conf
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv6.conf.default.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.ip_forward = 0
kernel.exec-shield=1
kernel.randomize_va_space = 2
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.secure_redirects = 0
kernel.msgmnb = 131072000
kernel.msgmax = 131072000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.log_martians = 1
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv6.conf.all.accept_ra = 0
net.ipv6.conf.default.accept_ra = 0
net.ipv6.conf.all.accept_redirects = 0
```
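
The message queue settings in this file are `kernel.msgmnb` and `kernel.msgmax`. There is no `kernel.msgmni` entry (the limit on the number of queues), so that one would be at whatever the kernel default is. To pull a single key out of a file like this and compare it with the live value, something along these lines may help. It is a sketch only, and `sysctl_conf_value` is a made-up helper name.

```shell
#!/bin/sh
# Print the value configured for a key in a sysctl.conf-style file.
# Handles both "key = value" and "key=value"; the last assignment wins.
# Prints nothing if the key is not set in the file.
sysctl_conf_value() {
    file="$1"; key="$2"
    awk -F= -v key="$key" '
        /^[[:space:]]*[#;]/ { next }    # skip comment lines
        {
            k = $1; v = $2
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", k)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", v)
            if (k == key) found = v
        }
        END { if (found != "") print found }
    ' "$file"
}
# Usage: sysctl_conf_value /etc/sysctl.conf kernel.msgmnb
```

The running value can be read with `sysctl -n kernel.msgmnb`. If the two differ, the file has probably not been reloaded since it was edited.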

Post by **ssax** » Mon Sep 12, 2016 9:07 am

I assume so, once they've been thoroughly tested. It's entirely up to the developers when they are updated, though.

Nagios Support Forum
