NSCA stops working sometimes
Posted: Wed Mar 19, 2014 3:18 am
by WillemDH
Hello,
About once a month, three or four times so far, Nagios XI has stopped handling incoming NSCA passive checks. Restarting xinetd solves the problem, but I'd like to find out what's causing this and how to prevent it.
Which log files should I go through to search for a cause? The last event I received was at 2014-03-18 22:23:59. At 2014-03-19 00:00:00 I see a lot of informational events like "CURRENT SERVICE STATE: DUMMYSERVERNAME;EVT_Application;OK;HARD;1;eventlog found no records". After 00:00, I did not receive any more events. What is happening at 00:00?
Grtz
Willem
Re: NSCA stops working sometimes
Posted: Wed Mar 19, 2014 11:56 am
by slansing
It's possible that at those times you are getting hammered particularly hard by returned data. I would recommend kicking up the per_source limit in /etc/xinetd.conf.
Save, and restart xinetd.
That very well could take care of your troubles.
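For reference, the per_source value discussed later in this thread can be read back from the configuration with a few lines of Python. This is only a sketch: the helper name `read_per_source` and the sample text are made up for illustration, and a per-service file (for example one under /etc/xinetd.d/) may set its own per_source that takes precedence over the defaults.

```python
import re

def read_per_source(conf_text):
    """Return the first per_source value found in xinetd config text, or None."""
    for line in conf_text.splitlines():
        stripped = line.split("#", 1)[0].strip()  # ignore comments
        match = re.match(r"per_source\s*=\s*(\S+)", stripped)
        if match:
            return match.group(1)
    return None

# Illustrative sample only; a real check would read /etc/xinetd.conf instead.
sample = """
defaults
{
    instances   = 50
    per_source  = 10
}
"""
print(read_per_source(sample))  # prints 10
```

The value is returned as a string because xinetd also accepts UNLIMITED here.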
Re: NSCA stops working sometimes
Posted: Fri Mar 21, 2014 8:57 am
by WillemDH
Ok, changed per_source from 10 to 20. I'll let you know in a week if the problem returns. Should 20 be sufficient? Can I monitor this?
Re: NSCA stops working sometimes
Posted: Fri Mar 21, 2014 9:14 am
by slansing
I would think so, but it is highly dependent on the data that is being returned. How many checks do you have it sending back to Nagios?
Re: NSCA stops working sometimes
Posted: Fri Mar 21, 2014 10:11 am
by WillemDH
Well, as all our Windows servers have two passive services, one for Application events and one for System events, we have about 600 (2 * 300) services sending events to Nagios through NSCA.
Re: NSCA stops working sometimes
Posted: Fri Mar 21, 2014 1:31 pm
by tmcdonald
Again it will depend on their frequency, but if we assume the standard 5-minute interval (300 seconds) and evenly-distributed checks then we are looking at 2 checks per second. Of course this is not realistic, so you might have no checks one second and 15 the next. So 20 might be good, or you might have to bump it more. Just gotta wait and see.
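The estimate above can be worked through in a small Python sketch. The 600 services and the 300-second interval come from this thread. Spreading the checks at random is an assumption made only to illustrate burstiness, and the function names are hypothetical.

```python
import random

def average_rate(services, interval_seconds):
    """Average number of check results arriving per second."""
    return services / interval_seconds

def simulate_peak(services, interval_seconds, seed=0):
    """Drop each check into a random second and return the busiest second's count."""
    rng = random.Random(seed)
    per_second = [0] * interval_seconds
    for _ in range(services):
        per_second[rng.randrange(interval_seconds)] += 1
    return max(per_second)

print(average_rate(600, 300))   # prints 2.0
print(simulate_peak(600, 300))  # busiest second in one random run
```

Keep in mind that per_source caps the number of simultaneous connections from a single source IP address, so a per-second arrival count is only a rough indicator of how close the limit is.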
Re: NSCA stops working sometimes
Posted: Fri Mar 21, 2014 1:33 pm
by lmiltchev
I am not sure whether or not "per_source = 20" would be sufficient. You can use "per_source = UNLIMITED", even though according to some, it's always better to use a "fixed" value than "UNLIMITED". Have you had any issues since you modified the "/etc/xinetd.conf"?
Re: NSCA stops working sometimes
Posted: Fri Mar 21, 2014 3:05 pm
by WillemDH
Well, I'll update this thread in a week or so and will let you know.
Have a fine weekend!
Re: NSCA stops working sometimes
Posted: Mon Mar 24, 2014 9:01 am
by tmcdonald
Great. Just going to post so it stays off our radar.
Re: NSCA stops working sometimes
Posted: Tue Jul 01, 2014 5:31 am
by WillemDH
Just had the same problem. It seems it worked for more than a month. I could live with that if I could at least get an alert for it. Is there some way to monitor this? No errors are shown when Nagios stops receiving passive events. Any tips on how I could monitor this? It seems the nsca service was still running.
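One possible way to get an alert is to watch the age of the newest passive result in the Nagios log. The sketch below makes several assumptions: log_passive_checks is enabled so that "PASSIVE SERVICE CHECK" lines are written, the log lives somewhere like /usr/local/nagios/var/nagios.log, and the function names, sample timestamps, and the 900-second threshold are all hypothetical.

```python
import re

# Nagios log lines start with an epoch timestamp in square brackets.
PASSIVE_RE = re.compile(r"^\[(\d+)\] PASSIVE SERVICE CHECK:")

def last_passive_timestamp(log_lines):
    """Return the newest epoch timestamp of a passive service check, or None."""
    latest = None
    for line in log_lines:
        match = PASSIVE_RE.match(line)
        if match:
            ts = int(match.group(1))
            if latest is None or ts > latest:
                latest = ts
    return latest

def check_passive_freshness(log_lines, now, max_age_seconds=900):
    """Return (exit_code, message) in Nagios plugin style: 0 = OK, 2 = CRITICAL."""
    latest = last_passive_timestamp(log_lines)
    if latest is None:
        return 2, "CRITICAL - no passive service check results found in log"
    age = now - latest
    if age > max_age_seconds:
        return 2, f"CRITICAL - last passive result {age}s ago"
    return 0, f"OK - last passive result {age}s ago"

# Illustrative log lines; the epoch values are invented for the example.
sample_log = [
    "[1395177839] PASSIVE SERVICE CHECK: DUMMYSERVERNAME;EVT_Application;0;eventlog found no records",
    "[1395183600] CURRENT SERVICE STATE: DUMMYSERVERNAME;EVT_Application;OK;HARD;1;eventlog found no records",
]
code, message = check_passive_freshness(sample_log, now=1395183600)
print(code, message)  # CRITICAL here: the newest passive result is older than 900s
```

Right after the nightly log rotation the file may contain no passive lines yet, so a real check would probably need to tolerate that. Nagios' built-in freshness checking (check_freshness with freshness_threshold on the passive services) may be a simpler alternative.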