NSCA stops working sometimes

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Post by WillemDH »

Hello,

Three or four times now, roughly once a month, Nagios XI has stopped handling incoming NSCA passive checks. Restarting xinetd solves the problem, but I'd like to find out what is causing it and how to prevent it.

Which log files should I go through to search for a cause? The last event I received was at 2014-03-18 22:23:59. At 2014-03-19 00:00:00 I see a lot of informational events like "CURRENT SERVICE STATE: DUMMYSERVERNAME;EVT_Application;OK;HARD;1;eventlog found no records". After 00:00 I did not receive any more events. What is happening at 00:00?

Grtz

Willem
Nagios XI 5.8.1
https://outsideit.net
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: NSCA stops working sometimes

Post by slansing »

It's possible that at those times you are getting hammered particularly hard by returned data. I would recommend raising the following limit in /etc/xinetd.conf:

        per_source      = 
Save, and restart xinetd:

service xinetd restart
That very well could take care of your troubles.
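
To see what limit is currently in effect, and whether xinetd has actually been refusing connections, something like the following sketch may help. The function names are made up for illustration, and the "per_source_limit" search string is an assumption about how xinetd words its rejection messages in syslog, so check it against your own logs before relying on it.

```shell
#!/bin/sh
# Print the per_source value(s) set in an xinetd config file.
# Commented-out lines (starting with #) are ignored.
show_per_source() {
    awk -F= '/^[ \t]*per_source[ \t]*=/ {
        gsub(/[ \t]/, "", $2)   # strip padding around the value
        print $2
    }' "$1"
}

# Count log lines that mention a per_source rejection.
# ASSUMPTION: xinetd reports these with the text "per_source_limit".
count_per_source_rejects() {
    grep -c 'per_source_limit' "$1" || true   # grep exits 1 on zero matches
}

# Example usage (typical default paths, adjust for your system):
#   show_per_source /etc/xinetd.conf
#   count_per_source_rejects /var/log/messages
```

A non-zero count around the time the events stopped would support the theory that the limit was hit.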

Re: NSCA stops working sometimes

Post by WillemDH »

OK, I changed per_source from 10 to 20. I'll let you know in a week if the problem returns. Should 20 be sufficient? Can I monitor this?

Re: NSCA stops working sometimes

Post by slansing »

I would think so, but it is highly dependent on the data that is being returned. How many checks do you have it sending back to Nagios?

Re: NSCA stops working sometimes

Post by WillemDH »

Well, as all our Windows servers have two passive services, one for Application events and one for System events, we have about 600 services (2 * 300) sending events to Nagios through NSCA.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: NSCA stops working sometimes

Post by tmcdonald »

Again it will depend on their frequency, but if we assume the standard 5-minute interval (300 seconds) and evenly distributed checks, then 600 checks per 300 seconds works out to 2 checks per second. Of course this is not realistic, so you might have no checks one second and 15 the next. So 20 might be good, or you might have to bump it more. Just gotta wait and see.
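
As a quick sanity check on that estimate, here is the arithmetic as a small shell sketch. The helper name avg_rate is just for illustration, and it only gives the average, not the size of the bursts.

```shell
#!/bin/sh
# Average passive-result arrival rate in results per second.
#   $1 = number of passive services
#   $2 = check interval in seconds
avg_rate() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f\n", n / t }'
}

# 300 Windows servers x 2 services each, 5-minute interval
avg_rate 600 300   # prints 2.00
```

Since real traffic arrives in bursts, the per_source value needs headroom well above this average.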
Former Nagios employee
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: NSCA stops working sometimes

Post by lmiltchev »

I am not sure whether or not "per_source = 20" would be sufficient. You can use "per_source = UNLIMITED", even though according to some, it's always better to use a "fixed" value than "UNLIMITED". Have you had any issues since you modified the "/etc/xinetd.conf"?

Re: NSCA stops working sometimes

Post by WillemDH »

Well, I'll update this thread in a week or so and will let you know.

Have a fine weekend!

Re: NSCA stops working sometimes

Post by tmcdonald »

Great. Just going to post so it stays off our radar.

Re: NSCA stops working sometimes

Post by WillemDH »

Just had the same problem. It seems it worked for more than a month. I could live with that if I could at least get an alert for it. No errors are shown when Nagios stops receiving passive events. Any tips on how I could monitor this? It seems the nsca service was still running.
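
One possible approach, as a rough sketch only: an active check that goes CRITICAL when the Nagios log shows no passive result for too long. This assumes log_passive_checks=1, so that nagios.log contains lines of the form "[<epoch>] PASSIVE SERVICE CHECK: ...". The function names, the log path and the one-hour threshold in the usage comment are all placeholders, not anything from Nagios XI itself.

```shell
#!/bin/sh
# Print the epoch timestamp of the most recent passive service check
# found in a Nagios log file (prints nothing if there is none).
last_passive_epoch() {
    awk '/PASSIVE SERVICE CHECK:/ { ts = substr($1, 2, length($1) - 2) }
         END { if (ts != "") print ts }' "$1"
}

# Compare the age of the last passive result with a threshold.
#   $1 = epoch of last result, $2 = current epoch, $3 = max age in seconds
# Return codes follow the plugin convention: 0 OK, 2 CRITICAL, 3 UNKNOWN.
check_passive_age() {
    last=$1; now=$2; max_age=$3
    if [ -z "$last" ]; then
        echo "UNKNOWN - no passive check results found in log"
        return 3
    fi
    age=$((now - last))
    if [ "$age" -gt "$max_age" ]; then
        echo "CRITICAL - last passive result was ${age}s ago"
        return 2
    fi
    echo "OK - last passive result was ${age}s ago"
    return 0
}

# Example usage (hypothetical path and threshold):
#   check_passive_age "$(last_passive_epoch /usr/local/nagios/var/nagios.log)" \
#       "$(date +%s)" 3600
```

Nagios's built-in freshness checking (check_freshness with a freshness_threshold on the passive services) may be another option for the same goal.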