Still having trouble with npcd...

cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Post by cwscribner »

Hi all.

I recently moved to a new server in hopes that the extra resources would fix the npcd problem. No dice... it's still stopping pretty regularly. The log just shows repetitive errors saying NPCD returned error code 6. I've tried increasing the load threshold, the thread count, and the wait time, but it's still dying. I'm definitely in need of some help on this one... I'm stumped. I can post a log file if need be, but it's just 500 lines of the aforementioned error.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Post by mguthrie »

Yeah, we've got another user who keeps getting the exact same error. It's going to take some code hunting to try and resolve this one, but I'll definitely be digging into this soon, since it affects production environments. Did you have any luck with the custom event handler to automatically restart the service? If not we might write one up and post it.

Go ahead and post one instance of the error you're getting, just to make sure we're looking for the right thing ; )
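
In the meantime, here is a minimal sketch of what such an event handler might look like. It assumes npcd is watched by a Nagios service check and that the handler is called with the $SERVICESTATE$ and $SERVICESTATETYPE$ macros as its two arguments. The function name and the RESTART_CMD variable are made up for illustration; the default restart command is the one used elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical event handler sketch: restart npcd once its service check
# reaches a HARD CRITICAL state.
# Assumed arguments: $1 = $SERVICESTATE$, $2 = $SERVICESTATETYPE$
# RESTART_CMD is an illustrative variable; adjust it for your init system.
RESTART_CMD="${RESTART_CMD:-service npcd restart}"

handle_npcd_event() {
    state="$1"
    state_type="$2"
    case "$state" in
        CRITICAL)
            # Ignore SOFT states so a single failed check does not restart npcd.
            if [ "$state_type" = "HARD" ]; then
                # Left unquoted on purpose so the command splits into words.
                $RESTART_CMD
            fi
            ;;
    esac
}

# With no arguments this is a no-op.
handle_npcd_event "$@"
```

The handler also needs permission to restart the service, since it runs as the nagios user. How that is granted (sudo or otherwise) depends on the system.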

Post by cwscribner »

Here's some output from npcd.log, plus my npcd.cfg.

Post by cwscribner »

And here's something that I just noticed that's different than the other entries.

[10-18-2011 12:54:51] NPCD: Could not create thread... exiting with error 'Cannot allocate memory'

Post by mguthrie »

Yeah that's the one I get to hunt for. I'm guessing that bug is in the C.... (mguthrie hangs head in dismay)

Post by mguthrie »

Hey, just out of curiosity, how close does your system typically run to maxing out its available memory?

Post by cwscribner »

There's 16G in it, and it idles around 8G depending on how much is cached. It's been maxed out a few times, and it's common for it to hover around 10-12G depending on what the server is doing. Just for the record, it's dedicated to Nagios, so there's nothing else running. No GUI of any kind.

Post by cwscribner »

New errors :D

[10-18-2011 14:41:44] NPCD: Error while get file list from spooldir (/usr/local/nagios/var/spool/perfdata/) - Cannot allocate memory
[10-18-2011 14:41:44] NPCD: Exiting...
[10-18-2011 14:41:44] NPCD: Daemon ended. PID was '27672'

Post by mguthrie »

Thanks for the fresh post ;)

Looks like it's failing at a different place for the same reason (memory allocation). Let's see whether it's an issue with the system limits settings or whether the machine itself really is running out of memory.

Edit the /etc/security/limits.conf file in a text editor.

Add the following two lines towards the end of the file:

@nagios         hard    stack           20480
@nagios         hard    msgqueue        unlimited

Then restart npcd:

service npcd restart
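
To check whether the restarted daemon actually picked up the new limits, one option is to read the limits the kernel reports for the running process. This is a rough sketch, assuming a Linux kernel that exposes /proc/<pid>/limits and that pgrep is installed. The show_limit helper is a made-up name for illustration. Keep in mind that limits.conf is normally applied through PAM at session start, so a daemon launched from an init script may not inherit these values.

```shell
#!/bin/sh
# show_limit FILE NAME
# Print the row for one named limit from a /proc/<pid>/limits-style file.
# Hypothetical helper for illustration; not part of Nagios or PNP.
show_limit() {
    grep -- "^$2" "$1"
}

# Example use (assumes npcd is running; pgrep -o picks the oldest match):
#   pid=$(pgrep -o -x npcd)
#   show_limit "/proc/$pid/limits" "Max stack size"
#   show_limit "/proc/$pid/limits" "Max msgqueue size"
```

The stack value in limits.conf is given in KB while /proc reports bytes, so a hard stack limit of 20480 should appear as 20971520 if it took effect.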

Post by cwscribner »

Do you need any kind of output? I made the changes but I'm not sure what I should monitor.