Still having trouble with npcd...
Posted: Tue Oct 18, 2011 10:57 am
by cwscribner
Hi all.
I recently moved to a new server in hopes that the extra resources would fix the npcd problem. No dice... it's still stopping pretty regularly. I'm just getting repetitive errors in the log saying NPCD returned error code 6. I've tried increasing the load threshold, the thread count, and the wait time, but it's still dying. I'm definitely in need of some help on this one... I'm stumped. I can post a log file if need be, but it's just 500 lines of the aforementioned error.
Re: Still having trouble with npcd...
Posted: Tue Oct 18, 2011 11:07 am
by mguthrie
Yeah, we've got another user who keeps getting the exact same error. It's going to take some code hunting to resolve this one, but I'll definitely be digging into it soon, since it affects production environments. Did you have any luck with the custom event handler to automatically restart the service? If not, we might write one up and post it.
Go ahead and post one instance of the error you're getting, just to make sure we're looking for the right thing ; )
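A minimal sketch of what such an event handler could look like, in Python. It assumes a Nagios service check that watches npcd and passes `$SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$` as arguments. The restart command, the "third soft attempt" threshold, and the function names are assumptions for illustration, not anything shipped with Nagios or PNP.

```python
import subprocess
import sys

# Assumed restart command; the init script path varies by install.
RESTART_CMD = ["/etc/init.d/npcd", "restart"]


def should_restart(state, state_type, attempt):
    """Restart on a HARD CRITICAL, or on the 3rd (or later) SOFT CRITICAL."""
    if state != "CRITICAL":
        return False
    if state_type == "HARD":
        return True
    return state_type == "SOFT" and attempt >= 3


def handle(state, state_type, attempt, run=subprocess.call):
    """Run the restart command if warranted; return True if it was run."""
    if should_restart(state, state_type, int(attempt)):
        run(RESTART_CMD)
        return True
    return False


# Only act when called with the three macro arguments.
if __name__ == "__main__" and len(sys.argv) == 4:
    handle(*sys.argv[1:4])
```

The `run` parameter is only there so the decision logic can be exercised without actually restarting anything.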
Re: Still having trouble with npcd...
Posted: Tue Oct 18, 2011 11:45 am
by cwscribner
Here's an output from npcd.log and npcd.cfg
Re: Still having trouble with npcd...
Posted: Tue Oct 18, 2011 12:14 pm
by cwscribner
And here's something that I just noticed that's different than the other entries.
Code:
[10-18-2011 12:54:51] NPCD: Could not create thread... exiting with error 'Cannot allocate memory'
Re: Still having trouble with npcd...
Posted: Tue Oct 18, 2011 2:45 pm
by mguthrie
Yeah that's the one I get to hunt for. I'm guessing that bug is in the C.... (mguthrie hangs head in dismay)
Re: Still having trouble with npcd...
Posted: Tue Oct 18, 2011 4:49 pm
by mguthrie
Hey just out of curiosity, how close is your system typically running to being maxed out of available memory?
Re: Still having trouble with npcd...
Posted: Tue Oct 18, 2011 6:42 pm
by cwscribner
There's 16G in it and it idles around 8G depending on how much is cached. It's been maxed out a few times, and it's common for it to hover around 10-12G depending on what the server is doing. Just for the record, it's dedicated to Nagios, so there's nothing else running. No GUI of any kind.
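Since those figures depend on how much is cached, one rough way to separate cache from memory that is really in use is to read `/proc/meminfo`. A minimal Python sketch, assuming a Linux box; the helper names are made up for illustration, and treating `MemFree + Buffers + Cached` as reclaimable is an approximation, not an exact figure.

```python
def parse_meminfo(text):
    """Return {field: value} from the contents of /proc/meminfo (mostly kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def headroom_kb(info):
    """Rough estimate: free memory plus buffers and page cache, in kB."""
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


def read_headroom(path="/proc/meminfo"):
    """Return (estimated headroom, total memory), both in kB."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return headroom_kb(info), info.get("MemTotal", 0)
```

Logging `read_headroom()` every minute or so around the time npcd dies would show whether the box is genuinely short on memory at that moment.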
Re: Still having trouble with npcd...
Posted: Tue Oct 18, 2011 6:50 pm
by cwscribner
New errors
Code:
[10-18-2011 14:41:44] NPCD: Error while get file list from spooldir (/usr/local/nagios/var/spool/perfdata/) - Cannot allocate memory
[10-18-2011 14:41:44] NPCD: Exiting...
[10-18-2011 14:41:44] NPCD: Daemon ended. PID was '27672'
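With hundreds of near-identical lines in the log, it can help to tally the messages rather than read them. A small Python sketch, assuming every line follows the `[MM-DD-YYYY HH:MM:SS] NPCD: message` layout shown above; the function name is made up for illustration.

```python
import re
from collections import Counter

LINE_RE = re.compile(r"^\[(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\] NPCD: (.*)$")


def summarize(lines):
    """Count each distinct NPCD message, plus lines mentioning an allocation failure."""
    counts = Counter()
    alloc_failures = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not an NPCD log line
        message = match.group(2)
        counts[message] += 1
        if "Cannot allocate memory" in message:
            alloc_failures += 1
    return counts, alloc_failures
```

Passing the open npcd.log file object to `summarize` and printing `counts.most_common(5)` gives a quick view of which failures dominate.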
Re: Still having trouble with npcd...
Posted: Wed Oct 19, 2011 10:20 am
by mguthrie
Thanks for the fresh post.
Looks like it's failing at a different place for the same reason (memory allocation). Let's see if it's an issue with the system limits settings or if the machine itself really is running out of memory.
Edit the /etc/security/limits.conf file in a text editor.
Add the following two lines towards the end of the file:
Code:
@nagios hard stack 20480
@nagios hard msgqueue unlimited
Then restart npcd.
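To confirm the new limits actually reach the daemon after the restart, one option is to read `/proc/<pid>/limits` for the running npcd process. The sketch below is Python and assumes a Linux `/proc` layout; the function name is made up for illustration. Note that `limits.conf` takes `stack` in KB while `/proc` reports bytes, so a hard limit of 20480 should show up as 20971520. The limits in `limits.conf` are applied by PAM at session start, so whether a daemon launched from an init script picks them up is worth checking rather than assuming.

```python
import re


def parse_limits(text):
    """Return {limit name: (soft, hard)} from the contents of /proc/<pid>/limits."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits
```

For example, take the PID from `pidof npcd`, read `/proc/<that pid>/limits`, and look at the `"Max stack size"` and `"Max msgqueue size"` entries in the result.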
Re: Still having trouble with npcd...
Posted: Wed Oct 19, 2011 12:10 pm
by cwscribner
Do you need any kind of output? I made the changes but I'm not sure what I should monitor.