nagios dead but subsys locked

vinothsethuram
Posts: 147
Joined: Thu Nov 07, 2013 11:44 am

nagios dead but subsys locked

Post by vinothsethuram »

Hi,

Nagios went down unexpectedly and stopped monitoring my hosts and services. The log shows the following:

Caught SIGSEGV, shutting down...

When I tried to restart Nagios, I got the following message:

nagios dead but subsys locked

Could you please let me know the reason for this issue and how we can avoid it in future?
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: nagios dead but subsys locked

Post by abrist »

Unexpected shutdowns may leave lock files. Remove:

rm /usr/local/nagios/var/nagios.lock
And then restart nagios:

service nagios restart
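
Before deleting the lock file by hand, it can be worth confirming that the PID recorded in it really belongs to a dead process. The following is a minimal sketch, not an official Nagios tool: the function name `remove_stale_lock` is made up for illustration, and it assumes the lock file holds the daemon's PID on its first line and that it is run as root (so that `kill -0` can probe the process).

```shell
#!/bin/sh
# Hypothetical helper: remove a lock file only if the PID recorded in it
# is no longer running.
# Assumption: the lock file stores the daemon's PID on its first line.
remove_stale_lock() {
    lock_file="$1"
    if [ ! -f "$lock_file" ]; then
        echo "no lock file at $lock_file"
        return 0
    fi
    # Keep only the digits of the first line.
    pid=$(head -n 1 "$lock_file" | tr -cd '0-9')
    # kill -0 sends no signal; it only checks that the process exists.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running; leaving $lock_file in place"
        return 1
    fi
    rm -f "$lock_file"
    echo "removed stale lock $lock_file"
}

# Example, using the path from the post above:
# remove_stale_lock /usr/local/nagios/var/nagios.lock && service nagios restart
```

If the function reports that the process is still running, Nagios has not actually died and the lock file should be left alone.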
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: nagios dead but subsys locked

Post by lmiltchev »

Are you using mk-livestatus? What's the output of the following command?

tail -30 /usr/local/nagios/var/nagios.log
vinothsethuram
Posts: 147
Joined: Thu Nov 07, 2013 11:44 am

Re: nagios dead but subsys locked

Post by vinothsethuram »

abrist wrote: Unexpected shutdowns may leave lock files. Remove:

rm /usr/local/nagios/var/nagios.lock

And then restart nagios:

service nagios restart

I ran the commands above. Thank you.
vinothsethuram
Posts: 147
Joined: Thu Nov 07, 2013 11:44 am

Re: nagios dead but subsys locked

Post by vinothsethuram »

lmiltchev wrote: Are you using mk-livestatus? What's the output of the following command?

tail -30 /usr/local/nagios/var/nagios.log

I ran the command above with 200 lines instead of 30, but the output contains nothing about the Nagios lock or the SIGSEGV error. Please help me understand the cause of this downtime.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: nagios dead but subsys locked

Post by slansing »

You need to share the output as requested:

tail -30 /usr/local/nagios/var/nagios.log
As well as:

tail -30 /var/log/messages
Note that whatever caused this may no longer be in the last 30 lines of the logs, if it was ever logged at all. You will most likely need to open your current /var/log/messages and look through the entries around the time the crash occurred.
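
One way to look at the log around the time of the crash, rather than only its tail, is to search for the shutdown message and print the lines leading up to it. This is a rough sketch under some assumptions: the function name `show_crash_context` is hypothetical, the search string is the "Caught SIGSEGV" message quoted earlier in this thread, and `grep` is assumed to support the `-B`/`-A` context options (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical helper: print the log lines surrounding each SIGSEGV shutdown.
#   $1 = log file to search
#   $2 = number of lines of context before each match (default 30)
show_crash_context() {
    log_file="$1"
    before="${2:-30}"
    # -B prints lines before each match, -A prints lines after it.
    grep -B "$before" -A 5 'Caught SIGSEGV' "$log_file"
}

# Example, using the log paths mentioned in this thread:
# show_crash_context /var/log/messages 30
# show_crash_context /usr/local/nagios/var/nagios.log 30
```

If the crash happened before the last log rotation, the same function can be pointed at the rotated file instead.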
vinothsethuram
Posts: 147
Joined: Thu Nov 07, 2013 11:44 am

Re: nagios dead but subsys locked

Post by vinothsethuram »

Dec 29 23:24:57 nagios nagios: wproc: 'Core Worker 5129' seems to be choked. ret = 528000; bufsize = 826406: errno = 11 (Resource temporarily unavailable)
Dec 29 23:24:57 nagios nagios: wproc: iocache_read() from Core Worker 5129 returned -1: Connection reset by peer
Dec 29 23:24:57 nagios nagios: wproc: Socket to worker Core Worker 5129 broken, removing
Dec 29 23:24:57 nagios nagios: Caught SIGSEGV, shutting down...

Will this help you analyse the cause?
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: nagios dead but subsys locked

Post by slansing »

If you can reply with the information we have been requesting, we can.
vinothsethuram
Posts: 147
Joined: Thu Nov 07, 2013 11:44 am

Re: nagios dead but subsys locked

Post by vinothsethuram »

slansing wrote: If you can reply with the information we have been requesting, we can.
You asked me to run the following command:

tail -30 /var/log/messages
I got the following output, which matches the time of the downtime:

Dec 29 23:24:57 nagios nagios: wproc: 'Core Worker 5129' seems to be choked. ret = 528000; bufsize = 826406: errno = 11 (Resource temporarily unavailable)
Dec 29 23:24:57 nagios nagios: wproc: iocache_read() from Core Worker 5129 returned -1: Connection reset by peer
Dec 29 23:24:57 nagios nagios: wproc: Socket to worker Core Worker 5129 broken, removing
Dec 29 23:24:57 nagios nagios: Caught SIGSEGV, shutting down...
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: nagios dead but subsys locked

Post by slansing »

You only have 4 lines in your /var/log/messages log? Are you absolutely sure? We asked for the last 30 lines, not just the 4 lines that deal specifically with the event.