Re: [Nagios-devel] BUG/PATCH: Runaway processes under Linux (and

[Apologies if dups are received; the original was sent from the wrong mail account.]


bruce wrote:
> On Thu, 27 Apr 2006, Andreas Ericsson wrote:
>
>> bruce wrote:
>
>>> On some systems, a rarer problem shows itself, making the solution to
>>> the Nagios issue somewhat harder. This problem is when a child
>>> process, inheriting the parent's signal handlers, receives a signal
>>> (usually SIGCHLD, sometimes SIGTERM) and then exits, taking out the
>>> parent's lock/pid file. Thus, one no longer knows which process is
>>> the legitimate parent process.
>>
>> If nagios' grandchildren (the ones that popen() commands) receive
>> SIGCHLD from anything but the check they're running, something is very,
>> very wrong with the system you're using. Are you perhaps using the old
>> and deprecated NGPT-library?
>
>
> The lock removal instead seems to be occurring with the child process
> created in my_system(), which sometimes stalls at a point before the
> signal handlers get reset (or they don't get reset, my debugging
> statements weren't fine-grained enough). When the parent sends a TERM
> signal to the child when it is in this state (due to timeout), the child
> runs the signal handlers inherited from the parent, removing the lock file.
>

BTW I haven't read your patch or the code in question, so I'm just a**
talking. But from the sounds of it:

Wouldn't it be prudent to use sigprocmask to block every signal that the
parent handles for some purpose before forking? After forking, the child
resets those signal handlers to something sane and only then unblocks.
The parent simply unblocks after the fork, and any signals that arrived
in the meantime are delivered at that point.

That way, there's no race condition...
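
A minimal sketch of the block/fork/reset/unblock sequence described above,
in C. It is not taken from the patch or from the Nagios source: the names
fork_with_signals_masked, parent_handler, idle_child and
demo_term_hits_default_action are hypothetical, and it assumes SIGTERM and
SIGCHLD are the only signals the parent handles.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the parent's handler (think: "remove lock file"). */
static volatile sig_atomic_t parent_cleanup_ran = 0;

static void parent_handler(int sig)
{
    (void)sig;
    parent_cleanup_ran = 1;
}

/* Hypothetical child body: sleeps until a signal terminates the process. */
static int idle_child(void)
{
    for (;;)
        pause();
}

/*
 * Fork with SIGTERM and SIGCHLD blocked (assumed to be the handled signals).
 * The child restores default dispositions before unblocking, so it can never
 * run a handler inherited from the parent. Returns the child's pid in the
 * parent, or -1 on failure; the child never returns from this function.
 */
static pid_t fork_with_signals_masked(int (*child_main)(void))
{
    sigset_t block, saved;
    pid_t pid;

    sigemptyset(&block);
    sigaddset(&block, SIGTERM);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &saved) == -1)
        return -1;

    pid = fork();
    if (pid == 0) {
        struct sigaction dfl;

        dfl.sa_handler = SIG_DFL;
        sigemptyset(&dfl.sa_mask);
        dfl.sa_flags = 0;
        sigaction(SIGTERM, &dfl, NULL);
        sigaction(SIGCHLD, &dfl, NULL);

        /* Only now unblock: a signal sent to the child in the meantime
         * is delivered here and takes the default action. */
        sigprocmask(SIG_SETMASK, &saved, NULL);
        _exit(child_main());
    }

    /* Parent (or fork failure): restore the old mask; anything that was
     * held pending while blocked is delivered to the parent's handlers. */
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return pid;
}

/* Returns 1 if a SIGTERM sent right after fork kills the child through the
 * default action rather than through the parent's handler. */
static int demo_term_hits_default_action(void)
{
    struct sigaction sa;
    int status;
    pid_t pid;

    sa.sa_handler = parent_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGTERM, &sa, NULL) == -1)
        return 0;

    pid = fork_with_signals_masked(idle_child);
    if (pid <= 0)
        return 0;

    kill(pid, SIGTERM);   /* may arrive before the child has unblocked */
    if (waitpid(pid, &status, 0) != pid)
        return 0;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM
           && !parent_cleanup_ran;
}
```

The point of the ordering is that the child is never in a state where the
inherited handler is both installed and deliverable: a TERM sent at any
moment after fork() either stays pending until the handlers are reset, or
arrives afterwards and takes the default action.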

David

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]