Lots of zombie recurringdowntime processes

Locked
vAJ
Posts: 456
Joined: Thu Nov 08, 2012 5:09 pm
Location: Austin, TX

Post by vAJ »

Nagios XI 2014R2.6 running on a VM. I've been seeing this for the past few days: PROCS CRITICAL: 430 processes with STATE = RSZDT

Code:

ps -ef|grep recurring|wc -l
yields 275 lines like this:

Code:

nagios    6736  6735  0 May12 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios    7285  7257  0 May13 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios    7289  7285  0 May13 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios    7361  7329  0 May14 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios    7366  7361  0 May14 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios    7672  7648  0 May11 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios    7673  7672  0 May11 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios    7758  7723  0 May09 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios    7761  7758  0 May09 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios    8863  8837  0 May09 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios    8864  8863  0 May09 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios    9107  9081  0 May11 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios    9110  9107  0 May11 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   10027  9996  0 May09 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   10033 10027  0 May09 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   10307 10277  0 May14 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   10313 10307  0 May14 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   10740 10718  0 May13 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   10741 10740  0 May13 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   11055 11030  0 May12 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   11057 11055  0 May12 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   11293 11269  0 May10 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   11294 11293  0 May10 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   11790 10381  0 May13 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   11791 11790  0 May13 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   11935 11913  0 May11 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   11936 11935  0 May11 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   12750 12730  0 May10 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   12751 12750  0 May10 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   13703 13683  0 May13 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   13704 13703  0 May13 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   14102 14068  0 May10 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   14105 14102  0 May10 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   14166 14141  0 May13 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   14170 14166  0 May13 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   14866 14843  0 May14 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   14871 14866  0 May14 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   14889 14867  0 May13 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   14890 14889  0 May13 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   15152 15125  0 May12 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   15154 15152  0 May12 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   15354 15332  0 May10 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   15357 15354  0 May10 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   16231 16209  0 May12 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   16233 16231  0 May12 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   16604 16574  0 May10 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   16609 16604  0 May10 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
nagios   17791 17768  0 May14 ?        00:00:00 /bin/sh -c /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
nagios   17793 17791  0 May14 ?        00:00:00 /usr/bin/perl /usr/local/nagiosxi/cron/recurringdowntime.pl
Any idea how this went sideways? Other than restarting the VM, any thoughts on the best (cleanest) way to clear this up?
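
A rough sketch for narrowing this down, not a confirmed diagnosis: `ps -ef` does not print process state, and true zombies normally show `<defunct>` in the command column, so the entries above may be hung rather than zombie processes. The `grep` in the pipeline also usually counts itself. The helper below tallies processes by the first letter of their state, reading `ps -eo stat=` style output on stdin. The name `count_states` is hypothetical, made up for illustration.

```shell
# Hypothetical helper: count processes per state letter (R, S, Z, D, T, ...).
# Reads one state string per line on stdin, e.g. from `ps -eo stat=`.
count_states() {
    # Keep only the first character of the state (drops modifiers like "s" or "+"),
    # tally each letter, then sort for stable output.
    awk '{ s = substr($1, 1, 1); n[s]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage (on the affected host):
#   ps -eo stat= | count_states
# List only true zombies together with their parent PIDs:
#   ps -eo pid,ppid,stat,etime,args | awk '$3 ~ /^Z/'
```

If the `Z` count is small and most entries are in `S` or `D`, the processes are more likely stuck than defunct, which changes how they can be cleaned up.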
Andrew J. - Do you even grok?
vAJ
Posts: 456
Joined: Thu Nov 08, 2012 5:09 pm
Location: Austin, TX

Re: Lots of zombie recurringdowntime processes

Post by vAJ »

Found that a user had some old recurring schedules for hosts that no longer exist.

Removed those and now waiting to see if the zombies drop away.
Andrew J. - Do you even grok?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Lots of zombie recurringdowntime processes

Post by lmiltchev »

"Removed those and now waiting to see if the zombies drop away."
Let us know how it went. I will keep this topic open for a while.
vAJ
Posts: 456
Joined: Thu Nov 08, 2012 5:09 pm
Location: Austin, TX

Re: Lots of zombie recurringdowntime processes

Post by vAJ »

Yeah, no change at the moment. Other than just rebooting the system, any thoughts on remediation?
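
One possible alternative to a reboot, sketched under assumptions: if the processes are hung rather than defunct, the old ones can be listed and then terminated. A true zombie cannot be killed directly; it goes away only when its parent reaps it or exits. The helper name `stale_pids` and the one-hour threshold are hypothetical, and the usage lines assume a `ps` that supports the `etimes` (elapsed seconds) field, which older procps versions may lack.

```shell
# Hypothetical helper: print PIDs of recurringdowntime.pl processes older than
# a given age. Reads "PID ELAPSED_SECONDS COMMAND..." lines on stdin.
stale_pids() {
    max_age="$1"
    # $2 is the elapsed time in seconds; the regex is matched against the whole line.
    awk -v max="$max_age" '$2 + 0 > max + 0 && /recurringdowntime\.pl/ { print $1 }'
}

# Usage: review the list first, then terminate.
#   ps -eo pid=,etimes=,args= | stale_pids 3600
#   ps -eo pid=,etimes=,args= | stale_pids 3600 | xargs -r kill
```

Printing the PIDs before piping them to `kill` keeps the destructive step separate, so the list can be checked against the `ps` output above first.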
Andrew J. - Do you even grok?
vAJ
Posts: 456
Joined: Thu Nov 08, 2012 5:09 pm
Location: Austin, TX

Re: Lots of zombie recurringdowntime processes

Post by vAJ »

A reboot cleared them. Go ahead and close this. If I see more zombies on this instance over the week, I'll let you know.
Andrew J. - Do you even grok?