Checks always falling behind

grimm26
Posts: 36
Joined: Wed Dec 12, 2012 1:57 pm

Post by grimm26 »

I'm having a problem with scheduled checks always falling behind; I've seen them run as much as an hour late. This is probably a result of the number of checks (~3K feeding another ~10K passive services), the fact that most of them are SNMP walks, and that I am using NDOUtils to feed it all into MySQL. I found that NDO was hitting the ceiling of some kernel messaging parameters, so I cranked those up and the warnings seem to have stopped. I've disabled host checking since I only care about services. After restarting Nagios, things seem to be hovering around a couple of minutes late, which I can deal with.

Now, I have a check_nagios running via cron to make sure that things are flowing at all, but is there a check that I can do to check how far behind the scheduling queue is running?
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Checks always falling behind

Post by abrist »

You will probably want to pull the information from the scheduling queue CGI and grab the topmost table entry for the next check time:

Code:

http://<nagios server ip>/nagios/cgi-bin/extinfo.cgi?type=7
I came up with this one liner to get the time of the next check:

Code:

curl -s -u 'nagiosadmin:<password>' 'http://<nagios server ip>/nagios/cgi-bin/extinfo.cgi?type=7' | grep -m 2 "<TR CLASS=" | tail -n1 | awk 'BEGIN { FS = "<TD CLASS=\047queueOdd\047>|<TD CLASS=\047queueEven\047>" } ; { print $4 }' | sed 's/<.*//'
Obviously, replace <password> and <nagios server ip> with the actual values for your environment. At this point you can compare the reported date to the current date on the Nagios system and report the difference through a plugin script right to the XI interface:

Code:

#!/bin/bash

# Get the date/time of the topmost entry in the scheduling queue (the next check).
# The awk/sed tail of the pipeline reorders 'MM-DD-CCYY hh:mm:ss' into
# 'CCYY-MM-DD hh:mm:ss' so that date(1) can parse it. The URL is quoted so the
# shell does not treat '?' as a glob character.
NEXT=$(curl -s -u 'nagiosadmin:<password>' 'http://<nagios server ip>/nagios/cgi-bin/extinfo.cgi?type=7' |
    grep -m 2 "<TR CLASS=" | tail -n1 |
    awk 'BEGIN { FS = "<TD CLASS=\047queueOdd\047>|<TD CLASS=\047queueEven\047>" } ; { print $4 }' |
    sed 's/<.*//' |
    awk 'BEGIN { FS = " |-"};{ print $3,$1,$2,$4 }' | sed 's/ /-/g' | sed 's/-/ /g3')

# Convert the date/time above to unix time.
NEXTUT=$(date -d "$NEXT" +%s)

# Get the current unix time.
CURRENT=$(date +%s)

# Subtract the current time from the next check time.
# A negative result means the next check is already overdue.
OFFSET=$((NEXTUT - CURRENT))

# Echo the offset string (plus perfdata) for the Nagios status data.
echo "The scheduler is currently offset by $OFFSET seconds | offset=$OFFSET"

# Exit with 0 so that Nagios shows 'OK'.
exit 0
That was fun.
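Since the script above always exits 0, Nagios will show OK no matter how late the queue is. Below is a minimal sketch of how thresholds could be added. The function name `report_offset` and the default limits of 300 and 900 seconds are assumptions made for illustration, not something from this thread; the exit codes (0 OK, 1 WARNING, 2 CRITICAL) and the `label=value;warn;crit` perfdata layout follow the usual plugin conventions.

```shell
#!/bin/bash
# Hypothetical helper: turn the scheduler offset into a Nagios plugin result.
# offset = next_check - now, so a negative offset means the queue is behind.
# warn/crit are lateness limits in seconds (assumed defaults, tune for your site).
report_offset() {
    local offset=$1 warn=${2:-300} crit=${3:-900}
    # Lateness is only counted when the next check is already overdue.
    local late=$(( offset < 0 ? -offset : 0 ))
    local status=OK code=0
    if (( late >= crit )); then
        status=CRITICAL; code=2
    elif (( late >= warn )); then
        status=WARNING; code=1
    fi
    echo "$status - scheduler is $late seconds behind | late=${late}s;$warn;$crit"
    return "$code"
}
```

To try it, the last two commands of the script could be swapped for `report_offset "$OFFSET" 300 900; exit $?`.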
Last edited by abrist on Fri Jul 19, 2013 11:12 am, edited 2 times in total.
Reason: forgot the exit code, perfdata, and some comments . . .
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
grimm26
Posts: 36
Joined: Wed Dec 12, 2012 1:57 pm

Re: Checks always falling behind

Post by grimm26 »

I'll try that. However, even though I have disabled host checks I still see them in the scheduling queue. Does it still queue them and only check if they are enabled when it tries to run the check?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Checks always falling behind

Post by lmiltchev »

If you deactivated the check via the CCM, it should get removed from the queue.
grimm26
Posts: 36
Joined: Wed Dec 12, 2012 1:57 pm

Re: Checks always falling behind

Post by grimm26 »

lmiltchev wrote: If you deactivated the check via the CCM, it should get removed from the queue.
I disabled host checks via the web UI and also set execute_host_checks to 0 in nagios.cfg. Even after a restart, host checks still show in the queue.
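One thing that may be worth ruling out here: as far as the Nagios Core documentation describes it, when state retention is enabled the execute_host_checks value in nagios.cfg is ignored on restart in favour of the retained setting, unless use_retained_program_state is disabled. A small sketch for reading both directives follows; `cfg_value` is a hypothetical helper, and the config path in the usage note is assumed from the default source install layout.

```shell
#!/bin/bash
# Hypothetical helper: print the last value assigned to a directive in a
# Nagios main config file. Prints nothing if the directive is not set.
# Usage: cfg_value <directive> <file>
cfg_value() {
    local key=$1 file=$2
    awk -F= -v k="$key" '
        /^[[:space:]]*#/ { next }                      # skip comment lines
        { name = $1; gsub(/[[:space:]]/, "", name) }   # directive name without spaces
        index($0, "=") && name == k {
            val = substr($0, index($0, "=") + 1)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", val)
        }
        END { if (val != "") print val }
    ' "$file"
}
```

For example, `cfg_value execute_host_checks /usr/local/nagios/etc/nagios.cfg` and `cfg_value use_retained_program_state /usr/local/nagios/etc/nagios.cfg` show what the daemon will read at startup.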
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Checks always falling behind

Post by abrist »

You may need to flush retention.dat:

Code:

service nagios stop
rm /usr/local/nagios/var/retention.dat
service nagios start
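Deleting retention.dat also throws away retained state such as acknowledgements and scheduled downtime, so a more cautious variant is to move the file aside rather than remove it. This is only a sketch; `set_aside` is a hypothetical helper name and the timestamp suffix is an arbitrary choice.

```shell
#!/bin/bash
# Hypothetical helper: move a file aside with a timestamp suffix instead of
# deleting it. Prints the backup path on success, returns 1 if the file is missing.
set_aside() {
    local file=$1
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    local backup
    backup="$file.$(date +%Y%m%d%H%M%S).bak"
    mv -- "$file" "$backup" && echo "$backup"
}
```

It would take the place of the rm step: stop Nagios, run `set_aside /usr/local/nagios/var/retention.dat`, then start Nagios again.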
grimm26
Posts: 36
Joined: Wed Dec 12, 2012 1:57 pm

Re: Checks always falling behind

Post by grimm26 »

I mean new host checks are in the queue constantly. Not old ones that would be retained. It is still actively scheduling host checks.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Checks always falling behind

Post by abrist »

Is there a chance you have multiple nagios parent processes running?

Code:

service nagios stop
ps -aef | grep nagios.cfg
killall nagios
service nagios start
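The grep output can be hard to read because Nagios forks children that share the parent's command line. A rough sketch for counting only the parents is below. It assumes the daemon was started with the path to nagios.cfg as an argument and has been reparented to PID 1, which may not hold on every init system; `count_nagios_parents` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: count top-level Nagios daemons in
# `ps -eo pid,ppid,args` style input read from stdin.
# A process counts as a parent when its PPID is 1 and its command line
# mentions nagios.cfg (assumption: the daemon is a direct child of PID 1).
count_nagios_parents() {
    awk '$2 == 1 && /nagios\.cfg/ { n++ } END { print n + 0 }'
}
```

Run it as `ps -eo pid,ppid,args | count_nagios_parents`; anything above 1 suggests a stray parent process.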
grimm26
Posts: 36
Joined: Wed Dec 12, 2012 1:57 pm

Re: Checks always falling behind

Post by grimm26 »

Nope.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Checks always falling behind

Post by abrist »

Are the checks running as well as queuing, or just queuing?
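One way to tell the two cases apart is to look at last_check in the hoststatus blocks of status.dat: if it keeps advancing, the host checks are actually being executed and not just queued. A minimal sketch, assuming the usual `hoststatus { key=value ... }` block layout; `recent_host_checks` is a hypothetical helper and the status.dat path in the usage note is assumed from the default install.

```shell
#!/bin/bash
# Hypothetical helper: list hosts from a Nagios status.dat whose last_check
# is newer than a cutoff (unix time).
# Usage: recent_host_checks <status.dat> <cutoff_epoch>
# Prints "host_name last_check" for each matching hoststatus block.
recent_host_checks() {
    local file=$1 cutoff=$2
    awk -v cutoff="$cutoff" '
        /^hoststatus[[:space:]]*[{]/ { in_host = 1; name = ""; last = 0; next }
        in_host && /^[[:space:]]*[}]/ {
            if (last + 0 > cutoff + 0) print name, last
            in_host = 0; next
        }
        in_host {
            line = $0; sub(/^[[:space:]]+/, "", line)
            eq = index(line, "=")
            if (eq == 0) next
            key = substr(line, 1, eq - 1); val = substr(line, eq + 1)
            if (key == "host_name") name = val
            else if (key == "last_check") last = val
        }
    ' "$file"
}
```

For example, `recent_host_checks /usr/local/nagios/var/status.dat "$(( $(date +%s) - 600 ))"` lists hosts checked in the last ten minutes; an empty result would suggest the checks are queued but never run.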