Checks always falling behind
I'm having a problem with scheduled checks always falling behind. I've seen it as far as an hour behind. This is probably a result of the number of checks (~3K feeding another ~10K passive services), the fact that most of them are SNMP walks, and that I am using NDOutils to feed it all into MySQL. I found that NDO is hitting the ceiling of some kernel params for messaging so I cranked those up and those warnings seem to have stopped. I've disabled host checking since I only care about services. I restarted nagios and now things seem to be hovering around a couple minutes late - I can deal with that.
Now, I have a check_nagios running via cron to make sure that things are flowing at all, but is there a check that I can do to check how far behind the scheduling queue is running?
Re: Checks always falling behind
You will probably want to just pull information from the scheduling queue CGI and grab the topmost table entry for the next check time:
Code: Select all
http://<nagios server ip>/nagios/cgi-bin/extinfo.cgi?type=7
I came up with this one-liner to get the time of the next check:
Code: Select all
curl -s -u nagiosadmin:<password> "http://<nagios server ip>/nagios/cgi-bin/extinfo.cgi?type=7" | grep -m 2 "<TR CLASS=" | tail -n1 | awk 'BEGIN { FS = "<TD CLASS=\047queueOdd\047>|<TD CLASS=\047queueEven\047>" } ; { print $4 }' | sed 's/<.*//'
Obviously, replace <password> and <nagios server ip> with their actual values for your environment. At this point you can compare the date reported to the current date of the Nagios system and report it through a plugin script right to the XI interface:
Code: Select all
#!/bin/bash
# Get time/date from topmost entry in the schedule queue for the next check. Returns 'CCYY-MM-DD hh:mm:ss'.
NEXT=$(curl -s -u nagiosadmin:<password> "http://<nagios server ip>/nagios/cgi-bin/extinfo.cgi?type=7" | grep -m 2 "<TR CLASS=" | tail -n1 | awk 'BEGIN { FS = "<TD CLASS=\047queueOdd\047>|<TD CLASS=\047queueEven\047>" } ; { print $4 }' | sed 's/<.*//'| awk 'BEGIN { FS = " |-"};{ print $3,$1,$2,$4 }' | sed 's/ /-/g' | sed 's/-/ /g3')
# Convert the date/time above to unix time.
NEXTUT=$(date -d "$NEXT" +%s)
# Get current unix time
CURRENT=$(date +%s)
# Subtract current time from next check time (a negative result means the queue is running behind)
OFFSET=$(($NEXTUT - $CURRENT))
# Echo offset string for nagios status data, with perfdata after the pipe.
echo "The scheduler is currently offset by $OFFSET seconds | offset=$OFFSET"
# Exit with 0 so that Nagios shows 'OK'
exit 0
That was fun.
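The script above always exits 0, so it only graphs the offset. If you want it to alert as well, something along these lines could work. This is only a sketch: the function name offset_state and the threshold arguments are made up for illustration, and it assumes the same sign convention as the script (OFFSET is negative when the next check is overdue).

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): map the scheduler
# offset to a Nagios plugin state.
#   $1 = offset in seconds (negative means the next check is overdue)
#   $2 = warning threshold, seconds overdue
#   $3 = critical threshold, seconds overdue
offset_state() {
    local offset=$1 warn=$2 crit=$3
    local late=$(( -offset ))   # seconds the topmost queue entry is overdue
    if [ "$late" -ge "$crit" ]; then
        echo "CRITICAL"; return 2
    elif [ "$late" -ge "$warn" ]; then
        echo "WARNING"; return 1
    fi
    echo "OK"; return 0
}
```

In the plugin you would replace the final echo and exit with something like STATE=$(offset_state "$OFFSET" 60 300); RC=$? and then print "$STATE" in front of the message and exit with $RC. The 60 and 300 second thresholds are just placeholders.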
Last edited by abrist on Fri Jul 19, 2013 11:12 am, edited 2 times in total.
Reason: forgot the exit code, perfdata, and some comments . . .
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Checks always falling behind
I'll try that. However, even though I have disabled host checks I still see them in the scheduling queue. Does it still queue them and only check if they are enabled when it tries to run the check?
Re: Checks always falling behind
If you deactivated the check via the CCM, it should get removed from the queue.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Checks always falling behind
lmiltchev wrote: If you deactivated the check via the CCM, it should get removed from the queue.
I disabled host checks via the web UI and also set execute_host_checks to 0 in nagios.cfg. Even after a restart, host checks still show in the queue.
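For anyone following along, a quick way to confirm what the config file actually says is to list every uncommented occurrence of the directive. This is just a sketch: show_directive is a made-up helper name, and the nagios.cfg path in the usage line is an assumption based on a default source install.

```shell
#!/bin/bash
# Hypothetical helper: print every uncommented "name=value" line for a
# directive in a Nagios config file. Commented-out lines (starting with #)
# do not match because the pattern is anchored at the start of the line.
#   $1 = directive name, $2 = path to the config file
show_directive() {
    local name=$1 file=$2
    grep -E "^[[:space:]]*${name}[[:space:]]*=" "$file"
}
```

Usage would be something like show_directive execute_host_checks /usr/local/nagios/etc/nagios.cfg. If it prints more than one line, or a value other than 0, that is worth sorting out before looking anywhere else.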
Re: Checks always falling behind
You may need to flush retention.dat:
Code: Select all
service nagios stop
rm /usr/local/nagios/var/retention.dat
service nagios start
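Before deleting retention.dat it may be worth looking at what it holds. As far as I know, when use_retained_program_status is enabled Nagios restores program-wide flags from the retention file at startup, which could explain a nagios.cfg change not sticking. The sketch below prints the program block of the file. The helper name program_block is made up, and the assumption that the block contains a line such as active_host_checks_enabled should be checked against your own file.

```shell
#!/bin/bash
# Hypothetical helper: print the contents of the "program { ... }" block
# of a Nagios retention file, one setting per line.
#   $1 = path to retention.dat
program_block() {
    awk '
        /^program[ \t]*[{]/     { inblk = 1; next }   # start of the block
        inblk && /^[ \t]*[}]/   { exit }              # end of the block
        inblk                   { sub(/^[ \t]+/, ""); print }
    ' "$1"
}
```

For example, program_block /usr/local/nagios/var/retention.dat | grep host_checks should show whether host checks were saved as enabled. Run it while Nagios is stopped so the file is not rewritten underneath you.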
Re: Checks always falling behind
I mean new host checks are in the queue constantly. Not old ones that would be retained. It is still actively scheduling host checks.
Re: Checks always falling behind
Is there a chance you have multiple nagios parent processes running?
Code: Select all
service nagios stop
ps -aef | grep nagios.cfg
killall nagios
service nagios start
Re: Checks always falling behind
Are the checks running as well as queuing, or just queuing?