Spikes of on-demand checks

raulpe
Posts: 28
Joined: Mon Nov 25, 2013 10:44 am

Spikes of on-demand checks

Post by raulpe »

Hello,

I recently set up MRTG to record the performance of my Nagios configuration. The attached chart shows a one-day snapshot of active host checks, where the green area represents scheduled checks and the blue line represents on-demand checks.

Are the spikes in the on-demand checks normal? If not, could they be caused by assigning multiple parents to a host? If they are normal, why do they happen every hour?

Thank you.
Attachment: activeHostsChecks.png (Active Hosts Checks)
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Spikes of on-demand checks

Post by sreinhardt »

On-demand checks are generally caused by a service check failing, which schedules an immediate host check. Have you recently set up any checks that are failing on a somewhat regular basis?
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
raulpe
Posts: 28
Joined: Mon Nov 25, 2013 10:44 am

Re: Spikes of on-demand checks

Post by raulpe »

Are on-demand checks triggered by soft states, or by hard states only? I have a few services that depend on a file being available, but I have allowed plenty of time for the file to appear before the service reaches a hard state. That's the only thing I can imagine generating this many checks simultaneously.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Spikes of on-demand checks

Post by sreinhardt »

They are triggered by both, if I recall correctly. I know they are run on each soft state, but they may be used on hard states too, to determine again whether the host is down or only the service is down, for notification logic. Your case might make sense if those services return a soft warning or critical state when the file is not available. You might try creating a temp file, or some other way of recording the last time that file was available; if the file is not presently there, check the temp file to see whether it is still within a reasonable time range. Something like:

Code: Select all

if (file does not exist) {
  if (temp file exists) {
    if (temp file within 2 hours) { //return ok, no file but within range
      echo "File not found, but within time specified"
      exit 0
    }
    else { //return critical, no file and out of range
      echo "File not found, outside of time specified"
      exit 2
    }
  }
  else { //return critical, file and temp file don't exist
    echo "File not found and temp file does not exist"
    exit 2
  }
}
else { //file exists, so do normal checks and if everything is ok, set temp file; use the timestamp on the temp file for comparison
  ... some code to check file contents...
  touch /tmp/temp-file
}
Just some pseudocode that might help reduce false positives on those file checks. You would also want to shorten the soft state times and the number of checks if you are going to do this.
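For anyone who wants something closer to runnable, the same idea could be sketched as a small shell function. This is only a sketch under assumptions: the function name check_file_with_grace and its three arguments (target file, temp file, grace period in minutes) are made up for illustration, the content checks are left as a comment, and the age test relies on find's -mmin option, which GNU and BSD find support but POSIX does not require. The return codes follow the usual plugin convention of 0 for OK and 2 for CRITICAL.

```shell
#!/bin/sh
# Sketch of a file check with a grace period. All names are illustrative.

# check_file_with_grace TARGET STAMP [MAX_AGE_MIN]
# Prints a one-line status and returns 0 (OK) or 2 (CRITICAL).
check_file_with_grace() {
    target=$1
    stamp=$2
    max_age_min=${3:-120}   # default grace period: 2 hours

    if [ -e "$target" ]; then
        # File is present: the real content checks would go here.
        # Refresh the temp file so its mtime records when the file was last seen.
        touch "$stamp"
        echo "OK - file found"
        return 0
    fi

    if [ ! -e "$stamp" ]; then
        # Neither the file nor the temp file exists.
        echo "CRITICAL - file not found and temp file does not exist"
        return 2
    fi

    # find prints the temp file's path only if it was modified less than
    # max_age_min minutes ago (assumes a find that supports -mmin).
    if [ -n "$(find "$stamp" -mmin "-$max_age_min")" ]; then
        echo "OK - file not found, but within time specified"
        return 0
    fi

    echo "CRITICAL - file not found, outside of time specified"
    return 2
}
```

A wrapper plugin would source or include this function, call it with the watched file's path, /tmp/temp-file, and 120, and then exit with the function's return code so Nagios sees the right state.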