
monitoring load

Posted: Wed Aug 13, 2014 10:00 am
by srikanth.kallu
Hi, just a general question.

What does this mean when monitoring load?

check_load!-a '-w 15,10,5 -c 30,20,10'!!!!!!

Thanks,
Srikanth.

Re: monitoring load

Posted: Wed Aug 13, 2014 10:08 am
by tmcdonald
That means you are calling the check_load command with the warning thresholds of 15, 10, and 5, and the critical thresholds of 30, 20, and 10. You can read more here:

http://nagios.sourceforge.net/docs/3_0/ ... ml#command
http://nagios.sourceforge.net/docs/3_0/ ... ml#service

Is this an NRPE check?

Re: monitoring load

Posted: Wed Aug 13, 2014 10:17 am
by srikanth.kallu
Yes, this is an NRPE check.

Sorry, I still don't understand "-w 15,10,5 -c 30,20,10", apart from the fact that these are the warning and critical thresholds.

My question is this: for example, with check_nrpe!check_disk!-a '-w 20% -c 10% -p /', it alerts us when / is at 80% used for warning and 90% used for critical.

In the same way, what does "check_load!-a '-w 15,10,5 -c 30,20,10'!!!!!!" mean?

Re: monitoring load

Posted: Wed Aug 13, 2014 10:25 am
by tmcdonald
It means that the warning threshold is set to:

Load of 15 over 1 minute
Load of 10 over 5 minutes
Load of 5 over 15 minutes

and the critical threshold is set to:

Load of 30 over 1 minute
Load of 20 over 5 minutes
Load of 10 over 15 minutes
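
As a rough illustration of how those two triplets are applied, the Python sketch below compares each load average with the threshold in the same position. The function name `evaluate_load`, the strict greater-than comparison, and the rule that one breached window is enough are assumptions made for this sketch, not taken from the plugin source; the sample load values are made up.

```python
# Hypothetical sketch: classify 1-, 5- and 15-minute load averages
# against "-w 15,10,5 -c 30,20,10". Not the plugin's actual code.

WARNING = (15.0, 10.0, 5.0)    # -w: 1 min, 5 min, 15 min
CRITICAL = (30.0, 20.0, 10.0)  # -c: 1 min, 5 min, 15 min


def evaluate_load(loads, warning=WARNING, critical=CRITICAL):
    """Return 'CRITICAL', 'WARNING' or 'OK' for a (1m, 5m, 15m) tuple.

    Assumption: a threshold is breached when the load average is
    strictly greater than it, and one breached window is enough.
    """
    if any(load > limit for load, limit in zip(loads, critical)):
        return "CRITICAL"
    if any(load > limit for load, limit in zip(loads, warning)):
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # Made-up sample values, one per expected state.
    print(evaluate_load((2.0, 1.5, 1.0)))
    print(evaluate_load((4.0, 6.0, 5.5)))
    print(evaluate_load((31.0, 6.0, 2.0)))
```

Under these assumptions, a short spike only matters once it passes the higher 1-minute limit, while a sustained load trips the lower 15-minute limit.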

Re: monitoring load

Posted: Thu Oct 23, 2014 10:02 am
by srikanth.kallu
What is load average?

Load average is a gauge of how many processes are, on average, concurrently demanding CPU attention.

I see something different on my performance graphs (attached). If the number of processes increases, then my load average should increase as well (although I understand that it also depends on the CPU),

but from the graphs I see the load average increase when the total number of processes is zero.

Let me know if I am missing any logic.

Re: monitoring load

Posted: Thu Oct 23, 2014 11:05 am
by slansing
The above statement is not necessarily true. The check does nothing more than look at the standard 1-, 5-, and 15-minute load averages, as you would get from top. If you want to learn about the mechanics of *nix system load metrics, take a look at a post such as this one:

http://blog.scoutapp.com/articles/2009/ ... d-averages

Re: monitoring load

Posted: Thu Oct 23, 2014 11:40 am
by srikanth.kallu
OK, top is for Linux, right? Mine is an AIX system. Do you know what it checks there?

Re: monitoring load

Posted: Thu Oct 23, 2014 11:45 am
by abrist
AIX load average statistics work the same way as they do on Linux. In fact, the behavior of the load average metric is pretty uniform across all *nix systems.
srikanth.kallu wrote: but from the graphs I see load average increase when total number of processes are zero.
I think your total processes check is not working correctly. At any given time there should always be some processes running.

Re: monitoring load

Posted: Thu Oct 23, 2014 11:55 am
by srikanth.kallu
OK. Do you know what command Nagios uses to check the number of processes on AIX? Also, there is no top command on AIX.

Re: monitoring load

Posted: Thu Oct 23, 2014 1:26 pm
by sreinhardt
Depending on what dependencies you have, check_load will either use getloadavg() from sys/loadavg.h, attempt to read the values via proc loadavg, or finally attempt to check some other file paths that I would need to look up specifically, as they are not contained directly within the plugin. check_procs, which is what would check processes, does so via the ps command and the /proc filesystem. In both cases, aside from which additional headers and functions are available, nothing specific is defined for AIX.
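
The fallback order described above can be sketched in Python. This is only an illustration of the idea, not how the C plugin is written: `os.getloadavg()` is Python's wrapper around the C library call, `/proc/loadavg` is the Linux path and may not exist on other systems, and the helper names `parse_loadavg` and `read_load` are hypothetical.

```python
import os


def parse_loadavg(text):
    """Parse the first three fields of a /proc/loadavg-style line."""
    fields = text.split()
    if len(fields) < 3:
        raise ValueError("expected at least three load average fields")
    return tuple(float(value) for value in fields[:3])


def read_load(proc_path="/proc/loadavg"):
    """Try getloadavg() first, then fall back to the proc file."""
    try:
        return os.getloadavg()
    except (AttributeError, OSError):
        # getloadavg() is missing or failed on this platform.
        with open(proc_path) as handle:
            return parse_loadavg(handle.read())


if __name__ == "__main__":
    # Made-up sample line in the Linux /proc/loadavg layout.
    print(parse_loadavg("0.42 0.36 0.30 1/123 4567"))
```

Either path yields the same three numbers (1-, 5-, and 15-minute averages), which is why the thresholds do not need to change between platforms.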