This is regarding the usage of check_load for Linux servers.
Code:
[root@nagxi libexec]# ./check_load --help
check_load v1991 (nagios-plugins 1.4.13)
Copyright (c) 1999 Felipe Gustavo de Almeida <galmeida@linux.ime.usp.br>
Copyright (c) 1999-2007 Nagios Plugin Development Team
<nagiosplug-devel@lists.sourceforge.net>
This plugin tests the current system load average.
Usage:check_load [-r] -w WLOAD1,WLOAD5,WLOAD15 -c CLOAD1,CLOAD5,CLOAD15
Options:
-h, --help
Print detailed help screen
-V, --version
Print version information
-w, --warning=WLOAD1,WLOAD5,WLOAD15
Exit with WARNING status if load average exceeds WLOADn
-c, --critical=CLOAD1,CLOAD5,CLOAD15
Exit with CRITICAL status if load average exceed CLOADn
the load average format is the same used by "uptime" and "w"
-r, --percpu
Divide the load averages by the number of CPUs (when possible)
When I run it with three different sets of arguments, it gives the same output:
Code:
[root@nagxi libexec]# ./check_load -w 5 -c 7
WARNING - load average: 5.90, 2.96, 2.36|load1=5.900;5.000;7.000;0; load5=2.960;5.000;7.000;0; load15=2.360;5.000;7.000;0;
[root@nagxi libexec]# ./check_load -w 15,10,5 -c 30,20,10
OK - load average: 5.51, 2.93, 2.35|load1=5.510;15.000;30.000;0; load5=2.930;10.000;20.000;0; load15=2.350;5.000;10.000;0;
[root@nagxi libexec]# ./check_load -w 15 -c 30
OK - load average: 5.51, 2.93, 2.35|load1=5.510;15.000;30.000;0; load5=2.930;15.000;30.000;0; load15=2.350;15.000;30.000;0;
[root@nagxi libexec]# w
15:58:17 up 192 days, 3:21, 3 users, load average: 5.15, 2.89, 2.34
All of them give the same output.
This server has 8 CPUs.
The question is:
How is it that the above commands give the same output?
Is it that the 5-minute and 15-minute average load thresholds are optional?
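One way to check this is to look at the performance data after the `|` in the output above: with `-w 5 -c 7`, each of load1, load5 and load15 shows 5.000 and 7.000, which suggests the single value was applied to all three averages. A minimal Python sketch that pulls the thresholds out of that string (the helper name `parse_perfdata` is made up for illustration and is not part of the plugin):

```python
def parse_perfdata(output):
    """Return {label: (value, warn, crit)} from one plugin output line."""
    # Performance data is everything after the first '|' in the output.
    _, _, perfdata = output.partition("|")
    result = {}
    for item in perfdata.split():
        label, _, fields = item.partition("=")
        parts = fields.split(";")
        # The first three fields are value, warning and critical thresholds.
        value, warn, crit = (float(p) for p in parts[:3])
        result[label] = (value, warn, crit)
    return result

# Output line copied from the "./check_load -w 5 -c 7" run above.
line = ("WARNING - load average: 5.90, 2.96, 2.36|"
        "load1=5.900;5.000;7.000;0; load5=2.960;5.000;7.000;0; "
        "load15=2.360;5.000;7.000;0;")

for label, (value, warn, crit) in parse_perfdata(line).items():
    print(label, "value:", value, "warn:", warn, "crit:", crit)
```

Running the same helper on the `-w 15,10,5 -c 30,20,10` output would show a different threshold pair for each average.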
Now comes my actual question.
If I have 8 cores, should I give the arguments as check_load -w 5 -c 7 to get warning at approximately 70% and critical at approximately 90%? Or do 5 and 7 mean 50% and 70%?
Are these arguments calculated from the number of cores and the required threshold percentages?
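To make the arithmetic in this question concrete: the help text above says that `-r` divides the load averages by the number of CPUs. A small Python sketch of that division, using the 8 CPUs and the load values from the output above. The function name `per_cpu_load` is made up for illustration, and reading a per-CPU load of 1.0 as roughly "100%" is a rule-of-thumb assumption, not something the plugin states:

```python
def per_cpu_load(load_averages, cpu_count):
    """Divide each load average by the CPU count, as -r is described to do."""
    return [load / cpu_count for load in load_averages]

# 1, 5 and 15 minute averages from the first check_load run above.
loads = [5.90, 2.96, 2.36]

for minutes, value in zip((1, 5, 15), per_cpu_load(loads, 8)):
    print(f"{minutes:>2} min: {value:.2f} per CPU")
```

Under that assumption, a 1-minute load of 5.90 on 8 CPUs is roughly 0.74 per CPU.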
A few websites say that the value is the load, not the number of CPUs or a percentage, and that for some processes/application servers CPU above 250% can still be OK or only a warning.
Can someone please explain how to use and understand the check_load arguments when given as single values instead of three averages? Most of my configs use the single-value format.
And how should I calculate them if a server has 16 CPU cores?
Usage example for one server that has 16 processor cores:
Code:
define service {
    host_name               host1
    service_description     Load Average
    use                     xiwizard_generic_service
    servicegroups           CPU,UnixCPU
    check_command           check_nrpe!check_load!-a '-w 7 -c 9'
    max_check_attempts      3
    check_interval          15
    retry_interval          1
    check_period            24x7
    notification_interval   0
    notification_period     24x7
    contact_groups          msatoc,unixserveradmin
    _xiwizard               nrpe
    register                1
}
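For the 16-core case, one possible way to arrive at single-value thresholds is to pick a per-core ratio and multiply it by the core count. This is only a sketch under that assumption: the ratios 0.7 and 0.9 come from the "70% and 90%" idea in the question, and `load_thresholds` is a hypothetical helper, not part of check_load:

```python
def load_thresholds(cores, warn_ratio, crit_ratio):
    """Scale per-core ratios to absolute load thresholds (assumed rule of thumb)."""
    warn = round(cores * warn_ratio, 1)
    crit = round(cores * crit_ratio, 1)
    return warn, crit

# 16 cores, warning at 0.7 per core and critical at 0.9 per core (assumed ratios).
warn, crit = load_thresholds(16, 0.7, 0.9)
print(f"-w {warn} -c {crit}")
```

With 8 cores the same ratios give 5.6 and 7.2, close to the `-w 5 -c 7` used in the first run above. Whether those ratios suit a given server depends on its workload.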
Thanks.