Service frequently goes to UNKNOWN state
Posted: Wed Mar 21, 2018 8:45 am
Hello Team,
Please help.
We are monitoring Windows CPU usage with the counter below in a VBScript check. It is configured on almost all of our Windows servers.
Set objWMIService = GetObject("winmgmts:" _
& "Win32_PerfFormattedData_PerfOS_Processor." _
& "name='_Total'")
cpuused = objWMIService.PercentProcessorTime
On most of the servers, the service alternates between OK and UNKNOWN on every check, as the logs below show.
Mar 21 04:35:10 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;HARD;3;OK:Processor(_Total)%Processor Time:0
Mar 21 04:46:10 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 05:00:52 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 05:11:28 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 05:26:03 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 05:37:16 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 05:51:57 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 06:02:52 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 06:18:21 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 06:29:39 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
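To put a number on the flapping, the alert lines above can be parsed and the gaps between UNKNOWN results measured. This is only a minimal sketch, assuming the syslog layout shown in the log; the names parse_alert and unknown_gaps_minutes are made up for illustration, and the year is an assumption because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Assumed layout: "<Mon DD HH:MM:SS> <loghost> nagios: SERVICE ALERT: host;service;state;type;attempt;output"
ALERT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+nagios:\s+SERVICE ALERT:\s+"
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>[^;]+);(?P<attempt>\d+);(?P<output>.*)$"
)


def parse_alert(line, year=2018):
    """Return the alert fields as a dict, or None if the line does not match."""
    m = ALERT_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    # Syslog omits the year, so it is supplied by the caller (assumption).
    fields["time"] = datetime.strptime(f"{year} {fields.pop('ts')}", "%Y %b %d %H:%M:%S")
    fields["attempt"] = int(fields["attempt"])
    return fields


def unknown_gaps_minutes(lines):
    """Minutes between consecutive UNKNOWN alerts, in log order."""
    alerts = (parse_alert(line) for line in lines)
    times = [a["time"] for a in alerts if a and a["state"] == "UNKNOWN"]
    return [round((b - a).total_seconds() / 60, 1) for a, b in zip(times, times[1:])]


# Three lines copied from the log above.
SAMPLE = [
    "Mar 21 04:46:10 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent",
    "Mar 21 05:00:52 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0",
    "Mar 21 05:11:28 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent",
]
print(unknown_gaps_minutes(SAMPLE))
```

Run over the full log, this would show whether the UNKNOWN results recur at a regular interval, which may help tell a scheduling or timeout pattern apart from random network loss.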
Entry in nsclient.ini:
check_cpu_mem=cscript.exe /NoLogo scripts\\custom\\check_cpu_mem.vbe $ARG1$ $ARG2$ $ARG3$ $ARG4$
But when I run the check command manually, everything looks fine:
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:2
|'\Processor(_Total)\% Processor Time'=2;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
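Since the manual runs above were done back to back, they may not hit the same conditions as the scheduled checks. A rough sketch of repeating the same command at a fixed spacing and recording the state and elapsed time of each run follows. The helper names run_check and sample are hypothetical, and the count and interval in the usage comment are assumptions, not the real check interval; the exit-code mapping is the standard Nagios plugin convention.

```python
import subprocess
import time

# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(cmd, timeout=30):
    """Run one check; return (state, elapsed_seconds, first_output_line)."""
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        code, out = proc.returncode, proc.stdout.strip()
    except subprocess.TimeoutExpired:
        # Treat a local timeout as UNKNOWN, as a plugin would.
        code, out = 3, "timed out after %ss" % timeout
    elapsed = time.monotonic() - start
    first_line = out.splitlines()[0] if out else ""
    return STATES.get(code, "UNKNOWN"), round(elapsed, 2), first_line


def sample(cmd, count, interval, timeout=30):
    """Run the check `count` times, sleeping `interval` seconds between runs."""
    results = []
    for i in range(count):
        results.append(run_check(cmd, timeout))
        if i < count - 1:
            time.sleep(interval)
    return results


# Usage (command copied from the session above; count and interval are assumed values):
# cmd = ["/usr/local/nagios/libexec/check_nrpe", "-H", "10.209.41.158", "-p", "56660",
#        "-t", "30", "-c", "check_cpu_mem", "-a", "CPU", "90", "95", "5"]
# for state, elapsed, line in sample(cmd, count=12, interval=300):
#     print(state, elapsed, line)
```

If some runs come back UNKNOWN or take close to the 30-second timeout, that would suggest the script is sometimes slow to return rather than the agent being unreachable.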