Hello Team,
Please help.
We are monitoring the Windows CPU service using the counter below in a VBScript. It is configured on almost all of our Windows servers.
Set objWMIService = GetObject("winmgmts:" _
& "Win32_PerfFormattedData_PerfOS_Processor." _
& "name='_Total'")
cpuused = objWMIService.PercentProcessorTime
On most of the servers, the service flips between OK and UNKNOWN on every check, as shown in the logs below.
Mar 21 04:35:10 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;HARD;3;OK:Processor(_Total)%Processor Time:0
Mar 21 04:46:10 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 05:00:52 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 05:11:28 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 05:26:03 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 05:37:16 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 05:51:57 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 06:02:52 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 06:18:21 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 06:29:39 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
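To make the pattern in these alerts easier to see, here is a minimal Python sketch that parses the SERVICE ALERT lines above and measures the time between successive UNKNOWN results. The function names are made up for illustration, and the year is an assumption (syslog lines do not carry one; 2018 is taken from the nsclient.log timestamps).

```python
import re
from datetime import datetime

# Log lines copied from the post above.
LOG = """\
Mar 21 04:35:10 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;HARD;3;OK:Processor(_Total)%Processor Time:0
Mar 21 04:46:10 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 05:00:52 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 05:11:28 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 05:26:03 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 05:37:16 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 05:51:57 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 06:02:52 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
Mar 21 06:18:21 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;OK;SOFT;2;OK:Processor(_Total)%Processor Time:0
Mar 21 06:29:39 saclx127 nagios: SERVICE ALERT: HKDNT877;COUNTER-Processor_Total-Processor-Time;UNKNOWN;SOFT;1;Unable to establish communication with Agent
"""

# timestamp, logging host, then host;service;state;state type;attempt;output
ALERT = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ nagios: SERVICE ALERT: "
    r"([^;]+);([^;]+);([^;]+);([^;]+);(\d+);(.*)$"
)


def parse_alerts(text, year=2018):
    """Return (timestamp, state, state_type) for each SERVICE ALERT line.

    The year is an assumption: syslog timestamps do not include it.
    """
    alerts = []
    for line in text.splitlines():
        match = ALERT.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        alerts.append((stamp, match.group(4), match.group(5)))
    return alerts


def gaps_between(alerts, state):
    """Return the time deltas between consecutive alerts in the given state."""
    times = [stamp for stamp, alert_state, _ in alerts if alert_state == state]
    return [later - earlier for earlier, later in zip(times, times[1:])]


for gap in gaps_between(parse_alerts(LOG), "UNKNOWN"):
    print(gap)
```

For the lines above, the UNKNOWN results are roughly 25 to 27 minutes apart. That regularity may hint at something periodic on the agent or network side, but it does not identify the cause by itself.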
Entry in nsclient.ini
check_cpu_mem=cscript.exe /NoLogo scripts\\custom\\check_cpu_mem.vbe $ARG1$ $ARG2$ $ARG3$ $ARG4$
But when I run the check command manually, everything looks fine.
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:2
|'\Processor(_Total)\% Processor Time'=2;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
[gsspmuth@SACLX127 etc]$ /usr/local/nagios/libexec/check_nrpe -H 10.209.41.158 -p 56660 -t 30 -c check_cpu_mem -a CPU 90 95 5
OK:Processor(_Total)%Processor Time:0
|'\Processor(_Total)\% Processor Time'=0;90;95
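A handful of manual runs may simply miss an intermittent failure. As a rough sketch, the same command could be run repeatedly over a longer window, recording the state and how long each call took. The helper names run_check and sample are hypothetical, and the exit-code mapping assumes the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import subprocess
import time

# Standard Nagios plugin exit codes.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(cmd, timeout=35):
    """Run one plugin command; return (state, seconds elapsed, first output)."""
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        code, output = proc.returncode, proc.stdout.strip()
    except subprocess.TimeoutExpired:
        # Treat a local timeout like an UNKNOWN result.
        code, output = 3, "local timeout waiting for plugin"
    elapsed = time.monotonic() - start
    return NAGIOS_STATES.get(code, "UNKNOWN"), elapsed, output


def sample(cmd, count, interval):
    """Run the command `count` times, sleeping `interval` seconds in between."""
    results = []
    for i in range(count):
        results.append(run_check(cmd))
        if i < count - 1:
            time.sleep(interval)
    return results


# Example call using the command line from the post (not executed here):
# cmd = ["/usr/local/nagios/libexec/check_nrpe", "-H", "10.209.41.158",
#        "-p", "56660", "-t", "30", "-c", "check_cpu_mem",
#        "-a", "CPU", "90", "95", "5"]
# for state, elapsed, output in sample(cmd, count=120, interval=30):
#     print(f"{state} {elapsed:.1f}s {output}")
```

Calls that come back UNKNOWN, or that take close to the 30-second timeout, would be the ones to compare against nsclient.log on the Windows side.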
service frequently goes to unknown state
Re: service frequently goes to unknown state
Error from the nsclient.log file on one of the Windows servers:
2018-03-18 09:21:05: e:..\..\..\..\nscp\modules\CheckEventLog\eventlog_wrapper.cpp:31: Failed to close eventlog: 1717: The interface is unknown.
2018-03-18 09:21:05: e:..\..\..\..\nscp\modules\CheckEventLog\eventlog_wrapper.cpp:173: Failed to read eventlog record(0): 6: The handle is invalid.
2018-03-18 09:21:12: e:..\..\..\..\nscp\modules\CheckSystem\PDHCollector.cpp:141: Failed to query performance counters: PdhCollectQueryData failed: : -2147481643: No data to return.
2018-03-18 09:21:12: e:..\..\..\..\nscp\modules\CheckSystem\PDHCollector.cpp:141: Failed to query performance counters: PdhCollectQueryData failed: : -2147481643: No data to return.
Thx.
Re: service frequently goes to unknown state
I wonder if you have an issue with a specific version of the NSClient++ agent. What is the NSClient++ version that you are currently using?
I see the following error in the log:
2018-03-18 09:21:12: e:..\..\..\..\nscp\modules\CheckSystem\PDHCollector.cpp:141: Failed to query performance counters: PdhCollectQueryData failed: : -2147481643: No data to return.
https://forums.nsclient.org/t/performan ... oblem/3522
Can you provide us with a download link to the "check_cpu_mem.vbe" script? You can also rename it with the *.txt extension, and post it on the forum. We will try to test it in-house.
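As a side note on the PdhCollectQueryData error quoted above: the log prints the status as a signed 32-bit integer, while Windows headers list PDH status codes in hexadecimal. A small sketch of the conversion (the function name is just for illustration):

```python
def to_unsigned_hex(signed_code):
    """Reinterpret a signed 32-bit status code as unsigned and format as hex."""
    return f"0x{signed_code & 0xFFFFFFFF:08X}"


# Value taken from the nsclient.log line in this thread.
print(to_unsigned_hex(-2147481643))  # prints 0x800007D5
```

As far as I know, 0x800007D5 is PDH_NO_DATA in pdhmsg.h, which matches the "No data to return" text in the log. That only confirms what the message already says and does not by itself explain why the counter had no data at that moment.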