NCPA Passive windowscounters Failures

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
onegative
Posts: 173
Joined: Tue Feb 17, 2015 12:06 pm

NCPA Passive windowscounters Failures

Post by onegative »

G'day All,

I just spent the last two days going through an exhaustive list of perfmon counters on Windows Server 2012 R2 with MSSQL Server loaded. I am running the latest GA release of the NCPA agent (2.1.6), and out of the 300+ examples, the following work neither from the NCPA Listener UI nor from passive configurations. I would appreciate someone else taking these examples and confirming my suspicion that something is amiss with the parser that translates between the input and the actual perfmon calls.

I have confirmed that all the objects in question are present and can be queried using perfmon.exe.

If anyone could help it would be greatly appreciated...just looking for confirmation before I submit a git bug.

I have attached two files to help replicate and speed up the process. You would just need to create a HostGroup called PerfCounters, import perfImport.txt, add a similar host to the HostGroup, then add perfCounters.cfg to the ncpa.cfg.d directory and restart the ncpa_passive service.
perfImport.txt
perfCounters.cfg
Many thanks,
Danny
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NCPA Passive windowscounters Failures

Post by scottwilkerson »

onegative wrote: I would appreciate someone else taking the following examples and confirming my suspicions that there is something amiss with the parser that translates between the input and the actual perfmon calls.
For clarity, can you specify what is amiss that you are seeing?
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com

Re: NCPA Passive windowscounters Failures

Post by onegative »

When the counter is defined as a check, the query returns "0 c" as the value; otherwise it returns the error message "Error: The specified object was not found on the computer."

But I can use perfmon.exe and clearly see the object and its value...so it is there but for whatever reason it cannot be queried from the GUI or passive agent.

Here is an example counter that fails from the GUI and from the passive configuration.
/LogicalDisk(_Total)/Avg. Disk Bytes/Read

Attached are the results using the GUI.
windowscounterResults.png

Re: NCPA Passive windowscounters Failures

Post by scottwilkerson »

Looking back through the items you have, the (_Total) items are not valid, and I think all are replaced with (*).

e.g.

Code:

/Processor(*)/% User Time
If you hover over the ? next to Windows Counter, you will see that you can get the whole list by running the following from Windows cmd.exe:

Code:

typeperf.exe -q
Or, to put them all in a file:

Code:

typeperf.exe -q > c:\tmp\perf.txt
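
Editor's sketch: the counter paths printed by typeperf.exe -q use backslashes, while the checks in this thread use forward slashes under /windowscounters/. The helper below shows one way to turn a saved typeperf line into that form. The function name `typeperf_to_ncpa` is hypothetical, and the sketch assumes each line has the local form `\Object(instance)\Counter` with no machine-name prefix. It is an illustration, not part of NCPA.

```python
def typeperf_to_ncpa(counter_path):
    """Convert a typeperf-style path into the /windowscounters/... form
    used by the checks in this thread (hypothetical helper)."""
    # Drop surrounding whitespace and the leading backslash, then split
    # on the backslashes that separate object and counter.
    parts = counter_path.strip().lstrip("\\").split("\\")
    return "/windowscounters/" + "/".join(parts)


sample_lines = [
    r"\Processor(*)\% User Time",
    r"\LogicalDisk(_Total)\Avg. Disk Bytes/Read",
]
for line in sample_lines:
    print(typeperf_to_ncpa(line))
```

Note that the second sample keeps the / that is part of the counter name, so the converted path has one more slash-separated segment than the perfmon path has components.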

Re: NCPA Passive windowscounters Failures

Post by onegative »

@scottwilkerson

I assume you never actually tried what you are suggesting. I spent a lot of time testing this and found that some counters require (*) as the instance while others require (_Total).

But per your suggestion /Processor(*)/% User Time the following occurs:

counters.cfg:
%HOSTNAME%|Perfmon Processor Total Percent User Time = /windowscounters/Processor(*)/% User Time?check=true

Code:

2019-03-06 11:25:43,887:INFO:ncpacheck:Running check: /windowscounters/Processor(*)/% User Time?check=true
2019-03-06 11:25:43,980:ERROR:windowscounters:(-1073738822, 'GetFormattedCounterValue', 'The returned data is not valid.')
Traceback (most recent call last):
  File "C:\ncpa\agent\listener\windowscounters.py", line 43, in counter_method
  File "C:\ncpa\agent\listener\windowscounters.py", line 79, in get_counter_val
error: (-1073738822, 'GetFormattedCounterValue', 'The returned data is not valid.')
Please observe that perfmon.exe actually shows the instance as _Total, which works and collects data, as the following screenshot shows.

Thanks for your help though, as I need someone to replicate this so I know I am not doing something stupid. Again, I have 300+ counters working and tested, but these few would not work, and I am not sure why. That is why I am trying to get validation of the issues I am seeing. I know it is mundane, but I believe it is beneficial to the future development of the NCPA agent.

Thanks again Scott,
Danny
% User Time.png
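
Editor's sketch: the number in the traceback above is a signed 32-bit rendering of a Windows PDH status code. Masking it to 32 bits gives the hexadecimal value that can be looked up in the Windows SDK header pdhmsg.h. The two names in the table are my reading of that header and should be checked against it. The helper name `decode_pdh_status` is hypothetical.

```python
# Status names as I understand them from pdhmsg.h (verify against the SDK).
PDH_STATUS_NAMES = {
    0xC0000BB8: "PDH_CSTATUS_NO_OBJECT",     # "The specified object was not found on the computer."
    0xC0000BBA: "PDH_CSTATUS_INVALID_DATA",  # "The returned data is not valid."
}


def decode_pdh_status(signed_code):
    """Render a signed 32-bit status code as unsigned hex plus a name."""
    unsigned = signed_code & 0xFFFFFFFF  # reinterpret as unsigned 32-bit
    name = PDH_STATUS_NAMES.get(unsigned, "unknown")
    return "0x{:08X} ({})".format(unsigned, name)


print(decode_pdh_status(-1073738822))
```

One possible cause of an invalid-data status on a rate counter such as % User Time is formatting the value before a second sample has been collected. That would be consistent with the sleep parameter that resolves this case later in the thread, but the thread does not confirm it.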

Re: NCPA Passive windowscounters Failures

Post by scottwilkerson »

Actually, that one did work, but I had to add a sleep, such as:

Code:

%HOSTNAME%|Perfmon Processor Total Percent User Time = /windowscounters/Processor(*)/% User Time?check=true&sleep=3
However, we looked into this further and found that we could not get some of the items you listed working correctly, specifically items like
/LogicalDisk(_Total)/Avg. Disk Bytes/Read, where the counter name itself contains a /. In perfmon notation the path is LogicalDisk(_Total)\Avg. Disk Bytes/Read.

We haven't yet been able to isolate why this is occurring.
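
Editor's sketch: the snippet below illustrates why a / inside a counter name is ambiguous once the path is written with forward slashes. It is not NCPA's parser, and the thread does not establish that this is the cause. The function `naive_split` is a hypothetical stand-in for any code that splits the request path on every slash.

```python
def naive_split(ncpa_path):
    """Split a slash-separated counter path into its non-empty segments."""
    return [segment for segment in ncpa_path.split("/") if segment]


path = "/LogicalDisk(_Total)/Avg. Disk Bytes/Read"
segments = naive_split(path)
print(segments)  # three segments, although perfmon has only object and counter

# Two different perfmon paths are consistent with those three segments.
candidates = [
    "\\" + "\\".join(segments),                           # every slash becomes a backslash
    "\\" + segments[0] + "\\" + "/".join(segments[1:]),   # slash kept inside the counter name
]
for candidate in candidates:
    print(candidate)
```

Only the second candidate matches the counter that perfmon shows. A parser that always produces the first form would ask Windows for a counter that does not exist.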

Re: NCPA Passive windowscounters Failures

Post by onegative »

@scottwilkerson

Hey, that worked. I missed the sleep parameter on those three.

Code:

2019-03-06 13:05:22,420:INFO:ncpacheck:Running check: /windowscounters/Process(_Total)/Working Set - Private?check=true
2019-03-06 13:05:22,529:INFO:ncpacheck:Running check: /windowscounters/Processor(*)/% Privileged Time?check=true&sleep=3
2019-03-06 13:05:25,622:INFO:ncpacheck:Running check: /windowscounters/Processor(*)/% Processor Time?check=true&sleep=3
2019-03-06 13:05:28,717:INFO:ncpacheck:Running check: /windowscounters/Processor(*)/% User Time?check=true&sleep=3
2019-03-06 13:05:31,809:INFO:ncpacheck:Running check: /windowscounters/Server/Bytes Received/sec?check=true&sleep=3
% Processor Time.png
Thanks for your help on those three...it is really strange about the naming conventions and whether or not sampling is required...
Danny
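
Editor's sketch: on the question of when sampling is required, rate-style counters such as % Processor Time are generally computed from two samples, and the sleep parameter appears to set the wait between them. That reading is an assumption and is not confirmed in this thread. The helper below only builds passive check lines in the format shown in the thread, with or without sleep. The name `passive_check_line` is hypothetical.

```python
from urllib.parse import urlencode


def passive_check_line(service, counter, sleep=None):
    """Build one passive check line in the format used in this thread."""
    params = {"check": "true"}
    if sleep is not None:
        # Seconds passed as the sleep query parameter.
        params["sleep"] = sleep
    # The counter path is left unescaped, matching the lines in the thread.
    return "%HOSTNAME%|{} = /windowscounters/{}?{}".format(
        service, counter, urlencode(params)
    )


print(passive_check_line(
    "Perfmon Processor Total Percent User Time",
    "Processor(*)/% User Time",
    sleep=3,
))
```

Generating the lines from a list of (service, counter, sleep) tuples keeps the sleep setting visible next to each counter that needs it.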

Re: NCPA Passive windowscounters Failures

Post by scottwilkerson »

Just an update: we confirmed a bug and have filed the bug report here:
https://github.com/NagiosEnterprises/ncpa/issues/520

Re: NCPA Passive windowscounters Failures

Post by onegative »

@scottwilkerson

Hey Scott,

Thanks for helping me verify this...it helps having someone else proof the assertions.
Go ahead and lock this Topic...I will track the bug report.

I appreciate your team's help,
Danny

Re: NCPA Passive windowscounters Failures

Post by scottwilkerson »

Locking thread