WMI cpu monitoring having issues

Posted: Wed Jan 29, 2020 5:30 am
by lgaddam
On many of our remote Windows servers, we are receiving UNKNOWN alerts with a wmic error for the CPU and uptime services.
But on the same servers, the Disk and Memory services are working fine.


Below is the error:
---------------------

"UNKNOWN - The WMI query had problems. The error text from wmic is: [wmi/wmic.c:212:main()] ERROR: Retrieve result data.
ERROR: Retrieve result data."


With the help of our Windows admin, we restarted the WMI service on a few of the remote Windows machines, but the issue still persists. Kindly help me if you have any solution.

Re: WMI cpu monitoring having issues

Posted: Wed Jan 29, 2020 1:29 pm
by mbellerue
Can you PM a system profile to me? To get a profile, go to Configure -> Core Config Manager -> System Profile -> Download Profile.

Also if you can give me the name of one of the hosts that is having this issue, that would be great. I will take a look and make sure the configuration is good. Are these new checks, or had they been working previously and suddenly stopped?

Re: WMI cpu monitoring having issues

Posted: Fri Jan 31, 2020 3:21 am
by lgaddam
I am not able to find the System Profile at the path you mentioned:

Configure -> Core Config Manager -> System Profile -> Download Profile

But I was able to download the system profile in our Nagios XI by going this way:
Admin->System config->System Profile->Download

I have attached the zip file here.

Below are the outputs.

[nagios@nagiosp01 ~]$ /usr/local/nagios/libexec/check_wmi_plus.pl -H 192.168.72.242 -u 'xxxxx' -p 'xxxxxx' -m checkcpu -w '90' -c '95' -t 500
UNKNOWN - The WMI query had problems. You might have your username/password wrong or the user's access level is too low. Wmic error text on the next line.
[wmi/wmic.c:196:main()] ERROR: Login to remote object.
NTSTATUS: NT_STATUS_ACCESS_DENIED - Access denied

[root@nagiosp01 ~]# /usr/local/nagios/libexec/check_wmi_plus.pl -H 192.168.51.3 -u 'usr/adm.patrol' -p 'Volandovoy2012' -m checkdrivesize -a 'C': -w '85' -c '95'
OK - C: Total=135.96GB, Used=68.79GB (50.6%), Free=67.17GB (49.4%) |'C: Space'=68.79GB; 'C: Utilisation'=50.6%;85;95;

Re: WMI cpu monitoring having issues

Posted: Fri Jan 31, 2020 4:03 am
by lgaddam
I missed one more error observed in Nagios XI.
A few services are working fine, but the CPU and uptime services are not working on 20+ servers.
An example is attached.

Output is provided below.

[root@nagiosp01 ~]# /usr/local/nagios/libexec/check_wmi_plus.pl -H 172.31.115.157 -u 'xxxxx' -p 'xxxxxx' -m checkcpu -w '90' -c '95' -t 500
UNKNOWN - The WMI query had problems. The error text from wmic is: [wmi/wmic.c:212:main()] ERROR: Retrieve result data.
NTSTATUS: NT code 0x80041017 - NT code 0x80041017
[root@nagiosp01 ~]#

Re: WMI cpu monitoring having issues

Posted: Fri Jan 31, 2020 2:56 pm
by mbellerue
Alright, it looks like these WMI checks are actually fairly CPU intensive. They're also probably taking a while to authenticate with the remote Windows machines. When you downloaded your profile (you're absolutely correct, by the way: I gave you the wrong path to download the profile, my apologies), your system load average was 14+ for the one-, five-, and fifteen-minute averages. MySQL and httpd were the top offenders, and that's pretty common. Just below those, I saw a large block of check_wmi_plus commands. What I'm seeing here says that your machine isn't able to complete these WMI checks before they hit their timeouts.
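
As a rough way to judge whether a load average like that is too high for a given box, it can be compared with the number of CPU cores. The sketch below is only an illustration: `load_per_core` is a hypothetical helper name, and the core count of 8 is a placeholder, since the real core count of this server is not stated in the thread.

```shell
# Hypothetical helper: divide a load average by the number of CPU cores.
# A result well above 1.00 suggests more runnable work than the CPUs can absorb.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Example with placeholder numbers: load 14 on an assumed 8-core machine.
load_per_core 14 8

# On the Nagios server itself, the live values could be read like this:
# load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

With the placeholder values this prints 1.75, meaning roughly 1.75 runnable tasks per core on average.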

You have roughly 6000 of these WMI checks, all set to check every 5 minutes. If you could grow the check interval to 10 minutes, that would likely help. Also, most of the checks are set to retry every 1 minute. If you could bump that out to 2 or 3 minutes, that would definitely help.
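
To put numbers on that, here is the back-of-the-envelope arithmetic for the average check rate. It assumes the checks are spread evenly over the interval and ignores retries; `checks_per_second` is just an illustrative helper name.

```shell
# Illustrative helper: average number of checks started per second
# for a given number of checks and a check interval in minutes.
checks_per_second() {
    echo $(( $1 / ($2 * 60) ))
}

checks_per_second 6000 5    # 5-minute interval  -> 20
checks_per_second 6000 10   # 10-minute interval -> 10
```

So doubling the check interval roughly halves the number of check_wmi_plus processes competing for CPU at any moment.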

Or, if you could off-load the checks to an agent on at least some of the servers, that would also help. Especially if those checks were moved to passive checks rather than active checks.
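
For reference, a passive result reaches Nagios Core as an external command line written to the command file. In practice an agent or a submission service would normally handle this, so the sketch below only formats such a line. The host name, service description, and the command file path in the comment are placeholders and assumptions, so check them against your own setup.

```shell
# Format a PROCESS_SERVICE_CHECK_RESULT external command.
# Arguments: host, service description,
#            return code (0=OK 1=WARNING 2=CRITICAL 3=UNKNOWN), plugin output.
passive_result_line() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}

# Example with placeholder names; this prints the line instead of submitting it.
passive_result_line "winhost01" "CPU Usage" 0 "OK - CPU utilisation 12%"

# To submit it, the line would be appended to the Nagios command file
# (path assumed here; verify it on your install):
# passive_result_line "winhost01" "CPU Usage" 0 "OK - CPU utilisation 12%" >> /usr/local/nagios/var/rw/nagios.cmd
```

The host and service names in the line have to match an existing service definition that accepts passive checks, otherwise Nagios ignores the result.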

Or if you can add more CPUs, or faster CPUs to the machine, that would help. I don't know if this is a physical server or a VM.

Re: WMI cpu monitoring having issues

Posted: Mon Feb 03, 2020 4:55 am
by lgaddam
Hi,

[quote]
You have roughly 6000 of these WMI checks, all set to check every 5 minutes. If you could grow the check interval to 10 minutes, that would likely help. Also, most of the checks are set to retry every 1 minute. If you could bump that out to 2 or 3 minutes, that would definitely help.
[/quote]

Let me try this on 2-3 servers and update you on the status.

[quote]
Or if you can add more CPUs, or faster CPUs to the machine, that would help. I don't know if this is a physical server or a VM.
[/quote]

This is a physical server.

Re: WMI cpu monitoring having issues

Posted: Mon Feb 03, 2020 4:59 am
by lgaddam
It's not only CPU. There are also Windows 2003 servers from which we are not able to get any performance data via WMI.
For example, on the one server attached, no service other than ping is returning monitoring data.

On the remote machine, we have checked:
1. WMI admin account - available and working
2. WMI is running - restarted
3. No firewall
4. Port 135 is open from the Nagios server to the remote Windows machine

Even after all these checks, the issue still persists. Kindly help us get it fixed.

Re: WMI cpu monitoring having issues

Posted: Mon Feb 03, 2020 2:38 pm
by mbellerue
Okay, it looks like we may be dealing with two different error messages right now. Let's start with SRVLICENCON. First, let's run this check manually, since the error message indicates that there may be more detail than what is shown. SSH into your Nagios server and run this command:

/usr/local/nagios/libexec/check_wmi_plus.pl -H 192.168.72.242 -u domain/username -p password -m checkservice -a 'msc' -c 0

Give that a try and let me know how long it takes to run, and what the result is.
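
If it helps, a small wrapper like the one below reports both the elapsed time and the plugin's exit code in one go. `timed_run` is a made-up name for this sketch; simply putting `time` in front of the command works too.

```shell
# Run a command, then print its exit code and wall-clock duration in whole seconds.
timed_run() {
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    echo "exit=$rc elapsed=$(( end - start ))s"
    return $rc
}

# Example usage with the check above (credentials are placeholders):
# timed_run /usr/local/nagios/libexec/check_wmi_plus.pl -H 192.168.72.242 -u domain/username -p password -m checkservice -a 'msc' -c 0
```

Nagios plugins exit with 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN, so the `exit=` value shows the state the check would have reported.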

Re: WMI cpu monitoring having issues

Posted: Tue Feb 04, 2020 2:51 am
by lgaddam
Hi,

Below is the output:

[root@nagiosp01 ~]# /usr/local/nagios/libexec/check_wmi_plus.pl -H 192.168.72.242 -u 'xxxxl' -p 'xxxx2' -m checkservice -a 'msc' -c 0
UNKNOWN - The WMI query had problems. You might have your username/password wrong or the user's access level is too low. Wmic error text on the next line.
[wmi/wmic.c:196:main()] ERROR: Login to remote object.
NTSTATUS: NT_STATUS_ACCESS_DENIED - Access denied
[root@nagiosp01 ~]#


I have also run the command below and attached its output:

/usr/local/nagios/libexec/check_wmi_plus.pl -H 192.168.72.242 -u 'xxxxx' -p 'xxxx' -m checkcpu -d

Re: WMI cpu monitoring having issues

Posted: Tue Feb 04, 2020 4:23 pm
by mbellerue
So that is just straight up saying access denied. Generally, whenever I see that, the login in question isn't allowed to access the machine remotely. Can you log in to the Windows machine as that user, or become that user on the Windows machine, and try to execute the WMI call? If it fails when run locally, then there is a permissions issue between the login and WMI. If it succeeds, then the issue is with remote access.