VMWARE wizard; checks time out as services, work at cmd line

WillH
Posts: 54
Joined: Mon Aug 03, 2020 10:37 am

Post by WillH »

When running service checks on CPU, I/O, memory and networking, we are getting a timeout error.
When running the command manually from the XI server's command line, we get a valid return.
Example:
/usr/local/nagios/libexec/check_vmware_api.pl -H "IPADDRESS" -f "/usr/local/nagiosxi/etc/components/vmware/<AUTHFILENAME>_auth.txt" -l 'CPU'
Gives us a valid return of
CHECK_VMWARE_API.PL OK - cpu usage=2152.00 MHz (4.90%) | cpu_usagemhz=2152.00;; cpu_usage=4.90%;;

In the XI interface, the service checks return "(Service check timed out after 60.01 seconds)".
Other checks, like Services, datastore and VM Host status, return fine.

Running
XI 5.7.2
perl 5, version 32, subversion 0 (v5.32.0) built for x86_64-linux
We also see the issue with
perl 5, version 16, subversion 3 (v5.16.3)
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: VMWARE wizard; checks time out as services, work at cmd

Post by vtrac »

Hi,
Hope you are having a great Tuesday!! ... :-)

Please try editing the "/usr/local/nagios/etc/nagios.cfg" file and change
From:

service_check_timeout=60
To:

service_check_timeout=120
Restart services:

systemctl restart nagios
systemctl restart httpd

Also, please try adding the "-t 120" option to the "check_vmware_api.pl" calls.
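
Before and after the change, it can help to confirm which value is actually in the file. This is only a minimal sketch: get_directive is a made-up helper name, not part of Nagios, and it just prints the last uncommented key=value line for the given key.

```shell
# get_directive is a hypothetical helper (not shipped with Nagios).
# It prints the value of the last uncommented "key=value" line in a file.
get_directive() {
    # $1 = config file, $2 = directive name
    awk -F= -v key="$2" '$1 == key { val = $2 } END { if (val != "") print val }' "$1"
}

# Usage with the path from this thread:
# get_directive /usr/local/nagios/etc/nagios.cfg service_check_timeout
```

Lines that start with "#" are skipped because the key no longer matches exactly.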


Best Regards,
Vinh
WillH
Posts: 54
Joined: Mon Aug 03, 2020 10:37 am

Re: VMWARE wizard; checks time out as services, work at cmd

Post by WillH »

New result
(Service check timed out after 120.01 seconds)

The command line check still gives an accurate result. :(
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: VMWARE wizard; checks time out as services, work at cmd

Post by vtrac »

Hi,
If you add "time" at the beginning of the command (below):

time /usr/local/nagios/libexec/check_vmware_api.pl -H "IPADDRESS" -f "/usr/local/nagiosxi/etc/components/vmware/<AUTHFILENAME>_auth.txt" -l 'CPU'
What output do you get?

Does it take longer than 120 seconds to run in "manual" mode?

What we want to know is how long it takes to run this command manually, so we can increase the timeout accordingly.


Regards,
Vinh
WillH
Posts: 54
Joined: Mon Aug 03, 2020 10:37 am

Re: VMWARE wizard; checks time out as services, work at cmd

Post by WillH »

CHECK_VMWARE_API.PL OK - cpu usage=4837.00 MHz (11.01%) | cpu_usagemhz=4837.00;; cpu_usage=11.01%;;
It takes between 2 and 3 seconds to run from the command line in manual mode, regardless of the timeout specified.
It always times out when run as a service check in the UI, regardless of the global or -t timeout specified.
I'm running the command line copied from the "Run Check Command" option in the UI.
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: VMWARE wizard; checks time out as services, work at cmd

Post by vtrac »

Hi,
Hope you are having a great Wednesday!! ... :-)

The vSphere SDKs are pretty heavy in terms of the resources required to leverage them.

Please edit your /etc/php.ini and change these:

max_execution_time = 60
max_input_time = 120
max_input_vars = 5000
memory_limit = 256M
To these:

max_execution_time = 600
max_input_time = 600
max_input_vars = 25000
memory_limit = 1280M
Then restart apache:

systemctl restart httpd

I also checked with my team, and they suggested that you try running "check_vmware_api.pl" as the "nagios" user on the command line:

su - nagios

/usr/local/nagios/libexec/check_vmware_api.pl -H "IPADDRESS" -f "/usr/local/nagiosxi/etc/components/vmware/<AUTHFILENAME>_auth.txt" -l 'CPU'
Also, please make sure the auth file has the correct permissions:

chmod 666 /usr/local/nagiosxi/etc/components/vmware/<AUTHFILENAME>_auth.txt
One last thing, please make sure the command defined (used) in the GUI is correct.
Please open Nagios XI GUI > Configure > Core Config Manager > Services.
Open the service that has "check_vmware_api.pl" defined and click the "Run check command" button.

Please post a screenshot of the "Run check command" output.

I also looked at a similar forum thread and noticed that they used "-t 300" in their command. You might want to try that:
https://support.nagios.com/forum/viewto ... =7&t=41706


Best Regards,
Vinh
WillH
Posts: 54
Joined: Mon Aug 03, 2020 10:37 am

Re: VMWARE wizard; checks time out as services, work at cmd

Post by WillH »

No change after making the PHP changes, BUT, BUT
running the commands as the nagios user via sudo (and kicking myself for not doing this yesterday) just hangs there.

So where we stand now: running the commands as root works, running them as nagios does not.

So more data, narrowing in on the problem.
***************************

The credentials file is -rw-rw-rw- 1 apache nagios
For the folder it's in drwxrwsrwx 2 apache nagios

Bear in mind that other calls in the UI are working

Example, check_vmware_api.pl -l "SERVICE" works in the UI and when run from the command line as either root or nagios

I can substitute the auth file for the username pw combo using -p -u, and the results are the same

root can run all commands from the cmd line using either -f or -u/-p

nagios can run the following using either -f or -u/-p
vmfs
runtime
cluster

nagios cannot run using either -f or -u/-p
mem
cpu
io
net
(again, to be clear, all run as root)
:?

giving the check 5 minutes to time out does not resolve the issue.
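
For anyone who wants to reproduce the matrix above without babysitting hung checks, here is a rough sketch. probe is a made-up helper name, it relies on the coreutils timeout command (which exits with 124 when the limit is hit), and the sudo -u nagios call in the usage comment assumes root can sudo to that account.

```shell
# probe is a hypothetical helper: run a command with a hard time limit
# and print one line saying whether it finished or was killed.
probe() {
    # $1 = limit in seconds, $2 = label, rest = command to run
    limit="$1"; label="$2"; shift 2
    rc=0
    timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
    case "$rc" in
        124) echo "$label: TIMED OUT after ${limit}s" ;;  # timeout's exit code
        *)   echo "$label: exit $rc" ;;                   # the plugin's own exit code
    esac
}

# Usage sketch, run as root (placeholders exactly as in the posts above):
# for sub in cpu mem io net vmfs runtime; do
#     probe 30 "nagios/$sub" sudo -u nagios /usr/local/nagios/libexec/check_vmware_api.pl \
#         -H "IPADDRESS" -f "/usr/local/nagiosxi/etc/components/vmware/<AUTHFILENAME>_auth.txt" -l "$sub"
# done
```

Dropping the "sudo -u nagios" part gives the root column of the same table.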
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: VMWARE wizard; checks time out as services, work at cmd

Post by vtrac »

Hi,
Please check with the people on your end and see if they can give the "nagios" user permission to connect to your VMware system, as this has come down to a permission-related issue.


Best Regards,
Vinh
WillH
Posts: 54
Joined: Mon Aug 03, 2020 10:37 am

Re: VMWARE wizard; checks time out as services, work at cmd

Post by WillH »

It's an issue on the XI, not vcenter.

If I add nagios to the root group, things work.

So I'll need to know what commands or files check_vmware_api.pl is trying to invoke when doing metric-type calls like CPU and mem, so I can get with our engineering team to update our sudo rights.
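
One thing that might narrow it down, purely as a guess: since root-group membership makes the checks work, some file or directory the plugin needs may be readable by root's group but not by everyone else. The sketch below lists anything under the given paths that lacks the "other" read bit. not_world_readable is a made-up helper name, and pointing it at the Perl @INC directories is an assumption, not something confirmed in this thread.

```shell
# not_world_readable is a hypothetical helper: print every file or directory
# under the given paths that is missing the "other" read permission bit.
not_world_readable() {
    for dir in "$@"; do
        [ -e "$dir" ] || continue                      # skip paths that do not exist
        find "$dir" ! -perm -o=r -print 2>/dev/null || true
    done
}

# Usage sketch (assumes Perl module directories are the place to look):
# not_world_readable $(perl -e 'print join(" ", @INC)')
#
# If strace is installed, tracing file access as the nagios user is another option:
# strace -f -e trace=file -o /tmp/check_vmware.trace <the failing command>
```

Anything this prints is only a candidate; it would still need to be matched against what the plugin actually opens.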
WillH
Posts: 54
Joined: Mon Aug 03, 2020 10:37 am

Re: VMWARE wizard; checks time out as services, work at cmd

Post by WillH »

Vinh,
I am not trying to be rude, but I am not sure if you're reciprocating. I appreciate what help you've been able to give.

I will reach out to the author of the pl to see if he has any suggestions.