This support forum board is for support questions relating to
Nagios XI, our flagship commercial network monitoring solution.
Sargento
Posts: 42 Joined: Tue Feb 23, 2021 6:32 am
Post
by Sargento » Tue Sep 07, 2021 9:13 am
Hello,
We are seeing intermittent segmentation faults when running check_vmware_api. Here is an example:
Code: Select all
/usr/local/nagios/libexec/check_vmware_api.pl -f creds.file -l cpu -w 80 -c 90 -H host -v
CHECK_VMWARE_API.PL OK - cpu usage=1430.00 MHz (3.89%) | cpu_usagemhz=1430.00;80;90 cpu_usage=3.89%;80;90
Great, ready to send it into production. Then I run one more test with exactly the same command:
Code: Select all
/usr/local/nagios/libexec/check_vmware_api.pl -f creds.file -l cpu -w 80 -c 90 -H host -v
Segmentation fault
Why could this be? This also occurs while trying to create a session file for this host.
Out of the 3 servers I have tested, this is the only one with the issue.
Thank you,
Rainger
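Since the fault is intermittent, it may help to measure how often it happens. The following is a minimal sketch, not part of the plugin: `count_segfaults` is a hypothetical helper name, and it assumes a Linux shell where a process killed by SIGSEGV is reported with exit status 139 (128 + signal 11).

```shell
# Hypothetical helper: run a command N times and count the runs that end
# with exit status 139 (assumed here to mean "killed by SIGSEGV" on Linux).
count_segfaults() {
    runs=$1
    shift
    segv=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        status=0
        # Discard the plugin output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || status=$?
        if [ "$status" -eq 139 ]; then
            segv=$((segv + 1))
        fi
        i=$((i + 1))
    done
    echo "$segv/$runs runs segfaulted"
}
```

It could be called with the command from the post, for example: count_segfaults 50 /usr/local/nagios/libexec/check_vmware_api.pl -f creds.file -l cpu -w 80 -c 90 -H host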
ssax
Dreams In Code
Posts: 7682 Joined: Wed Feb 11, 2015 12:54 pm
Post
by ssax » Tue Sep 07, 2021 4:20 pm
What version of the plugin are you running?
Code: Select all
/usr/local/nagios/libexec/check_vmware_api.pl -V
What version of vmware esxi is it running?
Which version of the vmware perl SDK did you install?
What does it show when it segfaults with -vvv --trace=4 on the command?
Code: Select all
/usr/local/nagios/libexec/check_vmware_api.pl -f creds.file -l cpu -w 80 -c 90 -H host -vvv --trace=4
Sargento
Posts: 42 Joined: Tue Feb 23, 2021 6:32 am
Post
by Sargento » Tue Sep 07, 2021 4:34 pm
API = check_vmware_api.pl 0.7.1
VMWARE ESXI = 6.7
Perl SDK Version = VMware-vSphere-Perl-SDK-7.0.0-17698549.x86_64
Extra module needed after previous issue where no authentication worked = libwww-perl-5.837
When I ran it with -vvv --trace=4, the only output was "Segmentation fault". Nothing else was printed to the console. I verified this on my end a few times.
ssax
Dreams In Code
Posts: 7682 Joined: Wed Feb 11, 2015 12:54 pm
Post
by ssax » Tue Sep 07, 2021 5:08 pm
Is this the plugin that has been modified to use the session file?
The only other methods to see where it's failing would be:
Code: Select all
cpan -i Devel::Trace
perl -d:Trace /usr/local/nagios/libexec/check_vmware_api.pl -f creds.file -l cpu -w 80 -c 90 -H host
Or:
Code: Select all
yum install strace
strace -f /usr/local/nagios/libexec/check_vmware_api.pl -f creds.file -l cpu -w 80 -c 90 -H host
Sargento
Posts: 42 Joined: Tue Feb 23, 2021 6:32 am
Post
by Sargento » Thu Sep 09, 2021 7:20 am
This happens with every authentication method: it segfaults when I use a session file, when I use user and pass, and when I use a user/pass file. I've attached the tracefile. What should I be looking for in this file? Again, this appears to be the only server where the issue arises; we can run the check against other VMware hosts (probably on different release versions) without this issue.
I've attached both a failing and a working tracefile. The working one is from a run that completed without fault, and the failing one is from a run that segfaulted. Both are traces of the same command on the same exact server; whether it segfaults is random.
ssax
Dreams In Code
Posts: 7682 Joined: Wed Feb 11, 2015 12:54 pm
Post
by ssax » Fri Sep 10, 2021 4:09 pm
Based on this in the failing one:
Code: Select all
read(5, 0x1ff9453, 5) = -1 ECONNRESET (Connection reset by peer)
Do you have any security software on the system OR in the network path that could be impacting it and sometimes flagging it as a threat?
That would make a lot of sense and is usually the first thing I suspect when I see connection resets outside of network issues.
Are you seeing that on all VMWare checks or just some? All VMWare hosts or just a specific subset?
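For anyone wondering what to look for in an strace file like the ones attached above, here is a rough sketch of a filter. `scan_trace` is a hypothetical helper name. It assumes the usual strace text format, where a failed call ends in `= -1 ERRNO (description)`, a delivered signal is printed as `--- SIGSEGV {...} ---`, and a killed process as `+++ killed by SIGSEGV +++`.

```shell
# Hypothetical helper: print, with line numbers, the lines of an strace log
# that show a failed syscall, a delivered signal, or a process killed by a signal.
scan_trace() {
    grep -nE '= -1 E[A-Z]+ |--- SIG[A-Z]+ |\+\+\+ killed by ' "$1"
}
```

The log can be written to a file with strace's -o option and then passed to scan_trace. Expect some noise: Perl probing its module paths normally produces many harmless ENOENT lines, so the entries closest to the end of the failing trace are the interesting ones.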
Sargento
Posts: 42 Joined: Tue Feb 23, 2021 6:32 am
Post
by Sargento » Tue Sep 14, 2021 9:24 am
No security software is blocking it; I just verified. It only seems to affect this one specific ESXi host. I haven't had a single segfault on any other ESXi host yet, and believe me, I tried.
ssax
Dreams In Code
Posts: 7682 Joined: Wed Feb 11, 2015 12:54 pm
Post
by ssax » Tue Sep 14, 2021 5:32 pm
Try doing this method:
Code: Select all
cpan -i Devel::Trace
perl -d:Trace /usr/local/nagios/libexec/check_vmware_api.pl -f creds.file -l cpu -w 80 -c 90 -H host
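The Devel::Trace output can be long, so a small filter for the last statements executed before the crash may help. This is only a sketch: `last_trace_lines` is a hypothetical helper name. It assumes the trace was redirected to a file (Devel::Trace writes to standard error) and that each traced statement starts with `>> `, as in the module's documentation.

```shell
# Hypothetical helper: show the last N traced Perl statements in a
# Devel::Trace log (default 20), skipping any other output mixed in.
last_trace_lines() {
    grep '^>> ' "$1" | tail -n "${2:-20}"
}
```

For example, append 2> trace.log to the perl -d:Trace command above, then run last_trace_lines trace.log 30 to see the statements leading up to the segfault.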
Sargento
Posts: 42 Joined: Tue Feb 23, 2021 6:32 am
Post
by Sargento » Wed Sep 15, 2021 8:33 am
Hello,
I've done the perl trace and am reading through the output. I've attached the file below for you to help me through this.
I do really appreciate your help so far.
ssax
Dreams In Code
Posts: 7682 Joined: Wed Feb 11, 2015 12:54 pm
Post
by ssax » Wed Sep 15, 2021 5:49 pm
You received a segfault on that output, right?
What is the output of this command?