
UNKNOWN:Guest is powered OFF or is not accesible

Posted: Sun Oct 16, 2016 9:52 pm
by kwhogster
This is the error: UNKNOWN:Guest is powered OFF or is not accesible, cannot collect data! [Guest_CPU_Usage]

Background

I just built a vCenter 6.0 server and a new ESXi 6.0 host.

I already had my original ESXi 6.0 host, and now I have a datacenter set up with both ESXi hosts.

I moved some of the VMs to the new host.

I have the Guest CPU Usage service defined for all my VMs.

The strange thing is that all the VMs failed at first. On the VMs that I moved, I had to change the host IP address in the service, and then they started working.

The VMs on the original host give the error above.


All the VMs with this error have many other services that are working fine.

Code:

define service {
        host_name TGCS003
        service_description Guest CPU Usage
        check_command box293_check_vmware_test!10.2.8.10!Guest_CPU_Usage!--guest!TGCS003!!!!
        initial_state u
        max_check_attempts 3
        check_interval 5
        retry_interval 3
        active_checks_enabled 1
        check_period 24x7
        servicegroups   VMCPULoad
        register 1
}
define service {
        host_name TGKW005
        service_description Guest CPU Usage
        check_command box293_check_vmware_test!10.2.8.8!Guest_CPU_Usage!--guest!TGKW005!!!!
        initial_state u
        max_check_attempts 3
        check_interval 5
        retry_interval 3
        active_checks_enabled 1
        check_period 24x7
        servicegroups   VMCPULoad
        register 1
}
10.2.8.10 is my original host

10.2.8.8 is the new host

Any ideas?

Thank you

Tom

Re: UNKNOWN:Guest is powered OFF or is not accesible

Posted: Sun Oct 16, 2016 10:55 pm
by Box293
I've seen this issue before, where the management agent on the ESXi host needs restarting because, for some reason, it's not communicating with vCenter properly.

Can you please post the command definition for box293_check_vmware_test?

Re: UNKNOWN:Guest is powered OFF or is not accesible

Posted: Mon Oct 17, 2016 7:03 am
by kwhogster
Hello Troy

Code:

define command{
   command_name box293_check_vmware_test
   command_line $USER1$/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.2.8.7 -C "nice -n19 ~/box293_check_vmware.pl --server $ARG1$ --check $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$ $ARG7$ $ARG8$"
}
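
For readers following along, here is a rough sketch of how the $ARGn$ macros in a command definition like this get filled in from a service's check_command: the string is split on "!", the first piece is the command name, and the rest become $ARG1$, $ARG2$, and so on. This is plain Python for illustration, not Nagios code. The function name expand_args is made up, escaped "!" characters are not handled, and $USER1$ is deliberately left untouched.

```python
# Illustrative sketch only: mimics how $ARGn$ macros are substituted.
# expand_args is a hypothetical helper, not part of Nagios.

COMMAND_LINE = (
    '$USER1$/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.2.8.7 -C '
    '"nice -n19 ~/box293_check_vmware.pl --server $ARG1$ --check $ARG2$ '
    '$ARG3$ $ARG4$ $ARG5$ $ARG6$ $ARG7$ $ARG8$"'
)

CHECK_COMMAND = (
    "box293_check_vmware_test!10.2.8.10!Guest_CPU_Usage!--guest!TGCS003!!!!"
)


def expand_args(command_line: str, check_command: str, max_args: int = 32) -> str:
    """Replace $ARG1$..$ARGn$ with the '!'-separated values of check_command."""
    _command_name, *args = check_command.split("!")
    for n in range(1, max_args + 1):
        # Arguments that were not supplied expand to an empty string.
        value = args[n - 1] if n <= len(args) else ""
        command_line = command_line.replace(f"$ARG{n}$", value)
    return command_line


if __name__ == "__main__":
    print(expand_args(COMMAND_LINE, CHECK_COMMAND))
```

The point to notice is that $ARG1$ (the --server value) comes straight from the service definition, so each service carries its own hard-coded server address.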

Re: UNKNOWN:Guest is powered OFF or is not accesible

Posted: Mon Oct 17, 2016 3:12 pm
by dwhitfield
Obviously @Box293 is the best person for this job, but I wanted to point out the VMware KB on the topic: https://kb.vmware.com/selfservice/micro ... Id=1018834

Am I right in understanding that aside from the error, everything is working properly?

I also have a few questions that will help us troubleshoot this issue. What version of Core are you using? Did you install from source or from distribution repos? On what OS is Nagios running?

Thanks!

Re: UNKNOWN:Guest is powered OFF or is not accesible

Posted: Mon Oct 17, 2016 4:31 pm
by Box293
What is the output from executing these commands on the vMA?

Code:

~/box293_check_vmware.pl --server 10.2.8.8 --check Guest_CPU_Usage --guest TGCS003

~/box293_check_vmware.pl --server 10.2.8.8 --check Guest_CPU_Usage --guest TGKW005

Re: UNKNOWN:Guest is powered OFF or is not accesible

Posted: Mon Oct 17, 2016 6:53 pm
by kwhogster
Troy,

Welcome to vMA
vi-admin@tgkw002:~> ~/box293_check_vmware.pl --server 10.2.8.8 --check Guest_CPU_Usage --guest TGCS003
CRITICAL: communication with the VMware API failed after 2 retries.

vi-admin@tgkw002:~> ~/box293_check_vmware.pl --server 10.2.8.8 --check Guest_CPU_Usage --guest TGKW005
OK: {Free: 10,030 MHz} {Usage: (Total: 74 MHz) (CPU 0: 74 MHz) (CPU 1: 74 MHz) (CPU 2: 74 MHz) (CPU 3: 74 MHz)} {Total Available: 10,104 MHz} {Ready Time: (Total: 83 ms) (CPU 0: 22 ms) (CPU 1: 18 ms) (CPU 2: 21 ms) (CPU 3: 24 ms)}|'Total CPU Free'=10030MHz 'Total CPU Usage'=74MHz 'CPU 0: Usage'=22MHz 'CPU 1: Usage'=16MHz 'CPU 2: Usage'=12MHz 'CPU 3: Usage'=19MHz 'Total Available'=10104MHz 'Total Ready Time'=83ms 'CPU 0 Ready Time'=22ms 'CPU 1 Ready Time'=18ms 'CPU 2 Ready Time'=21ms 'CPU 3 Ready Time'=24ms [Guest_CPU_Usage]
vi-admin@tgkw002:~>
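
As a reading aid for the two results above, here is a small sketch that splits a plugin output line into its state prefix and its performance data (the part after the "|"). It assumes the usual plugin conventions: the state-to-exit-code mapping OK=0, WARNING=1, CRITICAL=2, UNKNOWN=3, and perfdata items of the form 'label'=valueUNIT. The function name parse_plugin_output is made up for illustration. Nagios itself takes the state from the plugin's exit code, not from the text, so reading it from the prefix is only an approximation.

```python
import re

# Standard Nagios plugin states and their exit codes.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}

# Matches quoted-label perfdata items such as 'Total CPU Usage'=74MHz.
PERF_ITEM = re.compile(r"'([^']+)'=(-?\d+(?:\.\d+)?)([A-Za-z%]*)")


def parse_plugin_output(line: str):
    """Return (state, exit_code, perfdata) for one line of plugin output.

    The state is guessed from the text before the first ':'; exit_code is
    None if that prefix is not a known state. perfdata maps each label to
    a (value, unit) tuple.
    """
    text, _sep, perf = line.partition("|")
    state = text.split(":", 1)[0].strip()
    perfdata = {
        label: (float(value), unit)
        for label, value, unit in PERF_ITEM.findall(perf)
    }
    return state, STATES.get(state), perfdata


if __name__ == "__main__":
    sample = (
        "OK: {Free: 10,030 MHz}|'Total CPU Free'=10030MHz "
        "'Total CPU Usage'=74MHz [Guest_CPU_Usage]"
    )
    print(parse_plugin_output(sample))
```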


Also I tried this

Restart management agents in ESXi using ESXi Shell or Secure Shell (SSH):

1. Log in to ESXi Shell or SSH as root.

For enabling ESXi Shell or SSH, see Using ESXi Shell in ESXi 5.x and 6.x (2004746).

2. Restart the ESXi host daemon and vCenter Agent services using these commands:

/etc/init.d/hostd restart

/etc/init.d/vpxa restart

No change


Update

I also ran this on the ESXi host:

services.sh restart

This restarts all management services.

Re: UNKNOWN:Guest is powered OFF or is not accesible

Posted: Mon Oct 17, 2016 6:58 pm
by kwhogster
To dwhitfield:

I am running Nagios Core 4.1.

This was all working fine until I added vCenter and a second ESXi host.

Again, the VMs on the new ESXi host are working.

The VMs on the original ESXi host are not.

Also, remember that the VMs on the original host are working except for this one service.

They are all online and running; ping and nslookup both work.

I think something on the original ESXi host is the problem.

Thanks

Re: UNKNOWN:Guest is powered OFF or is not accesible

Posted: Tue Oct 18, 2016 2:34 pm
by rkennedy
Just to confirm, is TGCS003 currently on the .10 host, or the .8 host?
kwhogster wrote: The strange thing is all the VMs failed at first then on the VM's that I moved I had to change the host ip address in the service and then they started working.
This is expected, because the VMware check only looks for the guest VM on the specific host you specify. Nagios does not know you migrated it to the other host, which would explain why it's not working.

I believe the numbers are just mixed up here. Could you post two things to help us compare?
1. A copy of your /usr/local/nagios/var/objects.cache (path may vary depending on how you installed) - Feel free to PM this over if it contains sensitive data.
2. A screenshot showing us the working checks, and the non-working checks.
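
One way to do that comparison by hand is sketched below: pull the --server address and --guest name out of each service definition, then compare against where each VM actually lives. This is a minimal sketch, assuming the definitions look like the ones posted earlier in this thread (server address as the first argument, guest name right after --guest). The function names guest_to_server and find_mismatches are made up for illustration, and the real placement has to come from vCenter; nothing here queries it.

```python
import re

# Captures the body of each "define service { ... }" block.
SERVICE_BLOCK = re.compile(r"define service\s*\{(.*?)\}", re.DOTALL)


def guest_to_server(config_text: str) -> dict:
    """Return {guest_name: server_ip} for service checks that pass --guest."""
    mapping = {}
    for body in SERVICE_BLOCK.findall(config_text):
        directives = {}
        for line in body.strip().splitlines():
            parts = line.split(None, 1)  # directive name, then its value
            if len(parts) == 2:
                directives[parts[0]] = parts[1].strip()
        # args[0] is the command name; args[1] is $ARG1$ (the --server value).
        args = directives.get("check_command", "").split("!")
        if "--guest" in args:
            guest_index = args.index("--guest") + 1
            if guest_index < len(args) and len(args) > 1:
                mapping[args[guest_index]] = args[1]
    return mapping


def find_mismatches(configured: dict, actual: dict) -> dict:
    """Return {guest: (configured_ip, actual_ip)} where the two differ."""
    return {
        guest: (configured[guest], actual[guest])
        for guest in configured
        if guest in actual and configured[guest] != actual[guest]
    }
```

To use it, read the Nagios service config into a string, call guest_to_server on it, and pass the result to find_mismatches together with a dictionary of guest name to current host address that you fill in yourself from vCenter.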

Re: UNKNOWN:Guest is powered OFF or is not accesible

Posted: Tue Oct 18, 2016 7:32 pm
by kwhogster
rkennedy,

The screenshot and file are attached.

I renamed objects.cache to objects.cache.txt so I could upload it.


Hope this helps you guys figure this out.

Thank you

Re: UNKNOWN:Guest is powered OFF or is not accesible

Posted: Tue Oct 18, 2016 10:07 pm
by Box293
Can you answer this please:
rkennedy wrote: Just to confirm, is TGCS003 currently on the .10 host, or the .8 host?