VMWare host hardware Service monitoring question
Posted: Mon May 07, 2018 11:56 am
by tmattingly
I have a c7000 enclosure with multiple blades, but one blade specifically, a ProLiant BL460c Gen8 (pn: 666159-B21), started failing the Service-VMWare-hardware service check this weekend. When I look in the OA [Onboard Administrator] (onboard blade status), it shows all green with no errors. Normally, 99% of the time this is caused by a failed hard drive, but not this time: the drives are green. None of the OA logs indicate an error except for an ILO time, which was reset.
Can you give me any tips or suggestions on how to troubleshoot why Nagios thinks there is an error? (Screenshot of the error attached.)
Re: VMWare host hardware Service monitoring question
Posted: Mon May 07, 2018 4:46 pm
by tgriep
Can you post the full command you are using for that service check here so we can check its settings?
To do that, go to the Core Config Manager > Services menu and search for that service.
Then click the floppy icon on the right-hand side of the window for that service, which will bring up the configuration settings.
Copy it and paste it into a post so we can see it.
Thanks
Re: VMWare host hardware Service monitoring question
Posted: Mon May 07, 2018 4:51 pm
by tmattingly
###############################################################################
#
# Service configuration file
#
# Created by: Nagios Core Config Manager 2.6.11
# Date: 2018-05-07 13:50:32
# Version: Nagios 3.x config file
#
# --- DO NOT EDIT THIS FILE BY HAND ---
# Nagios CCM will overwrite all manual settings during the next update if you
# would like to edit files manually, place them in the 'static' directory or
# import your configs into the CCM by placing them in the 'import' directory.
#
###############################################################################
define service {
    service_description    Service-VMWare-hardware
    use                    generic-service
    hostgroup_name         VMWare DMZ,VMWare Vocera,VMWare-Production
    display_name           Service-VMWare-hardware
    check_command          check-esxi-hp-hardware!!!!!!!!
    register               1
}
###############################################################################
#
# Service configuration file
#
# END OF FILE
#
###############################################################################
Re: VMWare host hardware Service monitoring question
Posted: Tue May 08, 2018 2:25 pm
by tgriep
Can you go to the Core Config Manager > Commands menu and post how the check-esxi-hp-hardware command is defined?
Also, can you post the plugin here or a link to it so we can view it?
Thanks
Re: VMWare host hardware Service monitoring question
Posted: Tue May 08, 2018 2:43 pm
by tmattingly
I assume the command view is what you need?
/usr/local/nagios/libexec/check_esxi_hardware.py --host=$HOSTADDRESS$ --user=user --pass=password --vendor=hp
BTW, this is the plugin that came with Nagios (at least it appears to have been supplied by Nagios).
We are running 5.4.13.
Plugin attached.
Re: VMWare host hardware Service monitoring question
Posted: Tue May 08, 2018 3:37 pm
by tgriep
The first thing you should do is update the plugin to the latest version, which you can find at the following link.
https://www.claudiokuenzler.com/nagios- ... rdware.php
Then run the command from a shell, adding the -v option to get verbose output when the plugin runs.
Code:
/usr/local/nagios/libexec/check_esxi_hardware.py --host=xxx.xxx.xxx.xxx --user=user --pass=password --vendor=hp -v
The plugin appears to connect to the server on port 443 if no port is specified, so to test that the port is still open, run a port check as root on the Nagios server against the ESX server.
Replace xxx.xxx.xxx.xxx with the IP address of the ESX server. If the check says that the port is closed, then you will have to find out why the ESX server is not listening or accepting connections.
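One way to run such a port check without extra tools is a few lines of Python. This is a minimal sketch, not the command from the original post: the function name port_is_open and the 5-second timeout are assumptions for illustration, and port 443 is the default only because it is the port mentioned above. Pass whichever port your plugin version actually connects to.

```python
import socket


def port_is_open(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the full TCP handshake and raises OSError
        # (refused, unreachable, timed out) if the port cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, print(port_is_open("xxx.xxx.xxx.xxx")) with the real IP address of the ESX server prints True when the port accepts connections. A False result does not distinguish a closed port from a firewall silently dropping the traffic.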
Re: VMWare host hardware Service monitoring question
Posted: Wed May 09, 2018 12:27 pm
by tmattingly
The output is below. However, we did discover that this was an older VMWare 5.1 host. We did several things yesterday: we updated the blade with the latest SPP (Service Pack for ProLiant) [firmware and onboard drivers], updated to VMWare 6.5, and rebooted the system as well. There was a period of 'flappingstart', but then it seemed to settle down and it finally went GREEN. It's been that way for the last 20 hours. I will also be looking at updating the plugin per your suggestion; the latest version is 20180411, so I will be working on that.
I just couldn't understand why it went critical when nothing had changed, and that was the reason for the ticket.
Code:
20180509 09:05:29 Connection to https://xx.xx.xx.xx (not a real address)
20180509 09:05:29 Found pywbem version 0.7.0
20180509 09:05:29 Connection worked
20180509 09:05:29 Check classe OMC_SMASHFirmwareIdentity
20180509 09:05:30 Element Name = System BIOS
20180509 09:05:30 VersionString = I31
20180509 09:05:30 Check classe CIM_Chassis
20180509 09:05:31 Element Name = Chassis
20180509 09:05:31 Manufacturer = HP
20180509 09:05:31 SerialNumber = CN7351010V
20180509 09:05:31 Model = ProLiant BL460c Gen8
20180509 09:05:31 Check classe CIM_Card
20180509 09:05:31 Check classe CIM_ComputerSystem
20180509 09:05:31 Element Name = System Board 7:1
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = System Board 7:2
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = System Board 7:3
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = Add-in Card 11:1
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = Add-in Card 11:2
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = Add-in Card 11:3
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = Add-in Card 11:4
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = Add-in Card 11:5
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = Add-in Card 11:6
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = Add-in Card 11:7
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = Add-in Card 11:8
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = Add-in Card 11:9
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = Add-in Card 11:10
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = FH1VM1PROD1
20180509 09:05:31 Element Name = Hardware Management Controller (Node 0)
20180509 09:05:31 Element HealthState = 0
20180509 09:05:31 Element Name = HP Smart Array P220i Controller : Embedded : HPSA1
20180509 09:05:31 Element HealthState = 5
20180509 09:05:31 Check classe CIM_NumericSensor
20180509 09:05:32 Element Name = System Board 2 Power Meter
20180509 09:05:32 sensorType = 4 - Current
20180509 09:05:32 BaseUnits = 7
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 0.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = System Board 1 Virtual Fan
20180509 09:05:32 sensorType = 5 - Tachometer
20180509 09:05:32 BaseUnits = 65
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 22.340000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Other 6 33-Sys Exhaust
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 33.000000
20180509 09:05:32 Upper Threshold Critical = 75.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Other 5 32-Sys Exhaust
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 26.000000
20180509 09:05:32 Upper Threshold Critical = 75.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Other 4 31-System Board
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 33.000000
20180509 09:05:32 Upper Threshold Critical = 90.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Other 3 30-System Board
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 33.000000
20180509 09:05:32 Upper Threshold Critical = 90.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Add-in Card 4 25-LOM Card Zn
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 40.000000
20180509 09:05:32 Upper Threshold Critical = 75.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Add-in Card 3 24-LOM Card
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 40.000000
20180509 09:05:32 Upper Threshold Critical = 100.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Add-in Card 2 23-HDcntlr Inlet
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 40.000000
20180509 09:05:32 Upper Threshold Critical = 70.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Add-in Card 1 22-HD Controller
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 49.000000
20180509 09:05:32 Upper Threshold Critical = 100.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Battery 1 21-SuperCap Max
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 16.000000
20180509 09:05:32 Upper Threshold Critical = 65.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Power System Board 6 20-VR P2 Mem
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 29.000000
20180509 09:05:32 Upper Threshold Critical = 115.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Power System Board 5 19-VR P2 Mem
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 28.000000
20180509 09:05:32 Upper Threshold Critical = 115.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Power System Board 4 18-VR P1 Mem
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 36.000000
20180509 09:05:32 Upper Threshold Critical = 115.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Power System Board 3 17-VR P1 Mem
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 37.000000
20180509 09:05:32 Upper Threshold Critical = 115.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Power System Board 2 16-VR P2
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 33.000000
20180509 09:05:32 Upper Threshold Critical = 115.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Power System Board 1 15-VR P1
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 39.000000
20180509 09:05:32 Upper Threshold Critical = 115.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Other 2 14-Chipset Zone
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 34.000000
20180509 09:05:32 Upper Threshold Critical = 90.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Other 1 13-Chipset
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 44.000000
20180509 09:05:32 Upper Threshold Critical = 105.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Disk or Disk Bay 1 12-HD Max
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 35.000000
20180509 09:05:32 Upper Threshold Critical = 60.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Memory Device 8 11-P2 Mem Zone
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 29.000000
20180509 09:05:32 Upper Threshold Critical = 90.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Memory Device 7 10-P2 Mem Zone
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 27.000000
20180509 09:05:32 Upper Threshold Critical = 90.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Memory Device 6 09-P1 Mem Zone
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 32.000000
20180509 09:05:32 Upper Threshold Critical = 90.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Memory Device 5 08-P1 Mem Zone
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 32.000000
20180509 09:05:32 Upper Threshold Critical = 90.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Memory Device 4 07-P2 DIMM 5-8
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 28.000000
20180509 09:05:32 Upper Threshold Critical = 87.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Memory Device 3 06-P2 DIMM 1-4
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 29.000000
20180509 09:05:32 Upper Threshold Critical = 87.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Memory Device 2 05-P1 DIMM 5-8
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 32.000000
20180509 09:05:32 Upper Threshold Critical = 87.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Memory Device 1 04-P1 DIMM 1-4
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 30.000000
20180509 09:05:32 Upper Threshold Critical = 87.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Other 2 03-CPU 2
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 40.000000
20180509 09:05:32 Upper Threshold Critical = 70.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Other 1 02-CPU 1
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 40.000000
20180509 09:05:32 Upper Threshold Critical = 70.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Element Name = Other 1 01-Inlet Ambient
20180509 09:05:32 sensorType = 2 - Temperature
20180509 09:05:32 BaseUnits = 2
20180509 09:05:32 Scaled by = 0.010000
20180509 09:05:32 Current Reading = 18.000000
20180509 09:05:32 Upper Threshold Critical = 42.000000
20180509 09:05:32 Element HealthState = 5
20180509 09:05:32 Check classe CIM_Memory
20180509 09:05:33 Element Name = Proc 1 Level-1 Cache
20180509 09:05:33 Element HealthState = 0
20180509 09:05:33 Element Name = Proc 1 Level-2 Cache
20180509 09:05:33 Element HealthState = 0
20180509 09:05:33 Element Name = Proc 1 Level-3 Cache
20180509 09:05:33 Element HealthState = 0
20180509 09:05:33 Element Name = Proc 2 Level-1 Cache
20180509 09:05:33 Element HealthState = 0
20180509 09:05:33 Element Name = Proc 2 Level-2 Cache
20180509 09:05:33 Element HealthState = 0
20180509 09:05:33 Element Name = Proc 2 Level-3 Cache
20180509 09:05:33 Element HealthState = 0
20180509 09:05:33 Element Name = Memory
20180509 09:05:33 Element HealthState = 5
20180509 09:05:33 Check classe CIM_Processor
20180509 09:05:33 Element Name = Proc 1
20180509 09:05:33 Family = 179
20180509 09:05:33 CurrentClockSpeed = 2000MHz
20180509 09:05:33 Element HealthState = 5
20180509 09:05:33 Element Name = Proc 2
20180509 09:05:33 Family = 179
20180509 09:05:33 CurrentClockSpeed = 2000MHz
20180509 09:05:33 Element HealthState = 5
20180509 09:05:33 Check classe CIM_RecordLog
20180509 09:05:34 Element Name = IPMI SEL
20180509 09:05:34 Element HealthState = 5
20180509 09:05:34 Check classe OMC_DiscreteSensor
20180509 09:05:35 Element Name = Disk or Disk Bay 2 C1 P1I Bay 2: Drive Present
20180509 09:05:35 Element Name = Disk or Disk Bay 2 C1 P1I Bay 2: Drive Fault
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 2 C1 P1I Bay 2: Predictive Failure
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 2 C1 P1I Bay 2: Hot Spare
20180509 09:05:35 Element Name = Disk or Disk Bay 2 C1 P1I Bay 2: Parity Check In Progress
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 2 C1 P1I Bay 2: In Critical Array
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 2 C1 P1I Bay 2: In Failed Array
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 2 C1 P1I Bay 2: Rebuild In Progress
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 2 C1 P1I Bay 2: Rebuild Aborted
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 1 C1 P1I Bay 1: Drive Present
20180509 09:05:35 Element Name = Disk or Disk Bay 1 C1 P1I Bay 1: Drive Fault
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 1 C1 P1I Bay 1: Predictive Failure
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 1 C1 P1I Bay 1: Hot Spare
20180509 09:05:35 Element Name = Disk or Disk Bay 1 C1 P1I Bay 1: Parity Check In Progress
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 1 C1 P1I Bay 1: In Critical Array
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 1 C1 P1I Bay 1: In Failed Array
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 1 C1 P1I Bay 1: Rebuild In Progress
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = Disk or Disk Bay 1 C1 P1I Bay 1: Rebuild Aborted
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = System Board 3 Memory Status: Uncorrectable ECC
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = System Board 3 Memory Status: Correctable ECC logging limit reached
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = System Board 3 Memory Status: Presence Detected
20180509 09:05:35 Element Name = System Chassis 3 Enclosure Status: Sensor access degraded or unavailable
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = System Chassis 3 Enclosure Status: Controller access degraded or unavailable
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = System Chassis 3 Enclosure Status: Management controller off-line
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Element Name = System Chassis 1 UID Light
20180509 09:05:35 Check classe OMC_Fan
20180509 09:05:35 Element Name = Virtual Fan
20180509 09:05:35 Element HealthState = 5
20180509 09:05:35 Check classe OMC_PowerSupply
20180509 09:05:35 Check classe VMware_StorageExtent
20180509 09:05:36 Element Name = Disk 1 on HPSA1 : Port 1I Box 1 Bay 1 : 136GB : Data Disk
20180509 09:05:36 Element HealthState = 5
20180509 09:05:36 Element Name = Disk 2 on HPSA1 : Port 1I Box 1 Bay 2 : 136GB : Data Disk
20180509 09:05:36 Element HealthState = 5
20180509 09:05:36 Check classe VMware_Controller
20180509 09:05:36 Element Name = HP Smart Array P220i Controller : Embedded : HPSA1
20180509 09:05:36 Element HealthState = 5
20180509 09:05:36 Check classe VMware_StorageVolume
20180509 09:05:36 Element Name = Logical Volume 1 on HPSA1 : RAID 1 : 136GB : Disk 1,2
20180509 09:05:36 Element HealthState = 5
20180509 09:05:36 Check classe VMware_Battery
20180509 09:05:37 Element Name = Battery on HPSA1
20180509 09:05:37 Element HealthState = 5
20180509 09:05:37 Check classe VMware_SASSATAPort
OK - Server: HP ProLiant BL460c Gen8 s/n: CN7351010V System BIOS: I31 2013-12-20
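With this many sensors in the verbose output, it can help to filter for elements whose HealthState is neither 0 (Unknown) nor 5 (OK) in the DMTF CIM scheme, since values of 10 and above indicate degraded or failed components. The following is a minimal sketch that assumes the line layout shown above ("Element Name = ..." followed by "Element HealthState = N"). The function name unhealthy_elements is hypothetical and not part of the plugin.

```python
import re
from typing import List, Tuple

# CIM HealthState values treated as "nothing to report": 0 = Unknown, 5 = OK.
HEALTHY_STATES = {0, 5}

NAME_RE = re.compile(r"Element Name = (.+)$")
STATE_RE = re.compile(r"Element HealthState = (\d+)$")


def unhealthy_elements(verbose_output: str) -> List[Tuple[str, int]]:
    """Return (element name, HealthState) pairs whose state is not 0 or 5."""
    problems = []
    current_name = None
    for line in verbose_output.splitlines():
        line = line.strip()
        name_match = NAME_RE.search(line)
        if name_match:
            # Remember the most recent element; some elements have no state line.
            current_name = name_match.group(1)
            continue
        state_match = STATE_RE.search(line)
        if state_match and current_name is not None:
            state = int(state_match.group(1))
            if state not in HEALTHY_STATES:
                problems.append((current_name, state))
    return problems
```

For a run like the one above, where every HealthState is 0 or 5, the function returns an empty list, which is consistent with the final OK line. Saving the -v output while the service is still critical and passing it through the same filter would show which element triggered the alert.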
Re: VMWare host hardware Service monitoring question
Posted: Wed May 09, 2018 12:52 pm
by tgriep
You said that there was an issue with the ILO time. The plugin uses the ILO service to gather the information, and if that stopped accepting or returning data, that could be the cause.
Glad it is working now after what you did.
Re: VMWare host hardware Service monitoring question
Posted: Wed May 09, 2018 1:40 pm
by tmattingly
ILO was updated along with the other onboard drivers of the blade (it's a BL460c in an HP c7000 enclosure), since the SPP updates it, so yes, it could have been. Go ahead and archive the incident, and thank you again!
Tom
Re: VMWare host hardware Service monitoring question
Posted: Wed May 09, 2018 1:55 pm
by tgriep
You're welcome. I'll close the post and lock it. If you have any other questions in the future, feel free to open a new post.