Hi,
I'm trying to configure this check_hp plugin to work with NagiosXI. http://exchange.nagios.org/directory/Pl ... hp/details
So far, I've done the steps below. However, I'm not sure how to proceed, as there is no documentation. Would you guys be able to give me some assistance?
1) Configured SNMP communication between Linux server and Windows Client.
2) Tested check_hp script
[root@nagiostest1 check_hp-2.16]# ./check_hp -H 192.x.x.x -C public
Compaq/HP Agent Check: overall system state OK
3) Copied the check_hp script into /usr/local/nagios/libexec. Then defined check_hp as a command in CCM as $USER1$/check_hp
Thank you as always,
-klee
check_hp plugin
Re: check_hp plugin
Well, you have tested the script and created a command. The last thing you need to do is create a service check for it in the CCM. See the following doc:
http://assets.nagios.com/downloads/nagi ... ios-XI.pdf
http://assets.nagios.com/downloads/nagi ... ios-XI.pdf
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: check_hp plugin
Thanks Abrist,
I defined the host and created a service for the check_hp plugin.
define command {
command_name check_hp
command_line $USERS1$/check_hp -H $HOSTADDRESS$ -C $ARG1$ -d
}
Now I'm getting this error:
Jul 22 12:50:24 Nagios1 nagios: Warning: Return code of 127 for check of service 'Check HP with HP Insight Manager' on host '192.x.x.x' was out of bounds. Make sure the plugin you're trying to run actually exists
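For context on the error above: 127 is the exit status a POSIX shell reports when it cannot find the command it was asked to run, and Nagios only accepts plugin return codes 0 through 3, hence "out of bounds". A minimal sketch, assuming a deliberately nonexistent path (the path and the 192.0.2.10 address are placeholders, not values from this thread):

```shell
# Try to run a plugin from a path that does not exist (hypothetical path).
sh -c '/nonexistent/libexec/check_hp -H 192.0.2.10 -C public' 2>/dev/null
status=$?

# The shell could not find the command, so it reports 127.
echo "exit status: $status"
```

A wrong or misspelled path in the command definition would be one plausible way to end up with this status.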
Re: check_hp plugin
The command line variable is $USER1$ (singular, not plural).
Mine is configured as:
$USER1$/check_hp -H $HOSTADDRESS$ -C $ARG1$ $ARG2$ $ARG3$ $ARG4$
and I call it with:
$ARG1$ = Community String
$ARG2$ = -x cpqFcaHostCntlrStatus,cpqNicIfPhysAdapterStatus
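To illustrate how a command line like this ends up looking once the macros are filled in, here is a rough sketch that uses plain shell variables as stand-ins for the Nagios macros. The host address and community string are placeholder assumptions, and the sketch assumes that unused $ARGn$ macros expand to empty strings:

```shell
# Stand-ins for the Nagios macros (illustrative values, not from a real config).
USER1="/usr/local/nagios/libexec"
HOSTADDRESS="192.0.2.10"
ARG1="public"
ARG2="-x cpqFcaHostCntlrStatus,cpqNicIfPhysAdapterStatus"
ARG3=""
ARG4=""

# Build the command line and trim the trailing spaces left by the empty arguments.
cmdline=$(echo "$USER1/check_hp -H $HOSTADDRESS -C $ARG1 $ARG2 $ARG3 $ARG4" | sed 's/ *$//')
echo "$cmdline"
```

Leaving $ARG3$ and $ARG4$ in the definition costs nothing when they are empty, and keeps room for extra flags later.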
Re: check_hp plugin
Thank you belvdr.
I did indeed mistype $USER1$ as the plural $USERS1$.
I'll use your $ARG1$ $ARG2$ format and see if it works.
I do have a question though: when the CHECK_HP script is run correctly, it returns "Compaq/HP Agent Check: overall system state OK".
If that is the case, why do we have to check components individually by using: $ARG2$ = -x cpqFcaHostCntlrStatus,cpqNicIfPhysAdapterStatus ?
... and if we can delimit multiple components using commas, why do we need the additional $ARG3$ $ARG4$ ?
Much appreciated.
-klee
Re: check_hp plugin
klee wrote: ... and if we can delimit multiple components using commas, why do we need the additional $ARG3$ $ARG4$ ?
You can use $ARG3$ and $ARG4$ to pass some other flags (port, timeout, etc.).
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: check_hp plugin
Thanks lmiltchev,
Any idea on part 1 of my question?
klee wrote: ...when the CHECK_HP script is run correctly, it returns "Compaq/HP Agent Check: overall system state OK". If that is the case, why do we have to check components individually by using: $ARG2$ = -x cpqFcaHostCntlrStatus,cpqNicIfPhysAdapterStatus ?
Because, right now, I'm checking all of the following components since "./check_hp --help" claims they're supported. The check reports "overall system state OK"; meanwhile there's no tape drive installed on this server.
Currently the module supports the following components:
cpqHeThermalCpuFanStatus,
cpqNicIfLogMapStatus,
cpqHeFltTolFanCondition,
cpqDaLogDrvStatus,
cpqDaLogDrvCondition,
cpqDaTapeDrvStatus,
cpqHeFltTolPwrSupplyCondition,
cpqHeResilientMemCondition,
cpqNicIfPhysAdapterStatus,
cpqRackPowerSupplyCondition,
cpqHeFltTolPowerSupplyCondition,
cpqDaPhyDrvStatus,
cpqHeEventLogCondition,
cpqDaPhyDrvCondition,
cpqFcaHostCntlrStatus,
cpqSeCpuStatus,
cpqHeTemperatureCondition,
cpqHeThermalSystemFanStatus,
cpqDaPhyDrvSmartStatus,
cpqDaCntlrCondition,
cpqRackCommonEnclosureFanCondition
Also, is it possible to get metrics on a more granular level (i.e. status of individual components), rather than a blanket "Compaq/HP Agent Check: overall system state OK"?
Any assistance would be much appreciated.
Thanks Again,
-klee
Last edited by klee on Wed Jul 23, 2014 2:07 pm, edited 1 time in total.
Re: check_hp plugin
Just a follow up to my previous comment. I just ran the debug option and got the result below, which is really more of what I'm looking for.
./check_hp -H 192.x.x.x -C public -d cpqHeThermalCpuFanStatus,cpqNicIfLogMapStatus,cpqHeFltTolFanCondition,cpqDaLogDrvStatus,cpqDaLogDrvCondition,cpqDaTapeDrvStatus,cpqHeFltTolPwrSupplyCondition,cpqHeResilientMemCondition,cpqNicIfPhysAdapterStatus,cpqRackPowerSupplyCondition,cpqHeFltTolPowerSupplyCondition,cpqDaPhyDrvStatus,cpqHeEventLogCondition,cpqDaPhyDrvCondition,cpqFcaHostCntlrStatus,cpqSeCpuStatus,cpqHeTemperatureCondition,cpqHeThermalSystemFanStatus,cpqDaPhyDrvSmartStatus,cpqDaCntlrCondition,cpqRackCommonEnclosureFanCondition
Compaq/HP Agent Check:
cpqHeThermalCpuFanStatus.0 = 1 status of the fan(s) (other)
cpqNicIfLogMapStatus.2 = 2 status of the NIC logical group (ok)
cpqNicIfLogMapStatus.1 = 1 status of the NIC logical group (unknown)
cpqHeFltTolFanCondition.0.4 = 2 condition of the fan (0.4:ok)
cpqHeFltTolFanCondition.0.2 = 2 condition of the fan (0.2:ok)
cpqHeFltTolFanCondition.0.1 = 2 condition of the fan (0.1:ok)
cpqHeFltTolFanCondition.0.5 = 2 condition of the fan (0.5:ok)
cpqHeFltTolFanCondition.0.6 = 2 condition of the fan (0.6:ok)
cpqHeFltTolFanCondition.0.3 = 2 condition of the fan (0.3:ok)
cpqDaLogDrvStatus.2.1 = 2 logical drive status (2.1:ok)
cpqDaLogDrvCondition.2.1 = 2 logical drive and associated physical state (2.1:ok)
cpqDaTapeDrvStatus tape drive status - 1.3.6.1.4.1.232.3.2.9.1.1.8 (OID-tree not found, ignoring)
cpqHeFltTolPwrSupplyCondition.0 = 2 overall condition of power supply subsystem (ok)
cpqHeResilientMemCondition.0 = 2 condition of the memory protection subsystem (ok)
cpqNicIfPhysAdapterStatus.1 = 2 physical adapter status (ok)
cpqNicIfPhysAdapterStatus.2 = 2 physical adapter status (ok)
cpqRackPowerSupplyCondition condition of the power supply - 1.3.6.1.4.1.232.22.2.5.1.1.1.17 (OID-tree not found, ignoring)
cpqHeFltTolPowerSupplyCondition.0.2 = 2 condition of the power supply (0.2:ok)
cpqHeFltTolPowerSupplyCondition.0.1 = 2 condition of the power supply (0.1:ok)
cpqDaPhyDrvStatus.2.0 = 2 physical drive status (2.0:ok)
cpqDaPhyDrvStatus.2.1 = 2 physical drive status (2.1:ok)
cpqDaPhyDrvStatus.2.2 = 2 physical drive status (2.2:ok)
cpqHeEventLogCondition overall IML entries - 1.3.6.1.4.1.232.6.2.11.2.0 (OID-tree not found, ignoring)
cpqDaPhyDrvCondition.2.1 = 2 physical drive condition (2.1:ok)
cpqDaPhyDrvCondition.2.2 = 2 physical drive condition (2.2:ok)
cpqDaPhyDrvCondition.2.0 = 2 physical drive condition (2.0:ok)
cpqFcaHostCntlrStatus fibre channel host controller status - 1.3.6.1.4.1.232.16.2.7.1.1.4 (OID-tree not found, ignoring)
cpqSeCpuStatus.1 = 2 CPU status (ok)
cpqSeCpuStatus.0 = 2 CPU status (ok)
cpqHeTemperatureCondition.0.20 = 2 temperature sensor condition (0.20:ok)
cpqHeTemperatureCondition.0.10 = 2 temperature sensor condition (0.10:ok)
cpqHeTemperatureCondition.0.3 = 2 temperature sensor condition (0.3:ok)
cpqHeTemperatureCondition.0.8 = 2 temperature sensor condition (0.8:ok)
cpqHeTemperatureCondition.0.30 = 2 temperature sensor condition (0.30:ok)
cpqHeTemperatureCondition.0.21 = 2 temperature sensor condition (0.21:ok)
cpqHeTemperatureCondition.0.25 = 2 temperature sensor condition (0.25:ok)
cpqHeTemperatureCondition.0.12 = 2 temperature sensor condition (0.12:ok)
cpqHeTemperatureCondition.0.1 = 2 temperature sensor condition (0.1:ok)
cpqHeTemperatureCondition.0.9 = 2 temperature sensor condition (0.9:ok)
cpqHeTemperatureCondition.0.2 = 2 temperature sensor condition (0.2:ok)
cpqHeTemperatureCondition.0.23 = 2 temperature sensor condition (0.23:ok)
cpqHeTemperatureCondition.0.19 = 2 temperature sensor condition (0.19:ok)
cpqHeTemperatureCondition.0.4 = 2 temperature sensor condition (0.4:ok)
cpqHeTemperatureCondition.0.7 = 2 temperature sensor condition (0.7:ok)
cpqHeTemperatureCondition.0.6 = 2 temperature sensor condition (0.6:ok)
cpqHeTemperatureCondition.0.29 = 2 temperature sensor condition (0.29:ok)
cpqHeTemperatureCondition.0.26 = 2 temperature sensor condition (0.26:ok)
cpqHeTemperatureCondition.0.22 = 2 temperature sensor condition (0.22:ok)
cpqHeTemperatureCondition.0.5 = 2 temperature sensor condition (0.5:ok)
cpqHeTemperatureCondition.0.11 = 2 temperature sensor condition (0.11:ok)
cpqHeTemperatureCondition.0.24 = 2 temperature sensor condition (0.24:ok)
cpqHeThermalSystemFanStatus.0 = 2 status of the processor fan(s) (ok)
cpqDaPhyDrvSmartStatus.2.2 = 2 physical drive S.M.A.R.T status (2.2:ok)
cpqDaPhyDrvSmartStatus.2.0 = 2 physical drive S.M.A.R.T status (2.0:ok)
cpqDaPhyDrvSmartStatus.2.1 = 2 physical drive S.M.A.R.T status (2.1:ok)
cpqDaCntlrCondition.2 = 2 controller status (ok)
cpqRackCommonEnclosureFanCondition condition of the rack fan - 1.3.6.1.4.1.232.22.2.3.1.3.1.11 (OID-tree not found, ignoring)
I will, of course, remove the unrecognized hardware.
This brings us to the statement below from the CHECK_HP author, Günther Mair:
Please do not misread the "-d" parameter! The "-d" parameter stands for "DEBUG" and is not intended for production use inside Nagios! check_hp will give you information about which objects failed if there are any.
Can anyone actually attest to how this monitor is supposed to be run? As I mentioned earlier, I was hoping the monitor would report more detail than just "overall system state OK".
However, if that is the standard, then I'll have to be OK with that. Sorry, but I'm totally new to this.
Any advice would be greatly appreciated.
-klee
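As a side note on removing the unrecognized hardware: the components the agent does not expose are the ones flagged "OID-tree not found, ignoring" in the debug output. A minimal sketch of pulling those names out with awk, using a few sample lines copied from the output in this thread (the variable names are just for illustration):

```shell
# Sample lines taken from the debug output posted in this thread.
debug_output='cpqDaLogDrvCondition.2.1 = 2 logical drive and associated physical state (2.1:ok)
cpqDaTapeDrvStatus tape drive status - 1.3.6.1.4.1.232.3.2.9.1.1.8 (OID-tree not found, ignoring)
cpqSeCpuStatus.1 = 2 CPU status (ok)
cpqFcaHostCntlrStatus fibre channel host controller status - 1.3.6.1.4.1.232.16.2.7.1.1.4 (OID-tree not found, ignoring)'

# Keep the first field of every "not found" line and join the names with commas.
missing=$(printf '%s\n' "$debug_output" | awk '/OID-tree not found/ { print $1 }' | paste -sd, -)
echo "$missing"
```

In practice the same awk filter could be fed directly from a manual debug run of check_hp instead of the sample text.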
Re: check_hp plugin
klee wrote: Can anyone actually attest to how this monitor is supposed to be run. As I mentioned earlier, I was hoping the monitor would report more detail than just "overall system state OK".
What happens when one of the components is in a "non-OK" state? Do you get more details then?
As this is a 3rd party plugin, your best bet would be to contact the plugin's author and request more info on the usage.
Re: check_hp plugin
I’ve confirmed with the author of the CHECK_HP plugin that the "overall system state OK" message is the standard return from this script when no problem is found.
If one or more problems are found, you will get the descriptions (in the format of debug mode) plus the respective error codes instead.
Issue resolved, please close thread.
Thanks,
-klee
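For anyone reading later: the "respective error codes" presumably follow the general Nagios plugin convention for return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). That mapping is the standard plugin convention, not something confirmed for check_hp in this thread. A small sketch, with state_name as a hypothetical helper:

```shell
# Map a plugin exit status to the conventional Nagios state name.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT OF BOUNDS" ;;   # anything else, e.g. 127
    esac
}

state_name 0
state_name 2
state_name 127
```

When testing by hand, running the plugin and then calling state_name with $? shows which state Nagios would record for that run.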