That folder stores the state information from the last time the check was run, so the plugin can compare it with the current run for performance data calculations.
Some plugins have the ability to disable saving of the state information, but I don't think check_nwc_health has that ability.
Try running the plugin with --help to see if there is a command-line option to disable the state history, and use that option when running the checks.
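To illustrate why a plugin keeps state between runs, here is a minimal bash sketch of a rate calculation based on a saved sample. It is not check_nwc_health's actual implementation; the function name `rate_from_state`, the file name and the sample numbers are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive a per-second rate from a counter by
# remembering the previous sample (timestamp and counter) in a state file.
rate_from_state() {
    local state_file="$1" now="$2" counter="$3"
    local prev_time prev_counter
    # Load the sample saved by the previous run, if there is one.
    if [ -r "$state_file" ]; then
        read -r prev_time prev_counter < "$state_file"
    fi
    # Save the current sample for the next run.
    printf '%s %s\n' "$now" "$counter" > "$state_file"
    # Without an older sample there is nothing to compare against.
    if [ -z "$prev_time" ] || [ "$now" -le "$prev_time" ]; then
        echo "no previous sample"
        return 0
    fi
    # Integer division: counter delta over elapsed seconds.
    echo $(( (counter - prev_counter) / (now - prev_time) ))
}

state_dir="$(mktemp -d)"
rate_from_state "$state_dir/if.state" 1000 5000    # prints "no previous sample"
rate_from_state "$state_dir/if.state" 1300 35000   # prints 100
rm -rf "$state_dir"
```

If the state file is deleted or cannot be written, every run looks like the first one, so no rate can be reported.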
[nagios@nms1 mibs]$ /usr/local/nagios/libexec/check_nwc_health --help
check_nwc_health $Revision: 4.2 $ [http://labs.consol.de/nagios/check_nwc_health]
This monitoring plugin is free software, and comes with ABSOLUTELY NO WARRANTY.
It may be used, redistributed and/or modified under the terms of the GNU
General Public Licence (see http://www.fsf.org/licensing/licenses/gpl.txt).
This plugin checks various parameters of network components
Usage: check_nwc_health [ -v|--verbose ] [ -t <timeout> ] --mode <what-to-do> --hostname <network-component> --community <snmp-community> ...]
-?, --usage
Print usage information
-h, --help
Print detailed help screen
-V, --version
Print version information
-t, --timeout=INTEGER
Seconds before plugin times out (default: 15)
-v, --verbose
Show details for command-line debugging (can repeat up to 3 times)
--hostname
Hostname or IP-address of the switch or router
--port
The SNMP port to use (default: 161)
--domain
The transport domain to use (default: udp/ipv4, other possible values: udp6, udp/ipv6, tcp, tcp4, tcp/ipv4, tcp6, tcp/ipv6)
--protocol
The SNMP protocol to use (default: 2c, other possibilities: 1,3)
--community
SNMP community of the server (SNMP v1/2 only)
--username
The securityName for the USM security model (SNMPv3 only)
--authpassword
The authentication password for SNMPv3
--authprotocol
The authentication protocol for SNMPv3 (md5|sha)
--privpassword
The password for authPriv security level
--privprotocol
The private protocol for SNMPv3 (des|aes|aes128|3des|3desde)
--contextengineid
The context engine id for SNMPv3 (10 to 64 hex characters)
--contextname
The context name for SNMPv3 (empty represents the "default" context)
--community2
SNMP community which can be used to switch the context during runtime
--snmpwalk
A file with the output of a snmpwalk (used for simulation)
Use it instead of --hostname
--servertype
The type of the network device: cisco (default). Use it if auto-detection
is not possible
--oids
A list of oids which are downloaded and written to a cache file.
Use it together with --mode oidcache
--offline
The maximum number of seconds since the last update of cache file before
it is considered too old
--mode
A keyword which tells the plugin what to do
hardware-health (Check the status of environmental equipment (fans, temperatures, power))
cpu-load (Check the CPU load of the device)
memory-usage (Check the memory usage of the device)
interface-usage (Check the utilization of interfaces)
interface-errors (Check the error-rate of interfaces (without discards))
interface-discards (Check the discard-rate of interfaces)
interface-status (Check the status of interfaces (oper/admin))
interface-nat-count-sessions (Count the number of nat sessions)
interface-nat-rejects (Count the number of nat sessions rejected due to lack of resources)
list-interfaces (Show the interfaces of the device and update the name cache)
list-interfaces-detail (Show the interfaces of the device and some details)
interface-availability (Show the availability (oper != up) of interfaces)
link-aggregation-availability (Check the percentage of up interfaces in a link aggregation)
list-routes (Check the percentage of up interfaces in a link aggregation)
route-exists (Check if a route exists. (--name is the dest, --name2 check also the next hop))
count-routes (Count the routes. (--name is the dest, --name2 is the hop))
vpn-status (Check the status of vpns (up/down))
create-shinken-service (Create a Shinken service definition)
hsrp-state (Check the state in a HSRP group)
hsrp-failover (Check if a HSRP group's nodes have changed their roles)
list-hsrp-groups (Show the HSRP groups configured on this device)
bgp-peer-status (Check status of BGP peers)
count-bgp-peers (Count the number of BGP peers)
watch-bgp-peers (Watch BGP peers appear and disappear)
list-bgp-peers (Show BGP peers known to this device)
count-bgp-prefixes (Count the number of BGP prefixes (for specific peer with --name))
ospf-neighbor-status (Check status of OSPF neighbors)
list-ospf-neighbors (Show OSPF neighbors)
ha-role (Check the role in a ha group)
svn-status (Check the status of the svn subsystem)
mngmt-status (Check the status of the management subsystem)
fw-policy (Check the installed firewall policy)
fw-connections (Check the number of firewall policy connections)
session-usage (Check the session limits of a load balancer)
security-status (Check if there are security-relevant incidents)
pool-completeness (Check the members of a load balancer pool)
pool-connections (Check the number of connections of a load balancer pool)
pool-complections (Check the members and connections of a load balancer pool)
list-pools (List load balancer pools)
check-licenses (Check the installed licences/keys)
count-users (Count the (connected) users/sessions)
check-config (Check the status of configs (cisco, unsaved config changes))
check-connections (Check the quality of connections)
count-connections (Check the number of connections (-client, -server is possible))
watch-fexes (Check if FEXes appear and disappear (use --lookup))
accesspoint-status (Check the status of access points)
count-accesspoints (Check if the number of access points is within a certain range)
watch-accesspoints (Check if access points appear and disappear (use --lookup))
list-accesspoints (List access points managed by this device)
phone-cm-status (Check if the callmanager is up)
phone-status (Check the number of registered/unregistered/rejected phones)
list-smart-home-devices (List Fritz!DECT 200 plugs managed by this device)
smart-home-device-status (Check if a Fritz!DECT 200 plug is on)
smart-home-device-energy (Show the current power consumption of a Fritz!DECT 200 plug)
smart-home-device-consumption (Show the cumulated power consumption of a Fritz!DECT 200 plug)
uptime (Check the uptime of the device)
walk (Show snmpwalk command with the oids necessary for a simulation)
supportedmibs (Shows the names of the mibs which this devices has implemented (only lausser may run this command))
--regexp
Parameter name/name2/name3 will be interpreted as (perl) regular expression
--warning
The warning threshold
--critical
The critical threshold
--warningx
The extended warning thresholds
e.g. --warningx db_msdb_free_pct=6: to override the threshold for a
specific item
--criticalx
The extended critical thresholds
--units
One of %, B, KB, MB, GB, Bit, KBi, MBi, GBi. (used for e.g. mode interface-usage)
--name
The name of an interface (ifDescr) or pool or ...
--name2
The secondary name of a component
--name3
The tertiary name of a component
--blacklist
Blacklist some (missing/failed) components
--mitigation
The parameter allows you to change a critical error to a warning.
--lookback
The amount of time you want to look back when calculating average rates.
Use it for mode interface-errors or interface-usage. Without --lookback
the time between two runs of check_nwc_health is the base for calculations.
If you want your checkresult to be based for example on the past hour,
use --lookback 3600.
--environment
Add a variable to the plugin's environment
--negate
Emulate the negate plugin. --negate warning=critical --negate unknown=critical
--morphmessage
Modify the final output message
--morphperfdata
The parameter allows you to change performance data labels.
It's a perl regexp and a substitution.
Example: --morphperfdata '(.*)ISATAP(.*)'='$1patasi$2'
--selectedperfdata
The parameter allows you to limit the list of performance data. It's a perl regexp.
Only matching perfdata show up in the output
--report
Can be used to shorten the output
--multiline
Multiline output
--with-mymodules-dyn-dir
Add-on modules for the my-modes will be searched in this directory
--statefilesdir
An alternate directory where the plugin can save files
--isvalidtime
Signals the plugin to return OK if now is not a valid check time
--alias
The alias name of a 64bit-interface (ifAlias)
--ifspeedin
Override the ifspeed oid of an interface (only inbound)
--ifspeedout
Override the ifspeed oid of an interface (only outbound)
--ifspeed
Override the ifspeed oid of an interface
--role
The role of this device in a hsrp group (active/standby/listen)
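The help output above does not list an option to switch state saving off, but it does list --statefilesdir for choosing where the files go. A small bash sketch of how such a command line could be put together follows. It only prints the command instead of running it, and the helper name `nwc_check_cmd`, the host `switch1`, the community `public` and the directory `/var/tmp/nwc_state` are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Print (not run) a check_nwc_health command line that uses --statefilesdir.
# The host name and directory passed in below are hypothetical placeholders.
nwc_check_cmd() {
    local host="$1" state_dir="$2"
    echo "/usr/local/nagios/libexec/check_nwc_health" \
         "--mode interface-usage" \
         "--hostname $host --community public" \
         "--statefilesdir $state_dir"
}

nwc_check_cmd switch1 /var/tmp/nwc_state
```

The directory would need to be writable by the user that runs the checks (nagios in the prompt shown above).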
The only threshold I can think of would be for temperature, and I'm not even sure you can specify that. By default it looks like the alert threshold is set to 60, which is a good alert threshold if that's Celsius.
OK, so without any threshold info, if a fan or PSU failed, how would this be shown in Nagios? Would it bring up some sort of red alarm indication?
The -t 60 is the timeout, not the temperature. I tested it by turning it down to 1, and the request timed out.
Yeah, that's not why I thought the alert was 60 though. We got a warning alarm on temp once and it was like 68 or something and it cleared after a bit when it went back below 60. I could be wrong on that threshold though, that was like a year ago that this happened.
Since it's not our plugin, I can't tell you where it's coming from. Mine sets the warning threshold at 65, though, even if I try to set it explicitly. Furthermore, it doesn't set a critical threshold:
[jrdalrymple@localhost libexec]$ ./check_nwc_health --mode hardware-health --hostname <switch1> --community public --warning=15 --critical=20
OK - environmental hardware working fine | 'temp_1005'=43;65;;;
I can't promise that the threshold indicated is even honored - again not our software.
Reading through their bugs and such indicates to me that those thresholds should work, but perhaps it's only for specific hardware. That plugin monitors a zillion different network devices.
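For anyone reading the output above: the part after the pipe is standard Nagios performance data in the form label=value;warn;crit;min;max, so 65 is the warning threshold and the critical field is empty. A minimal bash sketch that splits such an item is below; the function name `parse_perfdata` is made up for illustration.

```shell
#!/usr/bin/env bash
# Split one Nagios perfdata item of the form 'label'=value;warn;crit;min;max.
parse_perfdata() {
    local item="$1"
    local sq="'"
    local label="${item%%=*}"     # text before the first '='
    local fields="${item#*=}"     # text after the first '='
    label=${label//$sq/}          # drop the single quotes around the label
    local value warn crit min max
    # Empty fields between semicolons stay empty; min and max are read but not printed.
    IFS=';' read -r value warn crit min max <<< "$fields"
    printf 'label=%s value=%s warn=%s crit=%s\n' \
        "$label" "$value" "${warn:-unset}" "${crit:-unset}"
}

parse_perfdata "'temp_1005'=43;65;;;"
# prints: label=temp_1005 value=43 warn=65 crit=unset
```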
vijilants wrote: Now I need to find a plugin for Cisco memory utilisation!
Would you mind opening a new topic for this for the sake of organization? I would like to close this one since the issue has been resolved and the topic has gotten quite long.