NETAPP + SNMP TIMEOUTS + UNNECESSARY NOTIFICATIONS

ebardellidoxee
Posts: 6
Joined: Wed Jul 10, 2013 6:00 am

Post by ebardellidoxee »

|||| Nagios 3.2.3 ||||


Hi everybody,
I'm just trying to figure out how to fix this:

Our NetApp storage is not reliable over SNMP and misses lots of checks due to timeouts (we have already engaged NetApp's support, but with very low expectations of them fixing it ;-)

Code:

[1373448417] SERVICE ALERT: NETAPPSAN ;check_manu_test;UNKNOWN;SOFT;1;NAF UNKNOWN - Timeout - no SNMP answer from 192.168....

The timeout is reported as an UNKNOWN state (as defined by service_check_timeout_state=u). When a volume is already in a HARD WARNING or CRITICAL state, the timeout counts as a HARD state change, and a notification is, correctly, sent out. However, this happens constantly and drowns out the useful notifications:

Code:

[1373449127] SERVICE ALERT: NETAPPSAN ;check_manu_test;CRITICAL;HARD;5;NAF CRITICAL - 1 CRITICAL: vol_data:volmanutest, 2 OK: vol_snap:volmanutest vol_files:volmanutest
[1373449127] SERVICE NOTIFICATION: ebardelli-email;NETAPPSAN ;check_manu_test;CRITICAL;notify-service-by-email;NAF CRITICAL - 1 CRITICAL: vol_data:volmanutest, 2 OK: vol_snap:volmanutest vol_files:volmanutest
[1373449457] SERVICE ALERT: NETAPPSAN ;check_manu_test;UNKNOWN;HARD;5;(Service Check Timed Out)
[1373450337] SERVICE ALERT: NETAPPSAN ;check_manu_test;CRITICAL;HARD;5;NAF CRITICAL - 1 CRITICAL: vol_data:volmanutest, 2 OK: vol_snap:volmanutest vol_files:volmanutest
[1373450337] SERVICE NOTIFICATION: ebardelli-email;NETAPPSAN ;check_manu_test;CRITICAL;notify-service-by-email;NAF CRITICAL - 1 CRITICAL: vol_data:volmanutest, 2 OK: vol_snap:volmanutest vol_files:volmanutest

It would be nice if timeouts < N did not generate a HARD state change, while timeouts > N would mean something more serious, in which case it is fine for a notification to go out.

I don't think this would be recognized as flapping... would it?
Any ideas?

Thanks
+Emanuele
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: NETAPP + SNMP TIMEOUTS + UNNECESSARY NOTIFICATIONS

Post by sreinhardt »

What plugin are you using? Is it possible to extend the timeout via this plugin?
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: NETAPP + SNMP TIMEOUTS + UNNECESSARY NOTIFICATIONS

Post by abrist »

You could increase retry amounts. Otherwise you may want to custom script a plugin to be a bit more forgiving with the timeouts.
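
One way to read the "more forgiving" custom plugin idea is a thin wrapper around the real check. The sketch below is only an illustration and is not part of check_naf.py: it assumes Python 3.7+, and the function names, the state-file location and the default of 3 tolerated timeouts are all made up for the example. It treats a hung subprocess, or an UNKNOWN result whose text contains "Timeout" (as in the log earlier in this thread), as a timeout, and repeats the last real result until more than max_timeouts occur in a row.

```python
import json
import os
import subprocess
import tempfile

UNKNOWN = 3  # standard Nagios plugin exit code for UNKNOWN

# Hypothetical location for the cached result; use one file per wrapped service.
DEFAULT_STATE_FILE = os.path.join(tempfile.gettempdir(), "forgiving_check.json")


def run_plugin(cmd, timeout_s):
    """Run the real plugin; return (exit_code, first_output_line), or None if it hangs."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None
    lines = proc.stdout.strip().splitlines()
    return proc.returncode, (lines[0] if lines else "")


def load_state(path):
    """Read the cached result, or start from UNKNOWN if nothing is cached yet."""
    try:
        with open(path) as fh:
            return json.load(fh)
    except (OSError, ValueError):
        return {"code": UNKNOWN, "output": "UNKNOWN - no previous result", "timeouts": 0}


def save_state(path, state):
    """Persist the cached result and the consecutive-timeout counter."""
    with open(path, "w") as fh:
        json.dump(state, fh)


def forgiving_check(cmd, state_file=DEFAULT_STATE_FILE, max_timeouts=3,
                    timeout_s=50, runner=run_plugin):
    """Return (exit_code, output), masking up to max_timeouts consecutive timeouts."""
    state = load_state(state_file)
    result = runner(cmd, timeout_s)
    timed_out = result is None or (result[0] == UNKNOWN and "Timeout" in result[1])

    if not timed_out:
        # A real answer: cache it and reset the timeout counter.
        save_state(state_file, {"code": result[0], "output": result[1], "timeouts": 0})
        return result

    state["timeouts"] += 1
    save_state(state_file, state)
    if state["timeouts"] <= max_timeouts:
        # Repeat the last real result so Nagios sees no state change.
        note = " (stale, timeout %d of %d tolerated)" % (state["timeouts"], max_timeouts)
        return state["code"], state["output"] + note
    return UNKNOWN, "UNKNOWN - %d consecutive SNMP timeouts" % state["timeouts"]
```

To use it as a plugin, call forgiving_check() with the real check command as a list, print the returned output and exit with the returned code. Keeping timeout_s below service_check_timeout lets the wrapper answer before Nagios kills it. The trade-off is that a masked result can be stale: a volume that recovers or fills up during the tolerated window is only reported once SNMP answers again.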
ebardellidoxee
Posts: 6
Joined: Wed Jul 10, 2013 6:00 am

Re: NETAPP + SNMP TIMEOUTS + UNNECESSARY NOTIFICATIONS

Post by ebardellidoxee »

I'm using check_naf.py by team(ix)

define service {
    service_description     check_manu_test
    check_command           check_naf_volume_manu!/vol/volmanutest!80%!90%
    host_name               NETAPPSAN
    check_period            24x7
    notification_period     24x7
    contact_groups          +admins-email
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    notification_interval   0
    notification_options    w,c,r,f,s
    notifications_enabled   1
    event_handler_enabled   0
}

1) Modify the plugin? I'll keep studying Python...

2) Increase the timeout (currently service_check_timeout=60)? The storage doesn't answer SNMP when it is busy, and it usually stays busy for much longer than that. What would a fair timeout be? 5 or 10 minutes? Isn't that too much?

3) Retry amounts: is that the retry_interval parameter?

Thanks!!!
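
On question 3: in Nagios the number of retries is set by max_check_attempts, while retry_interval is the wait between those retries. As a rough sketch, assuming the default interval_length of 60 seconds (not shown in this thread) and a service that starts from a HARD OK state, the delay between the first bad result and the HARD state can be estimated as below. The function name is made up for the example.

```python
def seconds_until_hard(max_check_attempts, retry_interval, interval_length=60):
    """Estimate the delay from the first non-OK result to the HARD state.

    The first failing check is attempt 1; each further attempt is scheduled
    retry_interval units later, and the state goes HARD on the last attempt.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return (max_check_attempts - 1) * retry_interval * interval_length


# Values from the service definition above: 5 attempts, 1-minute retries.
print(seconds_until_hard(5, 1))   # 240
# Ten attempts at the same retry interval.
print(seconds_until_hard(10, 1))  # 540
```

Note that this only helps a service that is currently OK. In the log earlier in the thread the service was already in a HARD CRITICAL state, and the timeout shows up immediately as UNKNOWN;HARD;5, so more attempts alone would not suppress that change.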
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: NETAPP + SNMP TIMEOUTS + UNNECESSARY NOTIFICATIONS

Post by slansing »

You should be able to run the plugin with the "-h" flag from the command line to see its usage options. Plugins that follow our guidelines have a timeout flag you can add to the command to increase the timeout value.
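
For reference, a timeout flag in a Python plugin is usually just an option plus an alarm. This is a minimal sketch, not code from check_naf.py: the -t/--timeout spelling follows the plugin guidelines, while the default of 10 seconds, the function names and the message text are assumptions for illustration. It relies on signal.alarm, so it is Unix-only.

```python
import argparse
import signal
import sys

UNKNOWN = 3  # standard Nagios plugin exit code for UNKNOWN


def parse_args(argv):
    """Parse a host and an optional timeout in seconds."""
    parser = argparse.ArgumentParser(description="Hypothetical SNMP check")
    parser.add_argument("-H", dest="host", required=True, help="Host to check")
    parser.add_argument("-t", "--timeout", type=int, default=10,
                        help="Seconds before giving up (assumed default: 10)")
    return parser.parse_args(argv)


def on_alarm(signum, frame):
    """Signal handler: report UNKNOWN and exit when the alarm fires."""
    print("UNKNOWN - plugin timed out")
    sys.exit(UNKNOWN)


def arm_timeout(seconds):
    """Ask the OS to deliver SIGALRM after the given number of seconds."""
    signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)


# Typical wiring inside the plugin's entry point:
#   args = parse_args(sys.argv[1:])
#   arm_timeout(args.timeout)
#   ... run the SNMP queries here ...
#   signal.alarm(0)  # cancel the alarm once the check has finished
```

A plugin-level timeout larger than service_check_timeout has no effect, because Nagios kills the check first, so the two values need to be raised together.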
ebardellidoxee
Posts: 6
Joined: Wed Jul 10, 2013 6:00 am

Re: NETAPP + SNMP TIMEOUTS + UNNECESSARY NOTIFICATIONS

Post by ebardellidoxee »

### ./check_naf.py -h
Usage: check_naf.py [options]

Monitoring NetApp(tm) FAS systems

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOST               Host to check
  -P 1                  SNMP protocol version
  -C public             SNMP v1/v2c community OR SNMP v3 quadruple
  --snmpcmdlinepath=/usr/bin/
                        Path to "snmpget" and "snmpwalk"
  --nonetsnmp           Do not use NET-SNMP python bindings
  --separator=,         Separator for check/target/warn/crit
  --subseparator=+      Separator for multiple checks or targets
  --check=CHECK         OBSOLETE - use new syntax!
  --target=TARGET       OBSOLETE - use new syntax!
  -w WARN               OBSOLETE - use new syntax!
  -c CRIT               OBSOLETE - use new syntax!
  --snmpwalk=SNMPWALKOID
                        DEBUG: "list" OIDs or SNMPWALK it
  -v, --verbose         Verbosity, more for more ;-)

### ./check_naf.py --version
0.9


I don't see a timeout flag.
I checked, and this looks like the latest version; there is no newer one.
ebardellidoxee
Posts: 6
Joined: Wed Jul 10, 2013 6:00 am

Re: NETAPP + SNMP TIMEOUTS + UNNECESSARY NOTIFICATIONS

Post by ebardellidoxee »

Maybe you know of some other NetApp filer plugins I could try?
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: NETAPP + SNMP TIMEOUTS + UNNECESSARY NOTIFICATIONS

Post by abrist »

ebardellidoxee
Posts: 6
Joined: Wed Jul 10, 2013 6:00 am

Re: NETAPP + SNMP TIMEOUTS + UNNECESSARY NOTIFICATIONS

Post by ebardellidoxee »

Thanks.
I’ll try http://exchange.nagios.org/directory/Pl ... NG/details (the best reviewed) to rule out a plugin problem.

Looking further, is there any way to force Nagios to ignore timeouts for specific services?
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: NETAPP + SNMP TIMEOUTS + UNNECESSARY NOTIFICATIONS

Post by slansing »

Well, if the service's check command sets a timeout of its own, you could remove it. Otherwise, the nagios.cfg file holds a global timeout value (service_check_timeout). You can change it, but it is a useful safeguard against runaway checks.
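
To see what the global setting currently is, the relevant directives can be pulled out of nagios.cfg. This is a small sketch, assuming the usual layout of one name=value pair per line with # comments. The helper name and the sample text are made up, and only the two directive names mentioned earlier in this thread are looked up.

```python
def read_timeout_settings(cfg_text):
    """Return the service check timeout directives found in nagios.cfg text."""
    wanted = ("service_check_timeout", "service_check_timeout_state")
    found = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not a name=value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() in wanted:
            found[name.strip()] = value.strip()
    return found


# Made-up sample using the values quoted in this thread.
sample = """
# global limit for service checks, in seconds
service_check_timeout=60
service_check_timeout_state=u
"""
print(read_timeout_settings(sample))
```

Against a real installation, pass in the contents of your own nagios.cfg; its path varies by install, so it is left out here.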