check_ifoperstatnag - No info being retrieved
Posted: Wed Feb 25, 2015 2:28 pm
by brdr
Hi,
We are using Nagios XI 2014R2.3
I get literally hundreds of the messages below per day for port status checks. The associated bandwidth check for the same port does NOT have this issue. Further, the issue only comes up on the 1st attempt of max_attempts; the second attempt always works. I can run check_ifoperstatnag from the command line and never see this error.
Any idea why we are getting this error? I looked at the boards and didn't see a solution. Thanks.
[1424890841] SERVICE ALERT: a1.bdfrma01;netapp controller 1 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[1424890990] SERVICE ALERT: a2.bstnma01;san-port-channel-2 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[1424890990] SERVICE ALERT: a3.bstnma01;fc1/28 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[1424891001] SERVICE ALERT: a4.bstnma01;mgmt0 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[1424891001] SERVICE ALERT: a8.bstnma01;fc1/32 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[1424891001] SERVICE ALERT: a7.bstnma01;san-port-channel-2 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[1424891001] SERVICE ALERT: a2.bstnma01;fc1/28 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[1424891011] SERVICE ALERT: a1.bdfrma01;Member of SAN-Port-Channel to MDS-01 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
Re: check_ifoperstatnag - No info being retrieved
Posted: Wed Feb 25, 2015 2:50 pm
by abrist
It could be that the initial SNMP check times out and that the info is then cached on the device temporarily, causing the second check to return much faster. Try running the check from the CLI with "time" in order to time a cold check. If it takes longer than the timeout allows, but returns faster on the second attempt, this is most likely the issue and an increase in timeouts should resolve it for you.
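A rough sketch of the cold-versus-warm timing comparison described above. `time_check` is a hypothetical helper, not something shipped with Nagios XI, and it assumes GNU `date` (for the `%N` nanosecond field). The plugin arguments in the usage comment are placeholders to fill in for your own device.

```shell
# Hypothetical helper: run any command once, discard its output,
# and report its exit code and wall-clock time in milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
time_check() (
    start_ns=$(date +%s%N)
    "$@" >/dev/null 2>&1
    rc=$?
    end_ns=$(date +%s%N)
    echo "exit=$rc elapsed_ms=$(( (end_ns - start_ns) / 1000000 ))"
)

# Usage (placeholders, run on the Nagios XI server):
#   time_check /usr/local/nagios/libexec/check_ifoperstatnag <ifindex> <snmp options> <ip address>
# Run it once after the check has been idle for a while (cold),
# then again right away (warm), and compare the two elapsed_ms values.
```

If the cold run is much slower than the warm one and comes close to the check timeout, that would fit the caching explanation.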
Re: check_ifoperstatnag - No info being retrieved
Posted: Thu Feb 26, 2015 9:06 am
by brdr
I can see the timeout in XI for the snmpwalk in the check_ifoperstatnag script, and sometimes on the subsequent snmpget as well. What timeout value does this script use? Does it use max_execution_time from php.ini?
I picked one service check from yesterday morning that illustrates the UNKNOWN behavior. This port status service check is set to check every 2 minutes. It appears from the nagios.log that something is happening about every hour that forces the check to time out. Where is this cache?
[Wed Feb 25 00:23:51 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[Wed Feb 25 00:24:51 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;OK;SOFT;2;OK - Interface fc1/30 (index 16896000) is up.
[Wed Feb 25 01:43:31 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[Wed Feb 25 01:44:21 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;OK;SOFT;2;OK - Interface fc1/30 (index 16896000) is up.
[Wed Feb 25 02:43:31 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[Wed Feb 25 02:44:12 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;OK;SOFT;2;OK - Interface fc1/30 (index 16896000) is up.
[Wed Feb 25 03:43:21 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[Wed Feb 25 03:44:12 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;OK;SOFT;2;OK - Interface fc1/30 (index 16896000) is up.
[Wed Feb 25 04:33:31 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[Wed Feb 25 04:34:12 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;OK;SOFT;2;OK - Interface fc1/30 (index 16896000) is up.
[Wed Feb 25 05:33:12 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[Wed Feb 25 05:34:01 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;OK;SOFT;2;OK - Interface fc1/30 (index 16896000) is up.
[Wed Feb 25 06:13:31 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[Wed Feb 25 06:14:22 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;OK;SOFT;2;OK - Interface fc1/30 (index 16896000) is up.
[Wed Feb 25 07:13:21 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[Wed Feb 25 07:14:12 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;OK;SOFT;2;OK - Interface fc1/30 (index 16896000) is up.
[Wed Feb 25 08:13:21 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;UNKNOWN;SOFT;1;UNKNOWN - No info is being retrieved.
[Wed Feb 25 08:14:12 2015] SERVICE ALERT: x.bstnma01;fc1/30 Status;OK;SOFT;2;OK - Interface fc1/30 (index 16896000) is up.
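To put numbers on the "about every hour" pattern, the gaps between first-attempt UNKNOWNs can be computed from an excerpt like the one above. This is only a sketch: `unknown_gaps` is a hypothetical helper, it assumes the human-readable timestamp format pasted here (not the raw epoch form nagios.log normally uses), and it assumes all lines fall on the same day and belong to a single service.

```shell
# Hypothetical helper: print the gap in whole minutes between
# consecutive "UNKNOWN;SOFT;1" alerts.
# Input lines look like: [Wed Feb 25 00:23:51 2015] SERVICE ALERT: ...
# Reads from the files given as arguments, or from stdin if none.
unknown_gaps() {
    awk '/;UNKNOWN;SOFT;1;/ {
        split($4, t, ":")                       # $4 is HH:MM:SS
        now = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
        if (seen) printf "%s gap_min=%d\n", $4, (now - prev) / 60
        prev = now
        seen = 1
    }' "$@"
}

# Usage: save the excerpt for one service to a file, then
#   unknown_gaps excerpt.txt
```

Fairly regular gaps would point to something periodic on the device or the network rather than random packet loss.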
Re: check_ifoperstatnag - No info being retrieved
Posted: Thu Feb 26, 2015 9:27 am
by scottwilkerson
Can you confirm which version of the Switch / Router Wizard you are running? You can find it under:
Admin -> Manage Config Wizards
Re: check_ifoperstatnag - No info being retrieved
Posted: Thu Feb 26, 2015 9:40 am
by brdr
Version: 2.1.5
Re: check_ifoperstatnag - No info being retrieved
Posted: Thu Feb 26, 2015 1:32 pm
by lmiltchev
Can you show us the actual command that you are running from the command line, along with the output of it? (Hide sensitive info)
Example:
/usr/local/nagios/libexec/check_ifoperstatnag 13 -v3 -u <username> -A <auth protocol passphrase> -x DES -X <privacy protocol passphrase> -a MD5 -l authPriv <ip address>
OK - Interface Adaptive (index 13) is up.
Re: check_ifoperstatnag - No info being retrieved
Posted: Thu Feb 26, 2015 2:39 pm
by brdr
/usr/local/nagios/libexec/check_ifoperstatnag 16887808 -v3 -u xxxx -A xxxx -a MD5 -l authNoPriv x.x.x.x
OK - Interface fc1/28 (index 16887808) is up.
Please lemme know if you need more info.
Re: check_ifoperstatnag - No info being retrieved
Posted: Thu Feb 26, 2015 4:09 pm
by lmiltchev
The command seems correct. As this is an intermittent issue, it is possible that your device fails to respond in a timely fashion. I would recommend modifying your check command (in the CCM) for the problem interfaces by adding a "-t" flag (timeout value). You can start with "-t 15" for example, and increase the value if needed. Let us know if this fixed your issue.
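As a sketch of what the adjusted check might look like, the snippet below rebuilds the command posted earlier in this thread with "-t 15" added. `build_check_cmd` is a hypothetical helper used only to print the command line, the credentials and address are the same placeholders used above, and the position of "-t" among the options is an assumption; check the plugin's usage output before copying it into the CCM.

```shell
PLUGIN=/usr/local/nagios/libexec/check_ifoperstatnag
TIMEOUT=15   # starting value suggested above; raise it if the UNKNOWNs persist

# Hypothetical helper: print the check command for one interface.
#   $1 = interface index, $2 = host address
# xxxx stands in for the SNMPv3 user and passphrase, as in the thread.
build_check_cmd() {
    echo "$PLUGIN $1 -v3 -u xxxx -A xxxx -a MD5 -l authNoPriv -t $TIMEOUT $2"
}

build_check_cmd 16887808 x.x.x.x
```

Printing the command first makes it easy to test from the shell before changing the service definition.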
Re: check_ifoperstatnag - No info being retrieved
Posted: Thu Feb 26, 2015 4:35 pm
by brdr
Sure. I will pick a couple offenders and set timeout and see if this fixes, and circle back with ya'. Thx.
Re: check_ifoperstatnag - No info being retrieved
Posted: Thu Feb 26, 2015 5:03 pm
by cmerchant
Let us know how that works out. Thanks.