NetApp - SNMP monitoring issue

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
NMFSTeam
Posts: 88
Joined: Thu Nov 12, 2015 9:01 am

Post by NMFSTeam »

Not sure if there is anything that can be done about this, but over the last few days, we started receiving alerts from Nagios regarding our NetApp, specifically, the shelf status. We are using a default SNMP check, and all the other checks are working fine (cpu, autosupport status, disks, shelf info, etc.). The only one that is alerting us is "shelf status." We are receiving the error: (No output on stdout) stderr:

Normally, after ten minutes, it reverts back to OK. Then, later in the day, we will receive another alert, only for it to clear itself after ten minutes.

We have checked the NetApp filer, and there do not appear to be any issues with it. Could it just be network issues? Thanks.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: NetApp - SNMP monitoring issue

Post by ssax »

It depends on which plugin you are using. Please SSH into the XI server and run the check command from the CLI as the nagios user:

Code:

su - nagios
/usr/local/nagios/libexec/YOURFULLCHECKCOMMAND -with -arguments
Then send us the entire output.
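
Since the alert text was "(No output on stdout) stderr:", it can help to capture stdout, stderr, and the exit code separately when running the check by hand. The following is only a minimal sketch; `capture_check` is a hypothetical helper name, not something shipped with Nagios. By plugin convention, exit codes 0, 1, 2, and 3 mean OK, WARNING, CRITICAL, and UNKNOWN.

```shell
#!/usr/bin/env bash
# capture_check COMMAND [ARGS...]
# Hypothetical helper: runs a check and reports its exit code,
# stdout, and stderr separately so nothing gets lost.
capture_check() {
    local out_file err_file status
    out_file=$(mktemp)
    err_file=$(mktemp)

    # Run the command exactly as given, keeping the two streams apart.
    "$@" >"$out_file" 2>"$err_file"
    status=$?

    echo "exit code: $status"
    echo "--- stdout ---"
    cat "$out_file"
    echo "--- stderr ---"
    cat "$err_file"

    rm -f "$out_file" "$err_file"
    return "$status"
}

# Example (run as the nagios user; substitute the real command and arguments):
# capture_check /usr/local/nagios/libexec/YOURFULLCHECKCOMMAND -with -arguments
```

An empty stdout section together with a non-zero exit code would match what Nagios reported in the alert.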

If you need help deciphering what that check command is, please PM me a copy of your profile. You can download it from Admin > System Profile > Download Profile.

If you're unable to generate the profile through the web interface, please try generating it from the command line by running these commands as root:

Code:

rm -rf /usr/local/nagiosxi/var/components/profile*
/usr/local/nagiosxi/html/includes/components/profile/getprofile.sh SUPPORT
Then send me the resulting /usr/local/nagiosxi/var/components/profile.zip file.

If the profile script fails, please include the ENTIRE output.
NMFSTeam
Posts: 88
Joined: Thu Nov 12, 2015 9:01 am

Re: NetApp - SNMP monitoring issue

Post by NMFSTeam »

I am unable to determine the check that is being used. I have sent a PM with the profile. Thank you.
NMFSTeam
Posts: 88
Joined: Thu Nov 12, 2015 9:01 am

Re: NetApp - SNMP monitoring issue

Post by NMFSTeam »

Using the Core Config Manager, I was able to determine the check being used. Of course, now it's working fine.

Code:

[nagios@nagios01 libexec]$ ./check-netapp-ng.pl -H 192.168.0.25 -C snmpstring -T SHELF
VoltOverFail->None VoltUnderFail->None TempUnderFail->None PsFail->None TempOver->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None FanFail->None TempUnderWarn->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
OK: SHELF ok | shelf=0
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: NetApp - SNMP monitoring issue

Post by ssax »

Please attach this file:

Code:

/usr/local/nagios/libexec/check-netapp-ng.pl
More than likely you need to set a timeout on it (or increase some other timeout along the path). I'll investigate that while you send me the file so that I can look at that specific version.

Generally, when load gets high on a system, SNMP data is the first thing to get dropped (or given lower priority). Adjusting your max_check_attempts to account for these situations can help, but usually it just takes increasing a timeout somewhere, since SNMP can take a while to respond when load is high.
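
To illustrate the idea behind max_check_attempts, here is a simplified sketch: a failure only counts as a real problem after several attempts in a row have failed. `check_with_retries` is a hypothetical helper, and this is not how Nagios Core implements it internally (Nagios schedules retries at its own retry interval instead of looping).

```shell
#!/usr/bin/env bash
# check_with_retries MAX_ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper: retries a check and only reports a hard
# failure once every attempt has failed.
check_with_retries() {
    local max_attempts=$1
    shift
    local attempt

    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        if "$@" >/dev/null 2>&1; then
            echo "OK on attempt $attempt"
            return 0
        fi
        # Earlier failures are treated as soft: no alert yet.
        if ((attempt < max_attempts)); then
            echo "attempt $attempt failed (soft), retrying"
        fi
    done

    echo "all $max_attempts attempts failed (hard), notify"
    return 2
}

# Example (substitute the real check command):
# check_with_retries 3 /usr/local/nagios/libexec/check-netapp-ng.pl -H 192.168.0.25 -C snmpstring -T SHELF
```

With a higher attempt count, a single slow SNMP response is absorbed as a soft failure instead of triggering a notification.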


Additionally, what is the output of this command?

Code:

time /usr/local/nagios/libexec/check-netapp-ng.pl -H 192.168.0.25 -C snmpstring -T SHELF
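
Because the problem is intermittent, a single timing may land on a good run. A rough way to catch the slow ones is to time the check repeatedly and log each result. This is only a sketch; `time_check` is a hypothetical helper, and the run count and pause are arbitrary values to adjust.

```shell
#!/usr/bin/env bash
# time_check RUNS PAUSE_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs a check several times and prints one line
# per run with a timestamp, the exit code, and elapsed whole seconds.
time_check() {
    local runs=$1 pause=$2
    shift 2
    local i start end status

    for ((i = 1; i <= runs; i++)); do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        status=$?
        end=$(date +%s)
        echo "$(date '+%F %T') run=$i exit=$status seconds=$((end - start))"
        # Pause between runs, but not after the last one.
        if ((i < runs)); then
            sleep "$pause"
        fi
    done
}

# Example: 20 runs, 30 seconds apart, appended to a log file.
# time_check 20 30 /usr/local/nagios/libexec/check-netapp-ng.pl -H 192.168.0.25 -C snmpstring -T SHELF >> /tmp/shelf-timing.log
```

Runs whose elapsed time sits near the timeout limit, or that exit non-zero, are the ones worth comparing against the alert times.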
NMFSTeam
Posts: 88
Joined: Thu Nov 12, 2015 9:01 am

Re: NetApp - SNMP monitoring issue

Post by NMFSTeam »

Files are being sent via PM now. I actually found two files, perhaps the other one would work better?

Here is the output of the command:

Code:

[root@nagios01 nagios]# time /usr/local/nagios/libexec/check-netapp-ng.pl -H 192.168.0.25 -C snmpstring -T SHELF
VoltOverFail->None VoltUnderFail->None TempUnderFail->None PsFail->None TempOver->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None FanFail->None TempUnderWarn->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
OK: SHELF ok | shelf=0

real    0m8.020s
user    0m0.180s
sys     0m0.017s
NMFSTeam
Posts: 88
Joined: Thu Nov 12, 2015 9:01 am

Re: NetApp - SNMP monitoring issue

Post by NMFSTeam »

The issue is happening NOW, so I went ahead and ran the commands again...

Code:

[root@nagios01 nagios]# /usr/local/nagios/libexec/check-netapp-ng.pl -H 192.168.0.25 -C snmpstring -T SHELF
VoltOverFail->None VoltUnderFail->None TempUnderFail->None PsFail->None TempOver->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None FanFail->None TempUnderWarn->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
Alarm clock
[root@nagios01 nagios]# time /usr/local/nagios/libexec/check-netapp-ng.pl -H 192.168.0.25 -C snmpstring -T SHELF
VoltOverFail->None VoltUnderFail->None TempUnderFail->None PsFail->None TempOver->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None FanFail->None TempUnderWarn->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
VoltOverFail->None VoltUnderFail->None TempUnderFail->None TempOver->None PsFail->None ElectFail->None VoltUnderWarn->None VoltOverWarn->None TempUnderWarn->None FanFail->None TempOverFail->None
Alarm clock

real    0m15.066s
user    0m0.159s
sys     0m0.011s
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: NetApp - SNMP monitoring issue

Post by ssax »

Please add this to the top of the script (after the 1st line):

Code:

my $TIMEOUT = 60;
Then see if that resolves the issue.
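
For context on why a longer timeout is a plausible fix: "Alarm clock" is the message a shell prints when a process is terminated by SIGALRM, and the failing run earlier in this thread stopped at about 15 seconds. That is consistent with the plugin arming a 15-second alarm that $TIMEOUT extends, but the script source is not shown here, so treat that as an assumption. The sketch below only reproduces the symptom with a stand-in command; `run_with_alarm` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# run_with_alarm SECONDS COMMAND [ARGS...]
# Hypothetical helper: starts COMMAND and sends it SIGALRM if it is
# still running after SECONDS, mimicking a plugin's internal timer.
run_with_alarm() {
    local limit=$1
    shift
    local cmd_pid timer_pid status

    "$@" &
    cmd_pid=$!

    # Timer: deliver SIGALRM once the limit has passed.
    ( sleep "$limit"; kill -ALRM "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    timer_pid=$!

    wait "$cmd_pid"
    status=$?

    # Stop the timer if the command finished first.
    kill "$timer_pid" 2>/dev/null
    wait "$timer_pid" 2>/dev/null
    return "$status"
}

# A 5-second command with a 1-second limit is killed by SIGALRM.
run_with_alarm 1 sleep 5
echo "exit status: $?"   # 142 on Linux (128 + signal 14)
```

A command that finishes within the limit returns its normal exit code, which is the situation the larger timeout is meant to restore for the shelf check.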
NMFSTeam
Posts: 88
Joined: Thu Nov 12, 2015 9:01 am

Re: NetApp - SNMP monitoring issue

Post by NMFSTeam »

I implemented the change you suggested and that seems to have fixed things. No more errors from Nagios. Thank you very much for your assistance.
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: NetApp - SNMP monitoring issue

Post by mbellerue »

Glad to hear it's working! Closing thread.