
Switches Flapping on Another Network

Posted: Wed Apr 26, 2017 8:16 pm
by dkirk
Hi,

I've compiled Nagios Core v4.3.1 on a CentOS 7 virtual machine. It is running nicely, but I have a bunch of Cisco 2960 switches on a different subnet that are constantly reported as flapping. There is a firewall and a router between Nagios and those switches. Switches on the local network don't have this issue.

They are all just being checked with the check_ping command.

This is what I see in nagios.log.

Code:

[1493253969] HOST ALERT: sw57;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%

I've run tcpdump on the Nagios server, and when I see a switch go down in nagios.log I check what tcpdump captured at that moment. Here is an example.

Code:

12:09:48.358761 IP 192.168.0.21 > 10.0.0.57: ICMP echo request, id 8323, seq 1, length 64
12:09:48.359402 IP 192.168.0.254 > 192.168.0.21: ICMP redirect 10.0.0.57 to host 192.168.3.3, length 36
12:09:48.359532 IP 10.0.0.57 > 192.168.0.21: ICMP echo reply, id 8323, seq 1, length 64

When the switch is re-checked, this is the result.

Code:

12:10:48.364797 IP 192.168.0.21 > 10.0.0.57: ICMP echo request, id 8436, seq 1, length 64
12:10:48.365588 IP 10.0.0.57 > 192.168.0.21: ICMP echo reply, id 8436, seq 1, length 64
12:10:49.366350 IP 192.168.0.21 > 10.0.0.57: ICMP echo request, id 8436, seq 2, length 64
12:10:49.367092 IP 10.0.0.57 > 192.168.0.21: ICMP echo reply, id 8436, seq 2, length 64
12:10:50.367157 IP 192.168.0.21 > 10.0.0.57: ICMP echo request, id 8436, seq 3, length 64
12:10:50.367850 IP 10.0.0.57 > 192.168.0.21: ICMP echo reply, id 8436, seq 3, length 64
12:10:51.368594 IP 192.168.0.21 > 10.0.0.57: ICMP echo request, id 8436, seq 4, length 64
12:10:51.369307 IP 10.0.0.57 > 192.168.0.21: ICMP echo reply, id 8436, seq 4, length 64
12:10:52.369112 IP 192.168.0.21 > 10.0.0.57: ICMP echo request, id 8436, seq 5, length 64
12:10:52.369845 IP 10.0.0.57 > 192.168.0.21: ICMP echo reply, id 8436, seq 5, length 64

Now, I'm aware that this probably isn't an issue with Nagios itself, but has anyone else had this problem before and managed to find a way around it?
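
For anyone comparing captures like the ones above against what nagios.log reports, here is a rough Python sketch that pairs ICMP echo requests with replies by id and seq and works out the loss seen on the wire. The function name `echo_loss` and the regular expression are illustrative assumptions, written only against the tcpdump line format shown in this post; other tcpdump versions or options may print lines differently.

```python
import re

# Matches tcpdump lines in the format shown in this thread, e.g.
# 12:09:48.358761 IP 192.168.0.21 > 10.0.0.57: ICMP echo request, id 8323, seq 1, length 64
ECHO_RE = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+) > (?P<dst>[^:]+): "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def echo_loss(lines):
    """Return (sent, answered, loss_percent) for the echo requests in lines.

    A request counts as answered when a reply with the same (id, seq)
    pair appears anywhere in the capture.
    """
    requests = set()
    replies = set()
    for line in lines:
        match = ECHO_RE.match(line.strip())
        if match is None:
            continue  # other ICMP types, such as redirects, are ignored
        key = (int(match["id"]), int(match["seq"]))
        if match["kind"] == "request":
            requests.add(key)
        else:
            replies.add(key)
    sent = len(requests)
    answered = len(requests & replies)
    loss = 100.0 * (sent - answered) / sent if sent else 0.0
    return sent, answered, loss


# The first capture from this post (the check that Nagios logged as DOWN).
first_capture = [
    "12:09:48.358761 IP 192.168.0.21 > 10.0.0.57: ICMP echo request, id 8323, seq 1, length 64",
    "12:09:48.359402 IP 192.168.0.254 > 192.168.0.21: ICMP redirect 10.0.0.57 to host 192.168.3.3, length 36",
    "12:09:48.359532 IP 10.0.0.57 > 192.168.0.21: ICMP echo reply, id 8323, seq 1, length 64",
]

print(echo_loss(first_capture))
```

Run on the first capture, this counts one request and one matching reply, so the reply did reach the Nagios server's interface even though the check was logged as 100% packet loss. It does not explain why the plugin reported the loss; it only confirms what was on the wire.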


Thanks

David

Re: Switches Flapping on Another Network

Posted: Thu Apr 27, 2017 11:43 am
by tgriep
I did a little searching about ICMP redirects, and this could be a routing issue on those devices. You may want to check them and see whether the routing settings are valid for your equipment.
https://en.wikipedia.org/wiki/Internet_ ... l#Redirect

Other than that, you can edit the check commands and increase the timeout by adding the -t <Seconds> option to the command, or increase the threshold levels.
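
As a rough illustration of that suggestion, the Python sketch below assembles a check_ping command line with an explicit -t timeout and looser thresholds so it can be tried by hand before the command definition is edited. The plugin directory, the helper name `ping_check_argv`, and every numeric value here are assumptions chosen for illustration, not settings taken from this thread; adjust them to the actual install.

```python
# Assumed plugin location; a source install may use a different path.
PLUGIN_DIR = "/usr/local/nagios/libexec"


def ping_check_argv(host, warn="3000.0,80%", crit="5000.0,100%",
                    packets=5, timeout=30):
    """Build the argument list for a check_ping run.

    warn and crit use check_ping's "<rta in ms>,<packet loss>%" form,
    packets maps to -p and timeout (in seconds) maps to -t.
    All default values are illustrative assumptions.
    """
    return [
        f"{PLUGIN_DIR}/check_ping",
        "-H", host,
        "-w", warn,
        "-c", crit,
        "-p", str(packets),
        "-t", str(timeout),
    ]


# Print a command line that could be pasted into a shell on the Nagios server.
argv = ping_check_argv("10.0.0.57", timeout=30)
print(" ".join(argv))
```

Running the printed command manually a few times against one of the remote switches is a cheap way to see whether a longer timeout changes the result before touching the Nagios configuration.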

Re: Switches Flapping on Another Network

Posted: Thu Apr 27, 2017 7:58 pm
by dkirk
Hi,

Thanks for your reply.

I think I may have resolved it. I changed from check_ping to check_icmp and I haven't had any flapping since. I will continue to monitor it.


Thanks

David

Re: Switches Flapping on Another Network

Posted: Fri Apr 28, 2017 9:15 am
by cdienger
Has there been any more flapping? If not, do you think it'd be okay to consider this resolved or would you like to monitor it longer?

Re: Switches Flapping on Another Network

Posted: Sun Apr 30, 2017 5:40 pm
by dkirk
Hi,

There have been no further flapping notifications. I think we can call this resolved.


Thanks

David