Can't ping monitored servers
Posted: Thu Nov 08, 2018 4:31 am
Hi all,
This isn't exactly a problem with Nagios itself, but I'm wondering if anyone else has come across a similar issue while using it. Nagios runs on a CentOS 7 virtual machine hosted on an ESXi host. The VM has an internal IP address and monitors a bunch of servers. Nearly every day, sometimes several times a day, we start getting email alerts that services have gone down. When we jump onto the Nagios box, we find that it can no longer ping the servers it monitors, whether by hostname or by IP address. The servers also have public IP addresses but are hosted within the same DC/network. The pings just time out, yet we can still ping external sites such as google.com or bbc.co.uk.
When we issue a network restart, everything works again until the next time it happens. We have checked the network configuration and it all looks OK. We tried a different 10.X.X.X IP address to see if that made any difference, and the same thing happens. Hell, we have even moved the VM to a different ESXi host, but that made no difference either.
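Since a network restart wipes the evidence, one option is to record the box's network state while it is broken. Below is a rough sketch of a check that could run from cron: it pings a few internal and external targets and, if the internal ones fail, appends the output of `ip addr`, `ip route` and `ip neigh` to a log file. The host lists, the log path and the function names are placeholders made up for illustration, not anything from our actual setup.

```python
#!/usr/bin/env python3
"""Log address, route and ARP state when internal pings fail."""
import datetime
import subprocess

# Hypothetical values: substitute real monitored servers, an external site and a log path.
INTERNAL_HOSTS = ["10.0.0.10", "10.0.0.11"]
EXTERNAL_HOSTS = ["google.com"]
LOG_FILE = "/var/tmp/nagios-netstate.log"

# Standard iproute2 commands: interface addresses, routing table, neighbour (ARP) table.
STATE_COMMANDS = [
    ["ip", "addr", "show"],
    ["ip", "route", "show"],
    ["ip", "neigh", "show"],
]


def ping(host, count=2, timeout=2):
    """Return True if the system ping gets at least one reply from host."""
    try:
        result = subprocess.run(
            ["ping", "-c", str(count), "-W", str(timeout), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return False  # ping binary missing or not executable
    return result.returncode == 0


def classify(internal_ok, external_ok):
    """Summarise ping results given as two lists of booleans."""
    if all(internal_ok):
        return "ok"
    if any(external_ok):
        return "internal-unreachable"  # the symptom described above
    return "all-unreachable"


def capture_state():
    """Run each state command and return the combined output as text."""
    chunks = []
    for cmd in STATE_COMMANDS:
        try:
            proc = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
            )
            output = proc.stdout
        except OSError as exc:
            output = "failed to run: {}\n".format(exc)
        chunks.append("$ {}\n{}".format(" ".join(cmd), output))
    return "\n".join(chunks)


def main():
    internal_ok = [ping(h) for h in INTERNAL_HOSTS]
    external_ok = [ping(h) for h in EXTERNAL_HOSTS]
    status = classify(internal_ok, external_ok)
    if status != "ok":
        stamp = datetime.datetime.now().isoformat()
        with open(LOG_FILE, "a") as log:
            log.write("=== {} {} ===\n{}\n".format(stamp, status, capture_state()))


if __name__ == "__main__":
    main()
```

Comparing a capture taken during a failure with one taken right after the restart should show whether the routes or the ARP entries for the 10.X.X.X servers differ between the two states.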
Anyone had anything similar or have any ideas on what could be causing it?