RTA NAN Lost 100% & Socket Timeout Error
Posted: Tue Jul 22, 2014 1:31 am
We are facing a problem with our Nagios implementation, which is almost a year old. To fix it, we upgraded the system to Nagios XI 2012R2.9, but the problem persists.
Problem Description: We have a total of around 256 hosts configured for monitoring. Some of them are agent-based (Windows & Linux) and some are monitored using SNMP. We are facing the problem intermittently with the Windows and Linux hosts: we get the first error below, which in turn produces a second error in the Nagios monitoring console. Both errors are listed here for your reference.
The first error is "Critical - [IP Address] RTA NAN Lost 100%". It usually clears after some time, but sometimes it takes a long while. While Nagios shows this state, the machine is fully reachable over the network, including from the Nagios host's own console.
The second error, which we receive regularly, is "CRITICAL Socket Timeout after 10 Seconds".
Note: I would like to highlight again that the errors are intermittent, and that the machines are reachable over the network and have no problems of their own. The screenshots above are just some examples; we are facing the same problem with almost all the servers.