I am getting "General Time Out" and "Nagios Time Out" errors for multiple nodes.
Please help me understand the root cause.
[nagios@nagiosxi ~]$ grep -i 10.10.172.40 /etc/hosts
10.10.172.40 HCSMS1
[nagios@nagiosxi ~]$ ssh HCSMS1
Last login: Thu Jun 14 01:53:37 2018 from 10.10.164.52
[nagios@HCSMS1 ~]$ logout
Connection to HCSMS1 closed.
We are receiving traps from HCSMS1.
General Time Out (alarm signal)
ericssonvietnam
Re: General Time Out (alarm signal)
The errors you are describing are usually caused by intermittent network connection losses between the Nagios server and the remote server being monitored.
The remote server may be up the whole time, but if the Nagios server cannot connect to it, the checks will return the timeout message or a 255 message.
Can you describe how the network is set up between the sites?
Let's try this command to ping one of the remote servers continuously and write the output to a file.
Replace hostname with one of the servers.
ping hostname | perl -nle 'print scalar(localtime), " ", $_' | tee -a ping.txt
Then, if you get the 255 error or a timeout error, stop the ping and check the ping.txt file to see if there was a connection issue between the servers.
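If ping.txt grows large, a small helper along these lines might make the check easier. This is only a sketch: find_ping_gaps is a made-up name, and it assumes Linux-style ping output where every reply line contains icmp_seq=N and a lost reply prints nothing, so a jump in the sequence number marks where packets went missing. It does not handle the sequence counter wrapping around on very long runs.

```shell
# Hypothetical helper: report jumps in icmp_seq found in a timestamped ping log.
# Assumes each reply line contains "icmp_seq=N" (Linux iputils ping format).
find_ping_gaps() {
    awk '
        match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters long; the rest of the match is the number
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if (prev != "" && seq > prev + 1)
                printf "lost %d reply(ies) before: %s\n", seq - prev - 1, $0
            prev = seq
        }
    ' "$1"
}
```

Running find_ping_gaps ping.txt prints one line per gap, and the timestamp at the start of the quoted reply shows when the connection came back. Those times can then be compared with the times of the timeout alerts in Nagios.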