Intermittent alerts from Unix server
Posted: Fri Oct 19, 2018 4:14 pm
Hi Team,
We have configured our Unix monitoring using SNMP. We are in the POC phase and are seeing frequent alerts from different servers at different times. We haven't found any pattern in the alerts.
For example, we received the following alerts all at the same time:
For host :
tg-pxoct is DOWN CRITICAL - Plugin timed out while executing system call
And for each service :
tg-pxoct : Disk Usage is CRITICAL ERROR: Description/Type table : No response from remote host 10.XX.XX.XX.
The host was not down at that time, and we didn't find any issues with it either. Given the intervals set below, why did we receive alerts right away and then get clear alerts within a few minutes? Why didn't it wait to complete the polling cycle before sending us alerts? Is it because the check timed out instead of returning a failure or success? How do we avoid these issues?
Check Interval : 5
Retry Interval : 1
Max check attempts : 19
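As a rough sanity check on these settings, the sketch below estimates how long a problem should stay in a soft state before a hard-state alert goes out. It assumes the default interval_length of 60 seconds (so the intervals above are in minutes) and that the first failed check counts as attempt 1; the function name is purely illustrative and not part of Nagios.

```python
def minutes_until_hard_state(retry_interval, max_check_attempts, interval_length_s=60):
    """Estimate the delay between the first failed check and the hard state.

    Assumption: the first failure is attempt 1, and each remaining attempt
    is scheduled one retry interval after the previous one.
    """
    retries = max_check_attempts - 1
    return retries * retry_interval * interval_length_s / 60


# Settings from this post: retry interval 1, max check attempts 19.
print(minutes_until_hard_state(retry_interval=1, max_check_attempts=19))  # 18.0
```

Under these assumptions an alert would be expected only after about 18 minutes of continuous failure, which is why alerts arriving immediately look wrong to us.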
We are using the same SNMP community string for Nagios and for other monitoring tools. Will that be an issue?
Thanks,
Bhargava