How do I improve alert latency?
Posted: Fri Apr 26, 2024 1:08 pm
I need some help with alerts on Nagios Core 4.4.14. My goal is to get the lowest possible alert latency (because VMs take so little time to recover after a reboot) with minimal repetition of failure alerts. To simulate an outage I turn off networking on the Linux client, and with the settings below it takes Nagios Core more than 9 minutes to alert that the system is down.
Here is what I see in the log (second code block below):
When I turn networking back on, the recovery notification is very quick, less than a minute.
Based on my settings this may be a normal response, but what settings can I use to get closer to my goal?
Code:
max_check_attempts 2
check_interval 1
retry_interval 1
notification_interval 0
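For reference, the delay these settings should produce can be worked out by hand. The sketch below is only an illustration: it assumes `interval_length` is left at the Nagios default of 60 seconds, and the helper name `seconds_to_hard_state` is hypothetical, not part of Nagios. It estimates the time between the first failed check and the HARD state, which is when a problem notification can go out.

```python
# Assumption: interval_length in nagios.cfg is the default of 60 seconds.
INTERVAL_LENGTH_S = 60


def seconds_to_hard_state(max_check_attempts, retry_interval,
                          interval_length=INTERVAL_LENGTH_S):
    """Estimate seconds from the first failed check to the HARD state.

    The first failure is attempt 1 (SOFT). Each further attempt is
    scheduled retry_interval time units later, and the state becomes HARD
    on attempt number max_check_attempts.
    """
    retries = max(max_check_attempts - 1, 0)
    return retries * retry_interval * interval_length


# The settings posted above: max_check_attempts 2, retry_interval 1.
print(seconds_to_hard_state(2, 1))

# Hypothetical comparison: the same retry_interval with 10 attempts.
print(seconds_to_hard_state(10, 1))
```

With the posted values this gives about one minute, so a delay of more than 9 minutes would suggest that some other object is using a larger `max_check_attempts` than the one shown here.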
Code:
[1714153603] SERVICE ALERT: watto;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out
[1714153606] HOST ALERT: watto;DOWN;SOFT;1;CRITICAL - Host Unreachable (172.16.11.111)
[1714153667] HOST ALERT: watto;DOWN;SOFT;2;CRITICAL - Host Unreachable (172.16.11.111)
[1714153667] SERVICE ALERT: watto;PING;CRITICAL;HARD;2;CRITICAL - Host Unreachable (172.16.11.111)
[1714153729] HOST ALERT: watto;DOWN;SOFT;3;CRITICAL - Host Unreachable (172.16.11.111)
[1714153789] HOST ALERT: watto;DOWN;SOFT;4;CRITICAL - Host Unreachable (172.16.11.111)
[1714153849] HOST ALERT: watto;DOWN;SOFT;5;CRITICAL - Host Unreachable (172.16.11.111)
[1714153909] HOST ALERT: watto;DOWN;SOFT;6;CRITICAL - Host Unreachable (172.16.11.111)
[1714153969] HOST ALERT: watto;DOWN;SOFT;7;CRITICAL - Host Unreachable (172.16.11.111)
[1714154029] HOST ALERT: watto;DOWN;SOFT;8;CRITICAL - Host Unreachable (172.16.11.111)
[1714154089] HOST ALERT: watto;DOWN;SOFT;9;CRITICAL - Host Unreachable (172.16.11.111)
[1714154149] HOST NOTIFICATION: nagiosadmin;watto;DOWN;notify-host-by-email;CRITICAL - Host Unreachable (172.16.11.111)
[1714154149] HOST NOTIFICATION: slack;watto;DOWN;notify-host-by-slack;CRITICAL - Host Unreachable (172.16.11.111)
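To put a number on the delay, the timestamps in the log can be subtracted directly. The sketch below is a minimal illustration using four lines copied from the log above; the function names `parse_events` and `host_alert_latency` are hypothetical helpers, not Nagios tools. It reports the seconds between the first host DOWN alert and the first host notification, plus the highest SOFT attempt number the host reached.

```python
import re

# Four lines copied from the log in this post.
SAMPLE_LOG = """\
[1714153606] HOST ALERT: watto;DOWN;SOFT;1;CRITICAL - Host Unreachable (172.16.11.111)
[1714153667] SERVICE ALERT: watto;PING;CRITICAL;HARD;2;CRITICAL - Host Unreachable (172.16.11.111)
[1714154089] HOST ALERT: watto;DOWN;SOFT;9;CRITICAL - Host Unreachable (172.16.11.111)
[1714154149] HOST NOTIFICATION: nagiosadmin;watto;DOWN;notify-host-by-email;CRITICAL - Host Unreachable (172.16.11.111)
"""

# "[epoch] EVENT TYPE: semicolon;separated;fields"
LINE_RE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")


def parse_events(log_text):
    """Return a list of (timestamp, event_type, fields) tuples."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            fields = match.group(3).split(";")
            events.append((int(match.group(1)), match.group(2), fields))
    return events


def host_alert_latency(events):
    """Seconds from first host DOWN alert to first host notification,
    and the highest SOFT attempt number logged for the host."""
    first_down = next(ts for ts, kind, f in events
                      if kind == "HOST ALERT" and f[1] == "DOWN")
    notified = next(ts for ts, kind, _ in events
                    if kind == "HOST NOTIFICATION")
    max_soft = max(int(f[3]) for _, kind, f in events
                   if kind == "HOST ALERT" and f[2] == "SOFT")
    return notified - first_down, max_soft


latency_s, max_soft = host_alert_latency(parse_events(SAMPLE_LOG))
print(latency_s, "seconds, host reached SOFT attempt", max_soft)
```

In this log the PING service goes HARD on attempt 2, which matches the posted `max_check_attempts 2`, while the host keeps retrying through SOFT attempt 9 before the notification. One possible reading, which is an assumption and not confirmed by anything in the post, is that the host definition inherits a larger `max_check_attempts` from a template.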