Re: Creating an alert for host stuck in boot loop
Posted: Mon Mar 02, 2020 5:21 pm
Yes, that's true, but in our specific use case I don't think upping the max retries would have a detrimental effect, because we're dealing with fast boot loops. The server just keeps rebooting itself over and over, each time within a few minutes of the last reboot.
I'm thinking if we change the check interval to 15 minutes and allow 10 retry attempts, 1 minute apart, that should weed out any "normal" reboots, like the ones from patching. Or maybe 10 minute check intervals with 7 max retries, 1 minute apart.
I just don't want a notification every time a server reboots. In my testing, the 5-1-5 setup sometimes didn't flip back to OK in time to suppress the notification, even though the server had already booted up.
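
To put rough numbers on the options above, here's a quick sketch of how long a server gets to recover before a notification would go out. It rests on a few assumptions of mine: that "5-1-5" means a 5 minute check interval, a 1 minute retry interval and 5 max check attempts, that the retry counts above map to max check attempts, and that the first failed check counts as attempt 1 (Nagios-style soft/hard states). The function name and the labels are made up for illustration.

```python
def minutes_to_hard_state(max_check_attempts: int, retry_interval_min: float) -> float:
    """Minutes from the first failed check until the hard (notifying) state.

    Assumes the first failure counts as attempt 1, so only
    (max_check_attempts - 1) retries follow it, one per retry interval.
    """
    return (max_check_attempts - 1) * retry_interval_min


# (max check attempts, retry interval in minutes) for each setup discussed.
# The reading of "5-1-5" is an assumption, see the note above.
setups = {
    "5-1-5 (current)": (5, 1),
    "15 min interval, 10 attempts": (10, 1),
    "10 min interval, 7 attempts": (7, 1),
}

for label, (attempts, retry) in setups.items():
    window = minutes_to_hard_state(attempts, retry)
    print(f"{label}: about {window:g} minutes to recover before notifying")
```

If those assumptions hold, the regular check interval only affects how soon the first failure gets noticed. The recovery window itself comes from the retry count and the retry interval, so that's the part to size against how long a normal patch reboot takes.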