Uptime Alarming
Posted: Tue Aug 08, 2017 1:54 pm
Hello all.
We're currently looking to implement uptime alarming on a number of different devices. As mentioned in numerous other posts, the problem with the sysUpTime OID is that it wraps back to zero after ~497 days (it is a 32-bit counter of hundredths of a second). We are a telco, so a lot of our networking gear isn't normally rebooted or upgraded within that short a time frame.
I've done an OID search, and a lot of our devices don't implement the snmpEngineTime (.1.3.6.1.6.3.10.2.1.3) OID, which is used as a sysUpTime alternative that doesn't wrap for many years. What do other people do to solve this? Does anyone poll the sysUpTime OID from a script with logic along the lines of "if my previous value was near the max, don't alarm"? I'm not much of a Linux guy, so I don't really have the know-how to build something like this from scratch.
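To make the idea concrete, here is a minimal Python sketch of that kind of wrap-aware check. It is an illustration under assumptions, not a tested solution: the function name `classify_uptime` and the 60-second tolerance are hypothetical, and it assumes the poller already stores the previous sysUpTime value (in TimeTicks) and knows how many seconds passed between the two polls. It predicts where the counter should be now, allowing for a wrap at 2^32, and treats any large deviation from that prediction as a reboot.

```python
TICKS_PER_SECOND = 100   # sysUpTime is reported in hundredths of a second
COUNTER_MAX = 2 ** 32    # 32-bit TimeTicks wraps back to zero at this value


def classify_uptime(prev_ticks, curr_ticks, elapsed_seconds, tolerance_seconds=60):
    """Compare two sysUpTime samples and return 'ok', 'wrapped', or 'rebooted'.

    prev_ticks / curr_ticks: raw sysUpTime values from consecutive polls.
    elapsed_seconds: wall-clock time between the two polls.
    tolerance_seconds: allowed drift/jitter (hypothetical default).
    """
    # Where the counter should be if the device simply kept running.
    expected = (prev_ticks + int(elapsed_seconds * TICKS_PER_SECOND)) % COUNTER_MAX

    # Distance between expected and observed, measured around the wrap point.
    diff = abs(curr_ticks - expected)
    diff = min(diff, COUNTER_MAX - diff)

    if diff > tolerance_seconds * TICKS_PER_SECOND:
        return "rebooted"   # counter is not where continuous running would put it
    if curr_ticks < prev_ticks:
        return "wrapped"    # value dropped, but exactly as a wrap predicts
    return "ok"


if __name__ == "__main__":
    # Normal poll, 5 minutes apart.
    print(classify_uptime(1_000_000, 1_030_005, 300))
    # Counter rolled over between polls; should not alarm.
    print(classify_uptime(4_294_960_000, 22_710, 300))
    # Device restarted two minutes before the second poll.
    print(classify_uptime(4_294_000_000, 12_000, 300))
```

One limitation of this approach: a reboot that happens to land the counter within the tolerance window of the predicted value would be missed, so the tolerance should stay well below the poll interval.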
We'd like to avoid some type of telnet/SSH script that logs into each device to check uptime, as that could get fairly intensive pretty quickly. Our previous alarming platform would buckle under that kind of load.
Thoughts? Thanks!