
/check_uptime.pl show incorrect uptime.

Posted: Tue Jul 28, 2015 9:13 am
by mkhan12282
Hello everyone

We have recently come across an issue where Nagios reports the wrong value for /check_uptime.pl. When this happens, we are led to believe that a device has gone down (restarted), but when we check the device itself (mainly Cisco) we can see that its uptime is more than a year. I can't find any relevant article that could help us resolve the issue.

You can see that when we run the command on Nagios, it returns the wrong value compared to what we see when we check directly on the end device.
Any advice on how to fix the issue?
Thank you.
MK

Nagios CLI:
[root@demo libexec]# ./check_uptime.pl -c tiger -l 1 -u 10000 172.16.16.1
OK: UP for 3 days | uptime=3;

Device:
172.16.16.1 uptime is 1 year, 19 weeks, 2 days, 19 hours, 34 minutes
Uptime for this control processor is 1 year, 19 weeks, 2 days, 19 hours, 37 minutes
System returned to ROM by reload
System restarted at 17:59:38 GMT Fri Mar 14 2014
System image file is "flash:packages.conf"
Last reload reason: Reload command



define command{
    command_name    check_snmp_uptime
    command_line    $USER1$/check_uptime.pl -c $ARG1$ -l $ARG2$ -u $ARG3$ $HOSTADDRESS$
}

define service {
    use                     active-msc,graphed-service
    notification_period     24x7
    notification_options    w,u,c
    service_description     sys-uptime
    check_command           check_snmp_uptime3!tiger!1!10000
    hostgroup               Voice-Servers-linux
    notification_interval   0
}
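
For readers following along, the "!"-separated fields after the command name in check_command are what fill $ARG1$, $ARG2$, ... in command_line, and $HOSTADDRESS$ comes from the host definition. Below is a minimal Python sketch of that substitution using the values posted above. The expand helper is hypothetical (it is not Nagios code), and the $USER1$ path is an assumed plugin directory; the real value is set in resource.cfg.

```python
def expand(command_line, check_command, macros):
    """Conceptual illustration of Nagios macro substitution (not Nagios code)."""
    # "name!a!b!c" -> command name plus positional arguments
    name, *args = check_command.split("!")
    values = dict(macros)
    for i, arg in enumerate(args, start=1):
        values[f"$ARG{i}$"] = arg
    for macro, value in values.items():
        command_line = command_line.replace(macro, value)
    return name, command_line


name, line = expand(
    "$USER1$/check_uptime.pl -c $ARG1$ -l $ARG2$ -u $ARG3$ $HOSTADDRESS$",
    "check_snmp_uptime3!tiger!1!10000",
    {
        "$USER1$": "/usr/local/nagios/libexec",  # assumed plugin directory
        "$HOSTADDRESS$": "172.16.16.1",
    },
)
print(name)
print(line)
```

Note that the service refers to check_snmp_uptime3 while the command definition shown is named check_snmp_uptime; presumably a command with the former name is defined elsewhere in the configuration.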

Re: /check_uptime.pl show incorrect uptime.

Posted: Tue Jul 28, 2015 12:15 pm
by jolson
It's possible that your Cisco device is improperly reporting its uptime. You can verify this by manually running an snmpget against the appropriate OID.

snmpget -v2c -c public -mALL 192.168.1.1 1.3.6.1.6.3.10.2.1.3
Replace -v2c with the SNMP version you're running, public with your community string, and 192.168.1.1 with the IP address of the Cisco device in question.

Re: /check_uptime.pl show incorrect uptime.

Posted: Wed Jul 29, 2015 6:23 am
by mkhan12282
Hi

OK, so it returns a similar result to what Nagios did.
[root@ukdcnetmon01 libexec]# snmpget -v2c -c tiger -mALL 172.16.16.1 1.3.6.1.2.1.1.3.0
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (39903247) 4 days, 14:50:32.47

It seems to be related to this issue/limitation:
sysUpTime is a 32-bit counter and will roll over after 496 days.
https://supportforums.cisco.com/discuss ... ime#573246

Users there suggest using 1.3.6.1.6.3.10.2.1.3.0 instead, which does not roll over, but how can we make Nagios check this OID and not the one above? Using the new OID, I get this:
SNMP-FRAMEWORK-MIB::snmpEngineTime.0 = INTEGER: 42251161 seconds
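
As a rough illustration of what a check built on this OID would have to do, here is a minimal Python sketch that parses the line above and prints a status line in the same shape as the original plugin's output. It is only a sketch under assumptions: parse_engine_time and status_line are hypothetical names, the input is the sample line from this post rather than a live snmpget call, and this is not how check_uptime.pl itself is implemented.

```python
import re


def parse_engine_time(snmpget_line):
    """Extract the seconds value from an snmpEngineTime output line."""
    match = re.search(r"INTEGER:\s*(\d+)\s*seconds", snmpget_line)
    if match is None:
        raise ValueError(f"unexpected snmpget output: {snmpget_line!r}")
    return int(match.group(1))


def status_line(seconds):
    """Format whole days of uptime like the plugin output shown earlier."""
    days = seconds // 86400
    return f"OK: UP for {days} days | uptime={days};"


# Sample line copied from this post; a real check would obtain it from snmpget.
sample = "SNMP-FRAMEWORK-MIB::snmpEngineTime.0 = INTEGER: 42251161 seconds"
seconds = parse_engine_time(sample)
print(status_line(seconds))
```

Keep in mind that snmpEngineTime counts seconds since the SNMP engine was last (re)initialized, so it can differ somewhat from the uptime the device itself displays.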

Quoting from the link above:

sysUpTime is a 32-bit counter and will roll over after 496 days.
But you can poll snmpEngineId (.1.3.6.1.6.3.10.2.1.3) which returns the uptime in seconds and should not roll over for 135 years...
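
The arithmetic behind the rollover can be sketched in a few lines of Python. This assumes sysUpTime is a 32-bit TimeTicks value counted in hundredths of a second; format_timeticks is a hypothetical helper that mimics the snmpget formatting seen above, and the 500-day uptime is an illustrative figure, not a measured one.

```python
TICKS_PER_SECOND = 100   # TimeTicks are hundredths of a second
SECONDS_PER_DAY = 86400


def format_timeticks(ticks):
    """Render TimeTicks in the style snmpget printed in this thread."""
    seconds, hundredths = divmod(ticks, TICKS_PER_SECOND)
    days, rest = divmod(seconds, SECONDS_PER_DAY)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{days} days, {hours}:{minutes:02d}:{secs:02d}.{hundredths:02d}"


# Days until a 32-bit tick counter wraps back to zero.
wrap_days = 2**32 / TICKS_PER_SECOND / SECONDS_PER_DAY
print(round(wrap_days, 1))

# The raw value from the snmpget output above.
print(format_timeticks(39903247))

# After a wrap, only the uptime modulo 2**32 ticks is visible to the poller.
true_uptime_ticks = 500 * SECONDS_PER_DAY * TICKS_PER_SECOND  # hypothetical 500 days
wrapped = format_timeticks(true_uptime_ticks % 2**32)
print(wrapped)

# Years until a value capped at 2**31 - 1 seconds would run out.
engine_limit_years = (2**31 - 1) / (365.25 * SECONDS_PER_DAY)
print(round(engine_limit_years, 1))
```

The last figure is included because, as far as I can tell from RFC 3411, the object at this OID (reported as snmpEngineTime in the output above) is bounded at 2147483647 seconds, which would put its limit nearer 68 years than 135. Either way it is far beyond the sysUpTime wrap.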

Thanks.

Re: /check_uptime.pl show incorrect uptime.

Posted: Wed Jul 29, 2015 10:40 am
by jolson
mkhan12282,

Try using the attached version of the script (updated for the new OID you suggested) and let me know the results of your testing.

Thanks!

Re: /check_uptime.pl show incorrect uptime.

Posted: Fri Jul 31, 2015 9:42 am
by mkhan12282
Hi

Thanks for the file.
Please see below. It's fetching the real uptime, but the "Duration" column still doesn't show the real value. Can we update this too?



Regards
MK

Re: /check_uptime.pl show incorrect uptime.

Posted: Fri Jul 31, 2015 10:53 am
by jolson
Duration is how long it's been since the service state last changed.
Since you installed the new plugin, has the service status of that check changed? It's likely that the duration is displaying properly. Duration is not tied to the uptime check specifically, and since you recently replaced your check_uptime plugin, I assume that your duration was reset by state changes for that service. Am I missing something here?
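
To make the distinction concrete, here is a tiny Python sketch of what Duration measures conceptually. The timestamps are hypothetical and duration_since is an illustrative helper, not anything from Nagios itself.

```python
from datetime import datetime


def duration_since(last_state_change, now):
    """Time elapsed since the service last changed state."""
    return now - last_state_change


# Hypothetical timestamps for illustration only.
last_change = datetime(2015, 7, 29, 11, 0)   # e.g. first result from the new plugin
now = datetime(2015, 7, 31, 10, 53)
duration = duration_since(last_change, now)
print(duration)
```

The device's uptime never enters this calculation, which is why the two numbers can differ widely.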

Re: /check_uptime.pl show incorrect uptime.

Posted: Tue Aug 04, 2015 7:28 am
by mkhan12282
You are right.

I believe it has fixed the issue, so thank you.
I will go ahead and replace the plugin file with yours on the rest of our Nagios servers.

Can you please leave this ticket open until the end of this week and close it if you don't hear from me?

Regards
MK

Re: /check_uptime.pl show incorrect uptime.

Posted: Tue Aug 04, 2015 9:07 am
by tmcdonald
mkhan12282 wrote: Can you please leave this ticket open until the end of this week and close it if you don't hear from me?
Sure can!

Re: /check_uptime.pl show incorrect uptime.

Posted: Wed Oct 28, 2015 12:28 pm
by mkhan12282
Hello team

We noticed that after updating check_uptime.pl (with the version provided previously), Nagios shows the uptime of the Nagios host itself. What I mean is that if we look up a device in Nagios, the system uptime shown is that of the Nagios host and not of the device we searched for.

However, if we query the old OID with snmpget, we get the correct uptime.

Examples:

This shows the actual device uptime.
[root@PR-VMNAGIOS01 libexec]# snmpget -v1 -c password -mALL 10.x.x.x 1.3.6.1.2.1.1.3.0
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (77175710) 8 days, 22:22:37.10

This shows the Nagios host's uptime instead of the actual device's.
[root@PR-VMNAGIOS01 libexec]# ./check_uptime.pl -c password -l 1 -u 10000 10.x.x.x
Unknown option: u
OK: Linux PR-VMNAGIOS01.xxxxx.internal 2.6.32-431.5.1.el6.x86_64 - up 2 hours 5 minutes
Can you please assist us?
Thank you.
MK

Re: /check_uptime.pl show incorrect uptime.

Posted: Wed Oct 28, 2015 5:17 pm
by tmcdonald
Try running that plugin manually as the nagios user with the -v flag for verbose output, and post the results. Also, what is the -u option doing here? I did not see it in the plugin help output.