Re: Services issues
Posted: Mon Feb 23, 2015 4:56 pm
by jdalrymple
I feel like the issue is getting confused and perhaps the solution is simpler than we're making it out to be. Is the problem still simply that your script returns UNKNOWN when there is a network or SNMP timeout, and you want it to return CRITICAL instead? If so, that's as simple as changing this code block:
Code: Select all
if [ ! "$rdatestring" ] ; then
    echo "Time difference could not be calculated; no time received."
    exit 3
fi
to this:
Code: Select all
if [ ! "$rdatestring" ] ; then
    echo "Time difference could not be calculated; no time received."
    exit 2
fi
If that's not the desired goal, what is?
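For reference, here is a minimal sketch of the exit-code convention this change relies on: a Nagios plugin reports its state through its exit status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name `time_received_status` is hypothetical and is not part of check_snmp_time; it returns the status instead of exiting so it can be tried in an interactive shell, and it skips the actual time comparison the real script performs.

```shell
#!/bin/bash
# Standard Nagios plugin exit codes.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# Hypothetical helper (not in the original script).
# $1 is the remote date string (rdatestring in check_snmp_time).
# An empty string means no time was received, which is reported as
# CRITICAL rather than UNKNOWN. A real plugin would `exit` with this
# value; here we `return` it so the calling shell stays open.
time_received_status() {
    if [ -z "$1" ]; then
        echo "Time difference could not be calculated; no time received."
        return "$STATE_CRITICAL"
    fi
    # Simplification: the real check would go on to compare the times.
    return "$STATE_OK"
}
```

Calling `time_received_status ""; echo $?` prints the message followed by 2.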
Re: Services issues
Posted: Mon Feb 23, 2015 6:21 pm
by imran_khan
Hello,
We have observed that the services of all the servers go into the “UNKNOWN” section instead of “CRITICAL” when a server goes down.
They should show in the “CRITICAL” section, not in “UNKNOWN”. I have made the changes in Nagios for all the services, and all of them return value 2 except the Time Skew service.
Thanks,
Imran Khan.
Re: Services issues
Posted: Tue Feb 24, 2015 9:48 am
by jdalrymple
When you say time skew service, are you referring specifically to the check_snmp_time check you attached earlier? Did you make the change I suggested? In case you have not, I have attached a modified version (using the adjustment I mentioned earlier) that does return CRITICAL when it cannot reach the SNMP daemon.
Code: Select all
[jdalrymple@localhost ~]$ ./check_snmp_time -H localhost -C public; echo $?
Timeout: No Response from localhost.
Time difference could not be calculated; no time received.
2
[jdalrymple@localhost ~]$
Re: Services issues
Posted: Tue Feb 24, 2015 11:08 am
by imran_khan
Hello,
As per your suggestion, I have changed the time script /usr/local/nagios/libexec/check_snmp_time.
Changed
if [ ! "$rdatestring" ] ; then
    echo "Time difference could not be calculated; no time received."
    exit 3
fi
to
if [ ! "$rdatestring" ] ; then
    echo "Time difference could not be calculated; no time received."
    exit 2
fi
When I run the check_snmp_time script against localhost, it returns value 2, which means it is working fine after the change. However, it still returns value 3 when run through the Negate script for remote servers.
[root@NAGIOS libexec]# ./check_snmp_time -H localhost -C Community; echo $?
Timeout: No Response from localhost.
Time difference could not be calculated; no time received.
2
[root@NAGIOS ~]# /usr/local/nagios/libexec/negate -u CRITICAL /usr/local/nagios/libexec/check_snmp_time -H Server IP -C Community; echo $?
No data returned from command
3
[root@NAGIOS ~]# /usr/local/nagios/libexec/negate -u CRITICAL /usr/local/nagios/libexec/check_snmp_time -H Server IP -C Community; echo $?
No data returned from command
3
Thanks,
Imran Khan.
Re: Services issues
Posted: Tue Feb 24, 2015 11:51 am
by jdalrymple
The idea is that you should no longer need to use negate to get the results you want. What is it you're trying to achieve with negate at this point?
Simply use the modified check in place without negate. Does that provide the results you want?
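If it helps to confirm the state a check reports when run directly, here is a small sketch that runs any plugin command and prints the state name for its exit code. The helper names `state_name` and `run_check` are hypothetical and are not part of Nagios or its plugins.

```shell
#!/bin/bash
# Hypothetical helper: translate a plugin exit code into its state name.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "exit code $1 is outside the 0-3 range" ;;
    esac
}

# Hypothetical helper: run the given command (the plugin and its
# arguments), let its output through, then print the state name
# that corresponds to its exit code.
run_check() {
    local rc=0
    "$@" || rc=$?
    state_name "$rc"
}
```

For example, `run_check /usr/local/nagios/libexec/check_snmp_time -H localhost -C public` prints the plugin's own output and then the state name on the last line.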
Re: Services issues
Posted: Tue Feb 24, 2015 1:24 pm
by imran_khan
Hello,
Got it. There is no need to use Negate for the time script.
Thanks for your support.
Thanks,
Imran Khan.