
Nagios is checking one service twice?

Posted: Tue Nov 25, 2014 7:12 pm
by pndmedia
Hi all, hoping someone can help.

For one of my services, Nagios seems to be checking twice every minute, even though it is set to check interval 1 and retry 1

I have a service that goes down 4 times an hour, for 4 minutes (this is meant to happen)
So I have set Nagios to check every minute, retry every minute, and notify me after 6 minutes as it should be back up by then. This has been working perfectly fine for the past 6 months.
For the last two days, Nagios has been sending me a text after 3 minutes. I watched it online and noticed it counting a lot faster than once every minute.

Nothing has changed; I have not even logged into the Nagios machine for the past month because it's all been smooth.

Any advice would be much appreciated


define service{
host_name LOGGER
use generic-service
check_period networking
normal_check_interval 1
retry_check_interval 1
max_check_attempts 6
service_description Syndication
check_command check_netPRG
}
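
For reference, the timing this definition should produce can be worked out with a little arithmetic. This is only a sketch: it assumes the default `interval_length` of 60 seconds (not shown in the post), and `seconds_until_hard` is a hypothetical helper name, not part of Nagios.

```python
def seconds_until_hard(max_check_attempts, retry_interval, interval_length=60):
    """Seconds from the first failed check (attempt 1) to the HARD state.

    Attempt 1 is the failing regular check; each further attempt is
    scheduled one retry interval after the previous one.
    """
    return (max_check_attempts - 1) * retry_interval * interval_length


# Values from the service definition above.
first_failure_to_hard = seconds_until_hard(max_check_attempts=6, retry_interval=1)

# The outage can start just after a regular check, so detection may lag
# by up to one normal_check_interval (1 x 60 seconds here, by assumption).
worst_case_detection_lag = 1 * 60

print(first_failure_to_hard)                             # seconds, first failure to HARD
print(first_failure_to_hard + worst_case_detection_lag)  # seconds, outage start to HARD
```

Under these assumptions the notification should arrive roughly 5 to 6 minutes after the outage begins, which matches the behaviour described for the previous 6 months.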

Cheers
Pete

Re: Nagios is checking one service twice?

Posted: Wed Nov 26, 2014 11:36 am
by abrist
What version of nagios are you running?

Re: Nagios is checking one service twice?

Posted: Wed Nov 26, 2014 12:02 pm
by eloyd
This is my favorite quote in all of IT-dom:
Nothing has changed...
Something has changed or else it would be behaving the way it always has. It might be that the check was never properly configured in the first place, and some error condition is triggering the check to do something unintended like notify twice. Some thoughts/questions:

One thing you don't show is how you get that text message and what it contains. Can you please include the part of commands.cfg that has your SMS send command in it, and if it is a script (as opposed to an email) can you include the script? Feel free to block out any passwords, etc. And lastly, can you include your contacts.cfg file as well. Since your service is just including generic-service, do you have any contactgroups specified on the generic service? Is it possible that your contacts got changed and you're seeing two messages for the same event because it's sending you messages via two different addresses?

Re: Nagios is checking one service twice?

Posted: Wed Nov 26, 2014 1:07 pm
by abrist
eloyd wrote:Nothing has changed...
Yep, something has, even if it was not due to a direct user action.
@OP: Could you also post the info that eloyd has asked for?

Re: Nagios is checking one service twice?

Posted: Wed Nov 26, 2014 8:16 pm
by pndmedia
Not sure if my previous post has disappeared, or if it's just waiting to be moderated because I am a new member, but just in case I shall try again.

Thanks for the reply all,

Yes you are correct, nothing has changed as a result of direct user action, I am the only one with access to this server and have not logged in for a while.

The version of Nagios is 3.4.1.

Just to clarify, I am only receiving 1 text / email; I am just receiving them a lot sooner than expected: after only 3 minutes instead of 6.

As requested, I have included the notification commands, the contacts config, and a Nagios log.

I really appreciate your time helping with this issue. I am not a Nagios admin by trade and am new to Linux, so this is a massive learning curve for me, and these forums have been so helpful with installing and setting up the basics of Nagios.
Cheers
Pete

Notifications

define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n$
}


define command{
command_name notify-service-by-sms
command_line /usr/bin/printf "%b" "* Nagios *\nType: $NOTIFICATIONTYPE$\n\nService: $SERVICE$
}



Contacts config

define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members techs
}


define contact{
contact_name techs
alias techs
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,r
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
email engineering@***.com.au
}

define contact{
contact_name oncall
alias oncall
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,r
service_notification_commands notify-service-by-sms
host_notification_commands notify-host-by-sms
pager 0430 *** ***
}



Nagios Log

[1416964685] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;SOFT;1; Fail
[1416964685] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;SOFT;2; Fail
[1416964745] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;SOFT;3; Fail
[1416964745] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;SOFT;4; Fail
[1416964805] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;SOFT;5; Fail
[1416964805] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;HARD;6; Fail
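
To make the pattern in this excerpt easier to see, here is a small sketch that groups the alert lines by timestamp and measures the time from the first SOFT alert to the HARD alert. The regular expression is an assumption fitted to the six lines above, not a general Nagios log parser, and the names `LOG`, `LINE_RE` and `parse` are made up for illustration.

```python
import re
from collections import Counter

# The six log lines exactly as posted above.
LOG = """\
[1416964685] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;SOFT;1; Fail
[1416964685] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;SOFT;2; Fail
[1416964745] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;SOFT;3; Fail
[1416964745] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;SOFT;4; Fail
[1416964805] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;SOFT;5; Fail
[1416964805] SERVICE ALERT: %HOST%;%SERVICE%;CRITICAL;HARD;6; Fail
"""

# Fields: [epoch] SERVICE ALERT: host;service;state;state_type;attempt;output
LINE_RE = re.compile(
    r"^\[(\d+)\] SERVICE ALERT: ([^;]*);([^;]*);([^;]*);(SOFT|HARD);(\d+);"
)


def parse(log):
    """Return a list of (timestamp, state_type, attempt) tuples."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            events.append((int(match.group(1)), match.group(5), int(match.group(6))))
    return events


events = parse(LOG)

# How many attempts were logged at each timestamp.
per_timestamp = Counter(ts for ts, _, _ in events)

# Seconds between the first SOFT alert and the final HARD alert.
elapsed = events[-1][0] - events[0][0]

for ts, count in sorted(per_timestamp.items()):
    print(ts, count)
print("seconds from attempt 1 to HARD:", elapsed)
```

In this excerpt every timestamp carries two attempts and the timestamps are 60 seconds apart, so the attempt counter advances twice per minute rather than once.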

Re: Nagios is checking one service twice?

Posted: Mon Dec 01, 2014 12:06 pm
by tmcdonald
Can you show us the check_netPRG command definition, and if it uses a custom plugin can you share that as well?

Re: Nagios is checking one service twice?

Posted: Mon Dec 01, 2014 8:00 pm
by pndmedia
Thanks for the reply,

it just uses snmpwalk,

/usr/bin/snmpwalk -v 1 -c public -On IP-ADDRESS .1.3.6.1.4.1.38558.2.1.1.1.1.3.6 | grep "INTEGER: 2" >>/dev/null
if [ $? -ne 0 ]; then
    echo "Audio OK"
    RETVAL=0
else
    echo "Audio Fail"
    RETVAL=2
fi
exit $RETVAL

Re: Nagios is checking one service twice?

Posted: Mon Dec 01, 2014 9:06 pm
by Box293
tmcdonald wrote:Can you show us the check_netPRG command definition

Re: Nagios is checking one service twice?

Posted: Sun Dec 14, 2014 6:38 pm
by pndmedia
Box293 wrote:
tmcdonald wrote:Can you show us the check_netPRG command definition

I have put the check_netPRG command above. It is not a custom plugin; it just uses snmpwalk to get an integer value which tells me if there is audio there or not.

This is still a problem and any help would be much appreciated.

Cheers
Pete

Re: Nagios is checking one service twice?

Posted: Sun Dec 14, 2014 6:54 pm
by Box293
Your service is defined as follows:

check_command check_netPRG
In your commands.cfg file, show us the command definition for check_netPRG