
pls, have a look at my configs

Posted: Thu Sep 24, 2015 11:42 am
by vvz
Hi!
I've been using Nagios Core 3.5.1 for more than 2 years, and everything has looked fine.
A few days ago I configured some additional passive (NSCA) checks to report the space available on the root partition of host "vnode2-concert-site".

This is the warning config for the service:
define service{
    hostgroup_name          root-partition-status-passive
    service_description     root partition amount
    check_command           run-nsca-script
    max_check_attempts      1
    check_interval          1
    retry_interval          1
    active_checks_enabled   0
    passive_checks_enabled  1
    check_period            24x7
    contacts                pagerduty-warning
    notification_interval   30
    notification_period     24x7
    notification_options    w
    notifications_enabled   1
    check_freshness         0
    freshness_threshold     60
    flap_detection_enabled  0
    is_volatile             0
}
This is contacts.cfg
define contact {
    contact_name                    pagerduty-critical
    alias                           pagerduty-critical
    host_notifications_enabled      1
    service_notifications_enabled   1
    host_notification_period        24x7
    service_notification_period     24x7
    host_notification_options       d
    service_notification_options    c
    host_notification_commands      notify-host-by-pagerduty
    service_notification_commands   notify-service-by-pagerduty
    pager                           xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
}

define contact {
    contact_name                    pagerduty-warning
    alias                           pagerduty-warning
    host_notifications_enabled      1
    service_notifications_enabled   1
    host_notification_period        24x7
    service_notification_period     24x7
    host_notification_options       u
    service_notification_options    w,u
    host_notification_commands      notify-host-by-pagerduty
    service_notification_commands   notify-service-by-pagerduty
    pager                           xxxxxxxxxxxxxxxxxxxxxxxxxxxx
}
This is the hostgroup file:
define hostgroup {
    hostgroup_name  root-partition-status-passive
    alias           machines with root partition amount passive checks
    members         probe2-condor-site, probe2-concert-site, dbnode1-concert-site, vnode3-concert-site, vnode2-concert-site
}
These are example log lines from the Nagios server:
Sep 24 12:38:12 callme-crt-vnode1 nagios: PASSIVE SERVICE CHECK: vnode2-concert-site;root partition amount;1;WARNING. root partition Size=519G Available=85G Used%=83%
Sep 24 12:39:03 callme-crt-vnode1 nsca[10047]: SERVICE CHECK -> Host Name: 'vnode2-concert-site', Service Description: 'root partition amount', Return Code: '1', Output: 'WARNING. root partition Size=519G Available=84G Used%=83%'
So, as you can see, the passive check results were sent to the server successfully.
I see the indication (yellow line) in my web interface.
But I don't see any SERVICE NOTIFICATION lines in my logs and, as a result, I get no phone (pager) notifications.
What else should I check?
Strangely enough, all my other passive checks are working just fine.
Thank you.
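
One thing that can be ruled out from the configs above is the notification option filtering. Here is a minimal Python sketch, assuming the simplified rule that a problem notification is sent only when the state's flag appears in both the service's notification_options and the contact's service_notification_options. The names STATE_FLAGS, parse_options and state_passes are hypothetical, and this is a model of the rule, not Nagios code:

```python
# Simplified model of Nagios notification option filtering.
# Assumption: only the state-flag filters are modeled; time periods,
# downtime, flapping and other checks are ignored.
STATE_FLAGS = {"OK": "r", "WARNING": "w", "UNKNOWN": "u", "CRITICAL": "c"}


def parse_options(value):
    """Turn a comma-separated option string such as 'w,u' into a set."""
    return {opt.strip() for opt in value.split(",") if opt.strip()}


def state_passes(state, service_options, contact_options):
    """Return True if the state's flag is allowed by both option lists."""
    flag = STATE_FLAGS[state]
    return (flag in parse_options(service_options)
            and flag in parse_options(contact_options))


if __name__ == "__main__":
    # Values taken from the service and contact definitions posted above.
    print(state_passes("WARNING", "w", "w,u"))  # service vs pagerduty-warning
    print(state_passes("WARNING", "w", "c"))    # service vs pagerduty-critical
```

With the posted values (service `w`, pagerduty-warning `w,u`), this model lets a WARNING through to pagerduty-warning, which suggests the option filters alone do not explain the missing notification.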

Re: pls, have a look at my configs

Posted: Thu Sep 24, 2015 5:15 pm
by Box293
For this passive service in core, if you "Send custom service notification" from the Commands list, do you receive a pager notification?

Re: pls, have a look at my configs

Posted: Thu Sep 24, 2015 8:04 pm
by vvz
I ran the command and got the following line in the logs:
Sep 24 21:59:15 callme-crt-vnode1 nagios: SERVICE NOTIFICATION: pagerduty-critical;vnode2-concert-site;root partition amount;CUSTOM (WARNING);notify-service-by-pagerduty;WARNING. root partition Size=519G Available=74G Used%=86%;nagiosadmin;WARNING. root partition Size=519G Available=74G Used%=86%
I didn't get a notification, but it looks like a wrong configuration...
Let me check... I'll let you know.
Thank you

Re: pls, have a look at my configs

Posted: Fri Sep 25, 2015 9:12 am
by Box293
Great, get back to us with what you find.

Re: pls, have a look at my configs

Posted: Fri Sep 25, 2015 12:11 pm
by vvz
No, I double-checked the configs and I'm still not able to figure out the problem.

Re: pls, have a look at my configs

Posted: Fri Sep 25, 2015 12:17 pm
by vvz
After running "Send custom service notification", I got this line in the log:
Sep 24 21:59:15 callme-crt-vnode1 nagios: SERVICE NOTIFICATION: pagerduty-critical;vnode2-concert-site;root partition amount;CUSTOM (WARNING);notify-service-by-pagerduty;WARNING. root partition Size=519G Available=74G Used%=86%;nagiosadmin;WARNING. root partition Size=519G Available=74G Used%=86%

"SERVICE NOTIFICATION: pagerduty-critical" means the notification goes to the pagerduty-critical contact, but according to the configs I provided, it should go to pagerduty-warning.
That's obviously the problem, but I do not understand why.
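
To make the field positions in that log line explicit, here is a small Python sketch that splits a SERVICE NOTIFICATION line on semicolons. The field names are labels inferred from the line quoted above (contact first, then host, service, type, command, output, and for custom notifications author and comment); the function name parse_service_notification is hypothetical:

```python
def parse_service_notification(line):
    """Split a 'SERVICE NOTIFICATION:' log line into named fields.

    Field names are inferred from the log line in this thread and are
    an assumption, not an official schema.
    """
    marker = "SERVICE NOTIFICATION: "
    _, _, payload = line.partition(marker)
    if not payload:
        raise ValueError("not a SERVICE NOTIFICATION line")
    names = ["contact", "host", "service", "type",
             "command", "output", "author", "comment"]
    # Limit the split so a ';' inside the final field does not add fields.
    parts = payload.split(";", len(names) - 1)
    return dict(zip(names, parts))


if __name__ == "__main__":
    sample = ("Sep 24 21:59:15 callme-crt-vnode1 nagios: SERVICE NOTIFICATION: "
              "pagerduty-critical;vnode2-concert-site;root partition amount;"
              "CUSTOM (WARNING);notify-service-by-pagerduty;"
              "WARNING. root partition Size=519G Available=74G Used%=86%;"
              "nagiosadmin;"
              "WARNING. root partition Size=519G Available=74G Used%=86%")
    for key, value in parse_service_notification(sample).items():
        print(f"{key}: {value}")
```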

Re: pls, have a look at my configs

Posted: Fri Sep 25, 2015 12:38 pm
by Box293
vvz wrote:SERVICE NOTIFICATION: pagerduty-critical - means notification goes to pagerduty-critical contact, but according to my configs provided, it should go to pagerduty-warning.
It's obviously the problem, but I do not understand why
The purpose of sending a custom service notification was purely to check that notifications are working. It should have sent a notification to both the warning and the critical contacts. Was there a "SERVICE NOTIFICATION: pagerduty-warning" logged as well?

Re: pls, have a look at my configs

Posted: Fri Sep 25, 2015 12:39 pm
by vvz
no, it wasn't

Re: pls, have a look at my configs

Posted: Fri Sep 25, 2015 12:45 pm
by vvz
I just re-ran the command:
Sep 25 14:43:05 callme-crt-vnode1 nagios: EXTERNAL COMMAND: SEND_CUSTOM_SVC_NOTIFICATION;vnode2-concert-site;root partition amount;0;nagiosadmin;test notification
Sep 25 14:43:05 callme-crt-vnode1 nagios: SERVICE NOTIFICATION: pagerduty-critical;vnode2-concert-site;root partition amount;CUSTOM (WARNING);notify-service-by-pagerduty;WARNING. root partition Size=519G Available=81G Used%=84%;nagiosadmin;test notification
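
For anyone who wants to repeat this test without the web interface, the EXTERNAL COMMAND line above shows the shape of the command. A minimal Python sketch that builds the same string follows. The function name custom_svc_notification is hypothetical, and the command file path in the comment is an assumption that depends on the command_file setting in nagios.cfg:

```python
import time


def custom_svc_notification(host, service, author, comment,
                            options=0, now=None):
    """Build a SEND_CUSTOM_SVC_NOTIFICATION external command line.

    Layout: [timestamp] COMMAND;host;service;options;author;comment
    """
    ts = int(time.time() if now is None else now)
    return (f"[{ts}] SEND_CUSTOM_SVC_NOTIFICATION;"
            f"{host};{service};{options};{author};{comment}\n")


if __name__ == "__main__":
    line = custom_svc_notification(
        "vnode2-concert-site", "root partition amount",
        "nagiosadmin", "test notification")
    print(line, end="")
    # To submit it, the line would be written to the Nagios command file,
    # for example (path is an assumption, check command_file in nagios.cfg):
    #   with open("/usr/local/nagios/var/rw/nagios.cmd", "w") as f:
    #       f.write(line)
```

The options value of 0 matches the logged command. In the Nagios external command documentation this field is a bitmask (1 = broadcast, 2 = forced, 4 = increment the notification number).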

Re: pls, have a look at my configs

Posted: Fri Sep 25, 2015 2:25 pm
by ssax
You've restarted the nagios service, right?

service nagios restart
Please attach your /usr/local/nagios/etc/nagios.cfg file so that we can look through it.

Do you have any errors in your notify-service-by-pagerduty script? Are the permissions on it correct (is it executable)?
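
A quick way to check the permission question is sketched below in Python. It assumes you know the path of the script that the notify-service-by-pagerduty command calls (the path is not given in this thread, so the one in the example is a placeholder) and that you run the check as the user Nagios runs as. The function name script_problems is hypothetical:

```python
import os


def script_problems(path):
    """Return a list of basic problems with a notification script path."""
    problems = []
    if not os.path.isfile(path):
        problems.append("not a regular file")
    elif not os.access(path, os.X_OK):
        problems.append("not executable by the current user")
    return problems


if __name__ == "__main__":
    # Placeholder path: replace with the script from your command definition.
    path = "/path/to/your/pagerduty/script"
    issues = script_problems(path)
    print(issues if issues else "looks executable")
```

This only covers file existence and the execute bit; errors inside the script itself would still need to be checked by running it by hand.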