check-host-alive - notification problem

simonl
Posts: 14
Joined: Tue Sep 24, 2013 8:38 pm

check-host-alive - notification problem

Post by simonl »

Hello,
I'm building a new Nagios environment and have notifications going through PagerDuty.
All alerts are working, except that when I bring down a server no host notification is triggered. Service notifications are going out fine.
Here's part of nagios.log showing the host reaching a HARD DOWN state with no notification sent. (The notifications you see are service notifications. I need the notification that the host is down, and that is the one that isn't working.)

Also, is there a way to trigger this alert without actually bringing down the server to test?

Any help will be greatly appreciated:

[1403305466] HOST ALERT: mr11p01ad-websrvr001.iad.apple.com;DOWN;SOFT;3;CRITICAL - Host Unreachable (17.178.71.24)
[1403305536] HOST ALERT: mr11p01ad-websrvr001.iad.apple.com;DOWN;SOFT;4;CRITICAL - Host Unreachable (17.178.71.24)
[1403305606] HOST ALERT: mr11p01ad-websrvr001.iad.apple.com;DOWN;HARD;5;CRITICAL - Host Unreachable (17.178.71.24)
[1403305676] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;CPU LOAD;CRITICAL;HARD;1;Connection refused or timed out
[1403305776] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;root;CRITICAL;HARD;1;Connection refused or timed out
[1403305796] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;NTP;CRITICAL;HARD;1;NTP CRITICAL: No response from NTP server
[1403306026] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;SSH;CRITICAL;HARD;1;No route to host
[1403306026] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;PING;CRITICAL;HARD;3;CRITICAL - Host Unreachable (17.178.71.24)
[1403306036] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;SWAP;CRITICAL;HARD;3;Connection refused or timed out
[1403306276] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;CPU LOAD;CRITICAL;HARD;1;Connection refused or timed out
[1403306376] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;root;CRITICAL;HARD;1;Connection refused by host
[1403306386] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;NTP;OK;HARD;1;NTP OK: Offset unknown
[1403306396] HOST ALERT: mr11p01ad-websrvr001.iad.apple.com;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.22 ms
[1403306626] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;SSH;OK;HARD;1;SSH OK - OpenSSH_5.3 (protocol 2.0)
[1403306626] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;PING;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 0.22 ms
[1403306636] SERVICE NOTIFICATION: pagerduty;mr11p01ad-websrvr001.iad.apple.com;SWAP;CRITICAL;notify-service-by-pagerduty;Connection refused by host
[1403306636] SERVICE NOTIFICATION: nagiosadmin;mr11p01ad-websrvr001.iad.apple.com;SWAP;CRITICAL;notify-service-by-email;Connection refused by host
[1403306796] Auto-save of retention data completed successfully.
[1403306876] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;CPU LOAD;CRITICAL;SOFT;1;Connection refused by host
[1403306976] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;root;CRITICAL;SOFT;1;Connection refused by host
[1403306996] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;CPU LOAD;CRITICAL;SOFT;2;Connection refused by host
[1403307096] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;root;CRITICAL;SOFT;2;Connection refused by host
[1403307116] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;CPU LOAD;CRITICAL;HARD;3;Connection refused by host
[1403307116] SERVICE NOTIFICATION: pagerduty;mr11p01ad-websrvr001.iad.apple.com;CPU LOAD;CRITICAL;notify-service-by-pagerduty;Connection refused by host
[1403307116] SERVICE NOTIFICATION: nagiosadmin;mr11p01ad-websrvr001.iad.apple.com;CPU LOAD;CRITICAL;notify-service-by-email;Connection refused by host
[1403307216] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;root;CRITICAL;HARD;3;Connection refused by host
[1403307216] SERVICE NOTIFICATION: pagerduty;mr11p01ad-websrvr001.iad.apple.com;root;CRITICAL;notify-service-by-pagerduty;Connection refused by host
[1403307216] SERVICE NOTIFICATION: nagiosadmin;mr11p01ad-websrvr001.iad.apple.com;root;CRITICAL;notify-service-by-email;Connection refused by host
[1403308056] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;crontab;CRITICAL;SOFT;1;Connection refused by host
[1403308176] SERVICE ALERT: mr11p01ad-websrvr001.iad.apple.com;crontab;CRITICAL;SOFT;2;Connection refused by host
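
For anyone reading a log like the one above, here is a minimal Python sketch that lists hosts which reached a HARD problem state without any HOST NOTIFICATION line being logged before recovery. It assumes the standard nagios.log line layout seen in this thread; the function name and the sample list are made up for illustration, and it ignores downtime, flapping and notification delays.

```python
import re

# [timestamp] HOST ALERT: host;STATE;HARD;attempt;output
ALERT_RE = re.compile(r"^\[(\d+)\] HOST ALERT: ([^;]+);(UP|DOWN|UNREACHABLE);HARD;")
# [timestamp] HOST NOTIFICATION: contact;host;state;command;output
NOTIFY_RE = re.compile(r"^\[\d+\] HOST NOTIFICATION: [^;]+;([^;]+);")


def unnotified_host_problems(lines):
    """Return (host, timestamp) pairs for HARD problems with no notification."""
    pending = {}  # host -> timestamp of first HARD problem not yet notified
    missed = []
    for line in lines:
        alert = ALERT_RE.match(line)
        if alert:
            ts, host, state = int(alert.group(1)), alert.group(2), alert.group(3)
            if state == "UP":
                # Recovery: if the problem is still pending, nobody was told.
                if host in pending:
                    missed.append((host, pending.pop(host)))
            else:
                pending.setdefault(host, ts)
            continue
        notified = NOTIFY_RE.match(line)
        if notified:
            pending.pop(notified.group(1), None)
    # Problems still open at the end of the log and never notified.
    missed.extend(sorted(pending.items()))
    return missed


# Lines taken from the log excerpt in this post.
SAMPLE = [
    "[1403305536] HOST ALERT: mr11p01ad-websrvr001.iad.apple.com;DOWN;SOFT;4;CRITICAL - Host Unreachable (17.178.71.24)",
    "[1403305606] HOST ALERT: mr11p01ad-websrvr001.iad.apple.com;DOWN;HARD;5;CRITICAL - Host Unreachable (17.178.71.24)",
    "[1403306396] HOST ALERT: mr11p01ad-websrvr001.iad.apple.com;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.22 ms",
    "[1403306636] SERVICE NOTIFICATION: pagerduty;mr11p01ad-websrvr001.iad.apple.com;SWAP;CRITICAL;notify-service-by-pagerduty;Connection refused by host",
]

for host, ts in unnotified_host_problems(SAMPLE):
    print(f"{host}: HARD problem at {ts} with no host notification")
```

With a real log you would pass the open file object instead of SAMPLE, since the function only iterates over lines.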
simonl
Posts: 14
Joined: Tue Sep 24, 2013 8:38 pm

Re: check-host-alive - notification problem

Post by simonl »

Here's some additional info:
I have noticed the following warnings when restarting Nagios. I'm wondering if this is causing the problem, but I'm getting all other alerts except the server-down alert that comes from check-host-alive.

[1403441564] Warning: Host 'mr11p01ad-etlsrvr001.iad.apple.com' has no default contacts or contactgroups defined!
[1403441564] Warning: Host 'mr11p01ad-matchappsrvr001.iad.apple.com' has no default contacts or contactgroups defined!
[1403441564] Warning: Host 'mr11p01ad-matchappsrvr002.iad.apple.com' has no default contacts or contactgroups defined!
[1403441564] Warning: Host 'mr11p01ad-ncsdb001.iad.apple.com' has no default contacts or contactgroups defined!
[1403441564] Warning: Host 'mr11p01ad-ncsdb002.iad.apple.com' has no default contacts or contactgroups defined!
[1403441564] Warning: Host 'mr11p01ad-ncssas001.iad.apple.com' has no default contacts or contactgroups defined!
[1403441564] Warning: Host 'mr11p01ad-ncssas002.iad.apple.com' has no default contacts or contactgroups defined!
[1403441564] Warning: Host 'mr11p01ad-segappsrvr001.iad.apple.com' has no default contacts or contactgroups defined!
[1403441564] Warning: Host 'mr11p01ad-websrvr001.iad.apple.com' has no default contacts or contactgroups defined!
eloyd
Cool Title Here
Posts: 2129
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: check-host-alive - notification problem

Post by eloyd »

Are you receiving the pagerduty notices?
Eric Loyd • http://everwatch.global • 844.240.EVER • @EricLoyd • I'm a Nagios Fanatic!
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: check-host-alive - notification problem

Post by Box293 »

simonl wrote: Also is there a way to trigger this alert without actually bring down the server to test?
You can test whether the alerts are working by using the host command "Send custom host notification" in the web interface.
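
The same test can also be submitted from a script through the external command file. The sketch below is only an illustration: SEND_CUSTOM_HOST_NOTIFICATION is a real Nagios Core external command, but the helper names, the default author and the command file path are assumptions (the path depends on the command_file setting in your nagios.cfg).

```python
import time

# Assumed default location; check command_file in nagios.cfg for the real one.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def custom_host_notification(host, comment, author="nagiosadmin", options=0, now=None):
    """Build one SEND_CUSTOM_HOST_NOTIFICATION external command line.

    options is a bitmask: 0 = none, 1 = broadcast, 2 = forced,
    4 = increment the notification number.
    """
    ts = int(time.time()) if now is None else int(now)
    return f"[{ts}] SEND_CUSTOM_HOST_NOTIFICATION;{host};{options};{author};{comment}\n"


def submit(command_line, command_file=COMMAND_FILE):
    """Write the command to the Nagios command pipe (needs write permission)."""
    with open(command_file, "w") as pipe:
        pipe.write(command_line)


# Example (not executed here, because it needs a running Nagios instance):
# submit(custom_host_notification("mr11p01ad-websrvr001.iad.apple.com",
#                                 "Testing host notifications"))
```

If the host has no contacts, nothing will be delivered even though the command is accepted, which is exactly what this test is meant to reveal.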

simonl wrote:
[1403441564] Warning: Host 'mr11p01ad-etlsrvr001.iad.apple.com' has no default contacts or contactgroups defined!
[1403441564] Warning: Host 'mr11p01ad-matchappsrvr001.iad.apple.com' has no default contacts or contactgroups defined!
[1403441564] Warning: Host 'mr11p01ad-matchappsrvr002.iad.apple.com' has no default contacts or contactgroups defined!
If the test above does not work for these hosts, they probably have no contacts defined, which is what these warnings indicate. Check your host definitions for the contacts or contact_groups directives and add them if they are missing. If the hosts use a template, check the template definition as well.

Don't forget to restart Nagios after making any changes.
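
To make the check described above concrete, here is a rough Python sketch that scans host definitions and lists hosts with neither contacts nor contact_groups, either directly or through a template named in "use". The sample config, host names, addresses and the "admins" group are all hypothetical. It is not a full Nagios config parser: it reads only the text you hand it and does not handle additive inheritance ("+group") or host escalations.

```python
import re

# Hypothetical example config: web01 inherits contacts, db01 has none.
SAMPLE_CFG = """
define host {
    name            linux-server     ; template
    contact_groups  admins
    register        0
}
define host {
    use             linux-server
    host_name       web01
    address         192.0.2.10
}
define host {
    host_name       db01
    address         192.0.2.11
}
"""

HOST_BLOCK_RE = re.compile(r"define\s+host\s*\{(.*?)\}", re.S)


def parse_hosts(text):
    """Return one dict of directives per 'define host' block."""
    hosts = []
    for body in HOST_BLOCK_RE.findall(text):
        directives = {}
        for raw in body.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        hosts.append(directives)
    return hosts


def hosts_without_contacts(text):
    """List host_names that have no contacts, directly or via templates."""
    hosts = parse_hosts(text)
    templates = {h["name"]: h for h in hosts if "name" in h}

    def has_contacts(host, seen=()):
        if "contacts" in host or "contact_groups" in host:
            return True
        for parent in host.get("use", "").split(","):
            parent = parent.strip()
            if parent in templates and parent not in seen:
                if has_contacts(templates[parent], seen + (parent,)):
                    return True
        return False

    return [h["host_name"] for h in hosts
            if "host_name" in h
            and h.get("register", "1") != "0"
            and not has_contacts(h)]


print(hosts_without_contacts(SAMPLE_CFG))
```

In the sample, db01 is the host that would trigger the "has no default contacts or contactgroups defined" warning. Nagios' own config verification remains the authoritative check.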
simonl
Posts: 14
Joined: Tue Sep 24, 2013 8:38 pm

Re: check-host-alive - notification problem

Post by simonl »

I apologize for not getting back to the forum sooner.
I realized that no contact groups were defined in the linux.cfg file where I was calling check-host-alive.
Thanks all for looking and thinking about this.
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Re: check-host-alive - notification problem

Post by lmiltchev »

I am glad your issue has been resolved! I am locking this thread now. If you have any more questions or issues, please start a new thread.