Hi Fred -
Are you still getting alerts for services on a host when the host is DOWN? Nagios should be suppressing service notifications for hosts that are DOWN or UNREACHABLE.
Regarding checks, Nagios will still check services on a host that is DOWN or UNREACHABLE. The reason is that some services are associated with a host without actually running "on" it. DNS name lookups are one example: the service might be associated with a host, but would still be in an OK state if the host went DOWN.
Currently there isn't an easy way to prevent services from being checked when a host is DOWN. Some people have utilized service dependencies (referencing the Ping service as the service to use in the dependency) or external commands and event handlers to temporarily disable service checks when a host goes down, but these approaches aren't foolproof or easy to set up.
Connection Timeout/Refused State
Re: Connection Timeout/Refused State
Ethan Galstad
President
- Posts: 588
- Joined: Wed Oct 19, 2011 11:36 pm
- Location: Perth, Western Australia
Re: Connection Timeout/Refused State
Hi
I don't have a problem with the services of a host that's down being checked. What I don't think is right is that they are shown as CRITICAL in Nagios.
When a host goes DOWN, we then have many CRITICAL events showing as well, which could hide/mask a real CRITICAL event.
My way of thinking is that, since the service has a dependency on the host being up, it should have an UNKNOWN status when the host goes down, because the actual CRITICAL threshold hasn't been reached.
regards... Fred
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Connection Timeout/Refused State
Actually, if you have service dependencies set up, Nagios won't check the dependent services if the service they depend on is in a critical state.
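For reference, here is a minimal sketch of such a dependency. The host name web01 and the dependent HTTP service are hypothetical placeholders. PING stands for the Ping service mentioned earlier in the thread, and its service_description has to match whatever that service is actually called in your configuration.

```
define servicedependency {
    host_name                       web01   ; host that owns the master service (hypothetical name)
    service_description             PING    ; master service the others depend on
    dependent_host_name             web01   ; same host in this sketch
    dependent_service_description   HTTP    ; hypothetical dependent service
    execution_failure_criteria      c       ; skip active checks of HTTP while PING is CRITICAL
    notification_failure_criteria   w,u,c   ; suppress HTTP notifications while PING is WARNING, UNKNOWN or CRITICAL
}
```

As far as I know, dependencies are evaluated against the master service's hard state by default, so the dependent service can still be checked while PING is only in a soft CRITICAL state.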
Re: Connection Timeout/Refused State
To my knowledge, the services in this circumstance are considered "handled" problems rather than open or unhandled ones. Services reading as CRITICAL when the host is down is expected behavior. Some users may only be authorized at the service level, so the current state still needs to be accurately displayed.
Re: Connection Timeout/Refused State
The point I'm trying to make (unsuccessfully, it appears) is that the status of the service should be UNKNOWN. CRITICAL indicates that the service has crossed a threshold. Technically, if the host is DOWN, then the results of a service check are UNKNOWN?
Re: Connection Timeout/Refused State
"Some people have utilized service dependencies (referencing the Ping service as the service to use in the dependency)"
This is the strategy we use. It's a pain to set up but worth it. Even though you don't get service notifications after the host is down, our service checks fail faster than our host checks, so we get slammed when a server goes down.
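If most of the pain is writing one definition per host, the "same host dependencies" shorthand may cut it down. This is only a sketch based on my understanding of the object-definition shortcuts in Nagios 3 and later. The host names and dependent service names below are hypothetical, and PING again stands for whatever your Ping service is called.

```
define servicedependency {
    host_name                       host1,host2,host3   ; hypothetical hosts, each with its own PING service
    service_description             PING                ; master service on each listed host
    dependent_service_description   HTTP,SSH,Disk Usage ; hypothetical services that exist on every listed host
    execution_failure_criteria      c                   ; don't check dependents while PING is CRITICAL
    notification_failure_criteria   c                   ; don't notify for dependents while PING is CRITICAL
}
```

With dependent_host_name left out, each dependent service should depend on the PING service of its own host, so one definition covers all three hosts. It is worth running the config verification (nagios -v) before relying on it.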