I work as a system engineer and am responsible for our Nagios monitoring system (based on Nagios Core 3.5.1).
I'm very happy with that version, which works as expected.
We currently monitor more than 700 hosts and more than 4000 services.
I'm wondering whether it is possible to auto-acknowledge a service when the service it depends on fails.
Let me explain our actual situation:
- one host (srv-val0127)
- 2 services related to that host ("JBoss instance ACTIVE AJP/HTTP connectors" and "ACTIVE_THREAD")
I simply want to link the two services:
- "ACTIVE_THREAD" is a passive check (fed by JBoss Operations Network, which sends traps to Nagios)
- I recently enabled freshness checking on that service to detect whether traps are still being received
- "JBoss instance ACTIVE AJP/HTTP connectors" is an active check based on TCP (a custom check script that verifies the HTTP/AJP ports are open)
When the JBoss instance is DOWN, no more traps are sent (of course...), so Nagios executes the active check for "ACTIVE_THREAD" to tell me that traps are no longer being received: "UNKNOWN: This check seems to no more receive traps from sender".
(In fact, I have 2 passive checks per instance, but for this example one is enough.)
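For illustration, a passive service with freshness checking is typically declared along the lines below. This is only a sketch, not the exact production configuration: the generic-service template, the 600-second threshold, and the no_trap_received command name are assumptions, and the command is built here on the standard check_dummy plugin.

```cfg
# Sketch only: names and values marked "assumed" are placeholders
define service {
    use                     generic-service     ; assumed template name
    host_name               srv-val0127
    service_description     ACTIVE_THREAD
    active_checks_enabled   0                   ; results normally arrive passively (traps)
    passive_checks_enabled  1
    check_freshness         1                   ; run check_command when the last result is stale
    freshness_threshold     600                 ; assumed: seconds without a trap before going stale
    check_command           no_trap_received    ; hypothetical command, defined below
}

define command {
    command_name    no_trap_received            ; hypothetical name
    command_line    $USER1$/check_dummy 3 "This check seems to no more receive traps from sender"
}
```

With a setup like this, Nagios runs check_command only when no passive result has arrived within the threshold, which is what produces the UNKNOWN state once the instance stops sending traps.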
So, when the JBoss instance is DOWN, I get 2 checks in a problem state:
- one WARN (or CRIT) => RED
- one UNKNOWN (trap) => ORANGE
I want to hide that "traps" check when the active check for HTTP/AJP fails. Is that possible?
When an instance is down for several days, I would like to see only the active check "JBoss instance ACTIVE AJP/HTTP connectors" failing, and hide all the others (mainly passive checks, for now).
Since we also have a console for the whole team (set to hide "acknowledged" services and "unreachable" hosts), I'm trying to hide those passive checks through a kind of "auto-acknowledgment".
I tried this:
# Little try
define servicedependency {
    dependent_host_name             srv-val0127
    dependent_service_description   ACTIVE_THREAD
    host_name                       srv-val0127
    service_description             JBoss instance ACTIVE AJP/HTTP connectors
    execution_failure_criteria      o,w,u,c,p,n
    notification_failure_criteria   o,w,u,c,p,n
    dependency_period               24x7
}
So, is there any way to do what I'm trying to do? That would be Nagios seventh heaven.
Thanks