Freshness Checks
Re: Freshness Checks
Can you post the text config for one of these services, along with the service template that it's referencing? I'm having a hard time following what's going on from the screenshots. Go to Core Config Manager->Services and click the Download icon for one of these services, then do the same for the service templates.
Re: Freshness Checks
I have added 3 config files.
The first is from one of the hosts. In this case it's the core manager itself, which has services that actually return data even though the host itself is shown as down, but it also has services which are marked as down and unreachable.
The second is from the services that are assigned to those hosts.
And lastly the service templates themselves.
Re: Freshness Checks
You have various check commands defined for each of these services which are giving you the undesired results.
For example:
define service {
host_name ars-db-madrid,ars-db-rome,ars-sbs-amsterdam
service_description adm-hp health
use ars-generic-service-passive
check_command check_hp!public
}
The above service definition will execute the check_hp!public command as the freshness check. If the host is down, this will return a critical result.
define service {
host_name ars-osm
service_description perf-cpu load
use ars-generic-service-passive
check_command check_load!5.0!4.0!3.0!10.0!6.0!4.0!!
register 1
}
The above service definition calls check_load against localhost as the freshness check. This will almost always return an OK state and won't tell you anything about the service you actually want to monitor.
You need to remove these defined check commands for your passive services and change them to something like:
Check command: check_dummy
$ARG1$: 0
$ARG2$: "Service results are stale!"
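In text config form, the central-side definition could look roughly like the sketch below. It assumes the check_dummy command on your system passes $ARG1$ and $ARG2$ straight to the plugin, and that the freshness settings are not already supplied by the ars-generic-service-passive template. The 900 second threshold is a made-up value for illustration only.

```
# Sketch only: passive service whose check_command runs just as the freshness check.
# "!" separates arguments in check_command, so the trailing "!" of the message is left out here.
# freshness_threshold is an assumed value, in seconds.
define service {
host_name ars-osm
service_description perf-cpu load
use ars-generic-service-passive
check_command check_dummy!0!"Service results are stale"
active_checks_enabled 0
passive_checks_enabled 1
check_freshness 1
freshness_threshold 900
register 1
}
```

The first argument is the state check_dummy returns, using the usual plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). If you want a stale result to raise an alert rather than show as OK, a value such as 2 would be needed in place of 0.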
Re: Freshness Checks
OK, so if I understand correctly, when I am using a distributed monitoring setup I need to remove the check commands on the central side and make sure they use the freshness check instead. I always thought I needed to have the commands defined on both ends to make it work. I'm going to try this and will let you know the result.
Re: Freshness Checks
It worked. Yay to you for helping me resolve this. It would help if the documentation were a bit clearer on this, in particular the part where the service has to be defined without the actual check so the freshness check can kick in.
Re: Freshness Checks
We'll take a look and see what we've got in our existing documentation. For passive checks to be received, all you need on the central server is a host_name and service_description that match the incoming check result. Another way that could be even easier to set these up is to use the Admin->Unconfigured Objects page. This page allows you to just fire passive checks at the central server, and Nagios XI will ask if you want to add them.
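To illustrate the matching, here is a rough sketch using the ars-osm service from earlier in the thread. The comment lines show the general shape of a passive result in Nagios external command format, with placeholders instead of real values. The definition below them is only the part that has to line up with it, and assumes everything else comes from the template.

```
# Incoming passive result (external command format, placeholders in <>):
# [<timestamp>] PROCESS_SERVICE_CHECK_RESULT;ars-osm;perf-cpu load;<return_code>;<plugin_output>
#
# host_name and service_description must match the second and third fields above exactly.
define service {
host_name ars-osm
service_description perf-cpu load
use ars-generic-service-passive
register 1
}
```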