Best practice for checks with slightly different parameters

Posted: Thu Aug 18, 2016 6:02 am
by daveh
Can anyone advise on the best practice for applying a particular service check to a large number of hosts, but to use slightly different parameters for one or two hosts? I don't believe you can pick up on host arguments within a service check.

An example is a check I have to make sure Cisco routers are reporting the expected number of power supplies. We have a host group for Cisco routers and I have applied my service check to the whole group. By default this checks for 2 power supplies and gives an error if fewer than two are found. The service check definition has '2' as one of its arguments. However, a couple of our big chassis have 4 power supplies. What I really want is to apply the service check to the whole host group and have the argument be 2 for everything unless I override it for the odd exception. Is this possible?

For our servers, we have run into similar problems where some are expected to have very high disk space usage. We have ended up with two service checks, a normal disk space check and a high usage check, and we apply one or the other. I could do the same here, but that would mean whoever adds a new Cisco router has to remember to add either the 2 or the 4 power supply check. At the moment it is quite nice that we add a router with the 'cisco' template, which puts it in the host group, which in turn ensures all routers get all the service checks I intended.

If there is a better way of doing this sort of thing (not specific to Cisco power supplies, that is just an example), I'd appreciate some help.

Re: Best practice for checks with slightly different parameters

Posted: Thu Aug 18, 2016 9:28 am
by tmcdonald
I think this sentence sums it up well:
"What I really want to do is be able to apply the service check to the whole host group, have the argument as 2 for everything unless I override it for the odd exception. Is this possible?"
You're looking for templates. Basically you would define "2" as the argument in a service template, have the services use that template, and not define that argument on the services themselves. Then whenever you need to deviate from 2, you supply the correct number on the individual service, and that overrides the setting inherited from the template.
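
As a rough sketch of that template approach, assuming services are defined per host rather than per host group: the template name cisco-psu-check, the host names, the generic-service base template, and the check_cisco_psu command (assumed to take the expected supply count as $ARG1$) are all hypothetical.

```
# Service template carrying the default argument of 2 (not registered itself)
define service {
    name                    cisco-psu-check     ; hypothetical template name
    use                     generic-service     ; assumes a base service template exists
    service_description     Power Supplies
    check_command           check_cisco_psu!2   ; hypothetical command, default of 2
    register                0
}

# Ordinary router: inherits check_command with 2 from the template
define service {
    use                     cisco-psu-check
    host_name               router-01           ; hypothetical host
}

# Big chassis: overrides the inherited check_command with 4
define service {
    use                     cisco-psu-check
    host_name               big-chassis-01      ; hypothetical host
    check_command           check_cisco_psu!4
}
```

A directive set on the service itself takes precedence over the same directive inherited from the template, which is what makes the per-service override work.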

And for the record, quite a lot of host info is available in a service check:

https://assets.nagios.com/downloads/nag ... olist.html

$HOSTNOTES$ in particular might be of interest, otherwise you could look into custom variables and macros:

https://assets.nagios.com/downloads/nag ... tvars.html

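Edit: Actually, if you are applying this to the whole hostgroup then the template option will not work, because a single service definition covers every host in the group. You would need to use custom variables instead: set the value on each host and have the check refer to that per-host setting.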
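
A minimal sketch of the custom variable approach, not a tested configuration. The variable name _PSU_COUNT, the template and host names, the generic-host and generic-service base templates, and the check_cisco_psu plugin with its -H and -e flags are all assumptions for illustration. What is standard Nagios behaviour is that a custom variable _NAME defined on a host is exposed as the macro $_HOSTNAME$, and that custom variables are inherited from templates like any other directive.

```
# Host template: default expected power supply count for all Cisco routers
define host {
    name                    cisco-router        ; hypothetical template name
    use                     generic-host        ; assumes a base host template exists
    hostgroups              cisco-routers       ; assumes this host group is defined
    _PSU_COUNT              2                   ; custom variable, default value
    register                0
}

# Ordinary router: inherits _PSU_COUNT 2 from the template
define host {
    use                     cisco-router
    host_name               router-01           ; hypothetical host
    address                 192.0.2.1
}

# Exception: big chassis overrides the inherited value
define host {
    use                     cisco-router
    host_name               big-chassis-01      ; hypothetical host
    address                 192.0.2.10
    _PSU_COUNT              4
}

# Command reads the per-host value through the custom variable macro
define command {
    command_name            check_cisco_psu_by_host
    ; plugin name and flags are hypothetical
    command_line            $USER1$/check_cisco_psu -H $HOSTADDRESS$ -e $_HOSTPSU_COUNT$
}

# One service definition for the whole host group
define service {
    use                     generic-service     ; assumes a base service template exists
    hostgroup_name          cisco-routers
    service_description     Power Supplies
    check_command           check_cisco_psu_by_host
}
```

With this layout, adding a new router with the template is still enough to get the check, and only the exceptional hosts need the extra _PSU_COUNT line.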