Hi all,
I have a Nagios problem and I don't know what the easiest solution is.
Our Nagios instance monitors a lot of servers. Each server is simply declared with "define host". Each host definition has a "hostgroups" parameter that defines the hostgroups the host belongs to.
The services (the checks) are defined on these hostgroups.
These checks use a template, which sets a timeperiod for 24x7 monitoring.
This is a great way to monitor all the servers without having to duplicate the services (they are defined only once, on the hostgroups).
Now comes my problem: recently I needed to add several new servers, but these servers are down/unreachable for a few hours every Saturday.
So I would like to apply a specific timeperiod, covering that downtime, to these servers and their checks. These new servers currently use the same hostgroups as the other hosts, so they get the same checks, the same template, and the same timeperiod.
To apply a timeperiod to these specific hosts, I just need to add a specific "check_period" parameter to the host definitions, and this works fine. But I don't know the smartest way to apply a specific timeperiod to the services of these specific hosts without modifying the timeperiod of the checks on the "normal" (24x7) hosts.
Of course I can duplicate every service definition, but this is not a good solution.
Currently I don't see another solution... Is there a proper way to set up such an exception?
Thanks a lot in advance
timeperiod exception
Re: timeperiod exception
To better understand the topology, could you please post the host / service / hostgroup definitions for us to look at?
Former Nagios Employee
-
vercetty92
- Posts: 5
- Joined: Thu Dec 19, 2013 4:25 am
Re: timeperiod exception
Of course,
Here is a 24x7 host definition:
Code: Select all
define host{
    use         generic-host-ulsysnet   ; this template sets "check_period 24x7" (among other things)
    host_name   myhost
    alias       myhost
    address     myhost.domain.lan
    hostgroups  generic-servers, london-servers, ntp-external, redhat-servers, unix-servers, linux-bonding
}
Here is the content of the "linux-bonding" group (for example):
Code: Select all
define hostgroup{
    hostgroup_name  linux-bonding
    alias           Linux Server With Bonding
}
define service{
    use                  generic-services   ; this template sets "check_period 24x7" (among other things)
    hostgroup_name       linux-bonding
    service_description  Bonding
    check_command        check_nrpe_1arg!check_linux_bonding
}
Now, here is the definition of a host that is not 24x7 and uses a specific timeperiod, "test-timezone":
Code: Select all
define host{
    use           generic-host-ulsysnet   ; this template still sets "check_period 24x7" (among other things)
    host_name     host.not.24x7
    alias         host.not.24x7
    address       host.not.24x7.domain.lan
    check_period  test-timezone           ; forced here, so this host is no longer monitored 24x7 (host checks only, not its services)
    hostgroups    generic-servers, redhat-servers, tokyo-servers, ntp-metabit-dev-uat, unix-servers, linux-bonding, metabit-srv, hp-blade
}
So this host uses almost the same groups, which contain the same services, as the 24x7 hosts.
It's OK to force the "check_period" for each non-24x7 host, since it's just one line to add in the host declaration... but for the services I'm stuck^^
The best thing would be to avoid duplicating each hostgroup and its services just to apply test-timezone to the duplicated services (there are a lot of services, the templates contain thresholds that apply to all servers, and we would like to keep these settings in only one place).
Hope this is clear^^
Don't hesitate to ask me for more details if needed.
Re: timeperiod exception
If you know this is a scheduled downtime, I would suggest a cron job that sets these hosts/services into downtime, and takes them out as needed. You could use the following list of commands with examples to do so:
http://old.nagios.org/developerinfo/ext ... ndlist.php
Any of the commands with "DOWNTIME" in the name are what you will want to look for, and you would need to do some scripting, but this is a pretty safe way of keeping your configs clean.
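As a rough illustration of the cron approach, here is a minimal Python sketch that builds SCHEDULE_HOST_DOWNTIME and SCHEDULE_HOST_SVC_DOWNTIME commands for one host and writes them to the external command file. The command file path, the function names, and the Saturday window used in the usage comment (02:00 for 4 hours) are assumptions for the example, not values from this thread; check command_file in your nagios.cfg and adjust the window to your real outage.

```python
from datetime import datetime, timedelta

# Assumed location of the Nagios external command file; the real path
# is whatever command_file is set to in nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def next_saturday_window(now, start_hour, hours):
    """Return (start, end) of the next Saturday window that has not ended yet."""
    days_ahead = (5 - now.weekday()) % 7  # Monday is 0, Saturday is 5
    start = (now + timedelta(days=days_ahead)).replace(
        hour=start_hour, minute=0, second=0, microsecond=0)
    end = start + timedelta(hours=hours)
    if end <= now:  # today is Saturday and the window is already over
        start += timedelta(days=7)
        end += timedelta(days=7)
    return start, end


def downtime_commands(host, start_ts, end_ts, author, comment, now_ts):
    """Build the command lines for a fixed downtime on a host and all its services."""
    # Fields: host;start;end;fixed;trigger_id;duration;author;comment
    # fixed=1 and trigger_id=0; duration is only used for flexible downtime.
    args = f"{host};{start_ts};{end_ts};1;0;{end_ts - start_ts};{author};{comment}"
    return [
        f"[{now_ts}] SCHEDULE_HOST_DOWNTIME;{args}",
        f"[{now_ts}] SCHEDULE_HOST_SVC_DOWNTIME;{args}",
    ]


def submit(lines, command_file=COMMAND_FILE):
    """Write each command as one newline-terminated line to the command pipe."""
    with open(command_file, "w") as pipe:
        for line in lines:
            pipe.write(line + "\n")


# Hypothetical usage from a weekly cron job:
#   import time
#   start, end = next_saturday_window(datetime.now(), start_hour=2, hours=4)
#   submit(downtime_commands("host.not.24x7", int(start.timestamp()),
#                            int(end.timestamp()), "cron",
#                            "weekly Saturday outage", int(time.time())))
```

Run like this, the service definitions on the hostgroups stay untouched; only the list of non-24x7 hosts has to be maintained in the script.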
Former Nagios employee
-
vercetty92
Re: timeperiod exception
Thx a lot for your answer, I will take a look at this.
But this confirms what I thought: there is nothing to be done in the config files.
Re: timeperiod exception
Probably could be done in configs but it sounds like it would be a mess to keep clean.
Want us to keep this open while you try things out or are we good to close it up?
Former Nagios employee