I have a few servers in a load balanced environment (ec2) and to save a bit I wish to stop the servers nightly. I am having some issues understanding the hierarchy and am looking for the best way to do this. For this example I will just use 1 set, and once it is working I can easily roll it out. I have 2 webservers, and they belong to a host group.
There are services such as load, disk use, etc., and each service has the hostgroup web.
Now in the timeperiods.cfg I have a 2nd timeperiod_name called workhours with the times I want it to check. Here is where my question/problems come in. I did try to follow the flow in the docs, but after a few days of time spent I wanted to just post this out there.
So all the servers (hosts) have a use linux-server in their entry. In the templates.cfg there are:
generic-host (which has a notification period of 24x7)
linux-server (which has a use generic-host AND a notification period of 24x7)
AND
linux-server-day (which has a notification_period workhours) ## Note I created that in the timeperiods.cfg with the right times.
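For reference, a workhours timeperiod along these lines is what the rest of the thread assumes. This is only a sketch: the Monday-to-Friday 09:00-17:00 window and the alias are illustrative assumptions, not the actual times from the poster's timeperiods.cfg.

```cfg
# Sketch of a "workhours" timeperiod (hours are assumed, adjust as needed).
# Days that are not listed (here: saturday, sunday) are outside the period.
define timeperiod {
    timeperiod_name  workhours
    alias            Normal Work Hours
    monday           09:00-17:00
    tuesday          09:00-17:00
    wednesday        09:00-17:00
    thursday         09:00-17:00
    friday           09:00-17:00
}
```

Any host or service whose notification_period is set to workhours would then only notify inside these windows.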
NOW, I removed the use generic-host from linux-server-day and have been having issues during the day (I have done this so many times I forgot what is what). With the use generic-host in, I get the alerts all the time; if I remove it, I get nothing, if memory serves.
So, to sum up, I would like to have 2 hosts, web1 and web2. web2 is stopped at night, and I don't want to be notified that it or any of its services are down. I thought at the define host web2 I could just say use linux-server-day and that would be the top, but I am really lost.
Note this is using public V4 with the latest patch on Ubuntu 14.04LTS.
Thanks to all read/replies.
disable all notifications during time help
-
wf-lraymond
- Posts: 10
- Joined: Fri Apr 11, 2014 12:38 pm
-
sreinhardt
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: disable all notifications during time help
Does public v4 mean you are on core 4 of some version or another? As for your direct issue, if the templates are set and inheriting as normal it should look something like:
generic-host template
has notification and check periods, and notification commands
linux-server
uses generic-host template, has only notification and check periods defined if any of those
Linux-server-day template
uses linux-server template, will inherit anything that is not defined on this template from linux-server and if still not defined generic-host
service object
uses linux-server-day, should really only have the host and service name, check command and some other basics, all the rest should be inherited.
With that said, my guess is that by removing either linux-server or generic-host, your end service object does not contain notification commands, but does have notification periods for 24x7 (or so it sounds). I would suggest making sure your notification periods are set correctly in linux-server-day, with only one setting for each (the workhours timeperiod), and also using the rest of the templates as you were originally. This should give you back the standard notification commands, and cause the linux-server-day time periods to be forced onto the service object. If that does not work, please post configs for all of the templates and objects mentioned.
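The host side of the chain described above might be sketched as follows. This is a minimal illustration, not the poster's actual config: it assumes the stock generic-host template and the 24x7 and workhours timeperiods already exist, shows only the directives relevant to this thread, and uses a placeholder address for web2.

```cfg
# Assumes generic-host (stock template) and the timeperiods
# 24x7 and workhours are defined elsewhere.

define host {
    name                 linux-server        ; template name
    use                  generic-host        ; inherit everything else
    notification_period  24x7
    register             0                   ; template only, not a real host
}

define host {
    name                 linux-server-day
    use                  linux-server        ; keep the chain intact
    notification_period  workhours           ; the one setting overridden here
    register             0
}

define host {
    use                  linux-server-day
    host_name            web2
    alias                web2
    address              192.0.2.12          ; placeholder address (assumption)
    hostgroups           web
}
```

A directive set on a template lower in the chain wins over the same directive further up, so web2 ends up with notification_period workhours while everything not overridden still comes from linux-server and generic-host.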
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
-
wf-lraymond
- Posts: 10
- Joined: Fri Apr 11, 2014 12:38 pm
Re: disable all notifications during time help
"Does public v4 mean you are on core 4 of some version or another?"
Sorry, I just meant I was using the public, stable version.
As for the service/host issue, I understand it a bit more now and was able to do some more testing with a test service. I didn't notice there is a template for the service as well as the host, so I have:
24x7.host -> linux-server which has a use generic-host
workhour.host -> linux-server-day which has a use generic-host-day
testservice -> generic-service-day (which has workhours for its timeperiod)
When I force a critical state outside the time period, it works perfectly: no notification. So the original post/question now changes a bit. If I remove the use generic-service-day, I get a bunch of errors. I thought the service would look up to the host for that info, but that is not the case. So all the above leads to this.
If I have 2 servers, one 24x7 and one workhours, do I need to split the services and not use a host group, so it would look like this:
server1 (24x7)
use linux-server
service 1
use generic-service
server2 (9-5)
use linux-server-day
service 1-day
use generic-service-day
The only issue I see is having to duplicate every service now. Each server has the same checks, so do I need 2 services for each one (check_load and check_load_day, check_disk1 and check_disk1_day) and have to duplicate every one? I really thought if the service had no specific time it would take it from the host, but that doesn't seem to be the case.
Anyway, it's a lot of copy/paste, but I think it will work (unless there is a better way). Thanks.
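One way to keep the copy/paste down, sketched under assumptions: a service inherits only from service templates, never from the host it runs on, so each check still needs one definition per schedule, but a hostgroup lets that definition cover every host on the same schedule. The hostgroup names web-24x7 and web-day and the check_nrpe!check_load command are hypothetical; generic-service is assumed to be the stock template.

```cfg
# Day variant of the stock service template (name taken from the post).
define service {
    name                 generic-service-day
    use                  generic-service
    notification_period  workhours
    register             0                   ; template only
}

# Hypothetical hostgroups, one per schedule.
define hostgroup {
    hostgroup_name       web-24x7
    alias                Web servers up around the clock
    members              web1
}

define hostgroup {
    hostgroup_name       web-day
    alias                Web servers stopped at night
    members              web2
}

# One definition per check and schedule, however many hosts are in each group.
define service {
    use                  generic-service
    hostgroup_name       web-24x7
    service_description  Current Load
    check_command        check_nrpe!check_load   ; hypothetical command
}

define service {
    use                  generic-service-day
    hostgroup_name       web-day
    service_description  Current Load
    check_command        check_nrpe!check_load   ; hypothetical command
}
```

With this layout, adding another night-stopped server only means adding it to web-day; the number of service definitions stays at two per check.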
-
sreinhardt
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: disable all notifications during time help
Hostgroups and service groups should be just fine; just make sure that you are not assigning notification or check time periods. Do you really need to duplicate everything? Your end goal is to have 24x7 checking but only notifications during work hours, correct? How about using scheduled downtime at night to avoid notifications but still do checks?
-
wf-lraymond
- Posts: 10
- Joined: Fri Apr 11, 2014 12:38 pm
Re: disable all notifications during time help
"Your end goal is to have 24x7 checking but only notifications during work hours, correct?"
No, this is being run in Amazon's EC2 and they're very cost conscious right now until they get more clients. So I have (2) webservers and (2) API servers, each pair behind a load balancer, and each is pretty costly. So we want to test and make sure things are running as they should, but at night, via an admin shell script, I want to stop one of each of the instances. 2 machines at .45/hr for 10 hours * 30 days actually comes out to a lot of money.
So during the day, 2 servers are in, and if things like nginx fail on either server I want to be notified. But at night, one of the servers will actually be in a stopped state, so no ping or any other check will pass.
Thanks again.
-
sreinhardt
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: disable all notifications during time help
OK, then you are on one of two right paths. Duplication is probably the easiest and best if you plan on pulling reports from the monitored data. Putting things in scheduled downtime, or non-scheduled downtime with scripting for which host is down each night, would also do it, but might get more tricky than duplication if you don't always know which host will be down on which night.