Check service at same time every day - persistently
Posted: Fri Jan 24, 2014 4:32 am
by FTL
Hi All,
I have a few service checks that check the job status of Scheduled Tasks on Windows servers.
They run once a day between 9-9.30am on each of the servers.
However, if I restart the Nagios service or reboot the server, I lose the set check times and have to go in manually and re-schedule the next check back to the required time of day for each service.
Is it possible to make the service check run persistently at a set time of day, even through reboots and service restarts?
Thank you
Re: Check service at same time every day - persistently
Posted: Fri Jan 24, 2014 2:35 pm
by slansing
Okay, so you are manually setting a check reschedule time? You are not setting these services up on a time period that only exists between 9-9:30am in their service definition?
Re: Check service at same time every day - persistently
Posted: Mon Jan 27, 2014 5:19 am
by FTL
Hi Slansing,
Sorry, it appears I wasn't clear enough in my description.
I have 6 servers on which I run a service check to check the status of the scheduled tasks that are running.
Server 1 - 9.00am
Server 2 - 9.05am
Server 3 - 9.10am
Server 4 - 9.15am
Server 5 - 9.20am
Server 6 - 9.25am
My host and service check for 1 server as an example:
Code:
define service{
    use                     service-schedtask       ; See Service Template section below
    host_name               SERVER 1
    service_description     SCHEDULED TASK RESULT
    check_command           check_schedtasks        ; See Commands section below
    }
The checks run once every 24 hours, as defined in my service-schedtask template:
Code:
define service{
    name                            service-schedtask       ; The name of this service template (used above in the checks)
    check_period                    server_24x7             ; Servers are monitored at all times
    check_interval                  1440                    ; Servers are checked every 1 day when in OK state
    retry_interval                  180                     ; Servers are checked every 3 hours if in problem state
    max_check_attempts              3                       ; Servers are checked 3 times to determine their state
    notification_period             server_24x7             ; Emails and texts are sent out any time of day
    notification_interval           180                     ; Resend notifications every 3 hours
    notification_options            c,r                     ; Only send alerts for CRITICAL or RECOVERY state
    notifications_enabled           1                       ; Notifications are enabled
    contact_groups                  servers email, servers sms ; Alerts sent to contacts in these groups
    event_handler_enabled           1                       ; Service event handler is enabled
    process_perf_data               1                       ; Performance data is processed
    retain_status_information       1                       ; Status info is kept between server restarts
    retain_nonstatus_information    1                       ; Non-status info is kept between server restarts
    passive_checks_enabled          0                       ; Passive checks are disabled
    obsess_over_service             0                       ; We do not obsess over the server if in problem state
    check_freshness                 0                       ; We do not check this server for freshness
    flap_detection_enabled          0                       ; Flap detection is disabled
    failure_prediction_enabled      0                       ; We will wait for it to actually fail, thank you!!
    register                        0                       ; Template only, not a real service
    }
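As a side note on the "every 24 hours" figure: check_interval and retry_interval are counted in units of interval_length from the main config file, not directly in minutes. A minimal sketch of the arithmetic, assuming interval_length is left at its default of 60 seconds (the actual nagios.cfg for this install is not shown in the thread):

```cfg
# nagios.cfg (main configuration file)
# Assumption: interval_length is at its default of 60 seconds per unit.
interval_length=60

# With interval_length=60, the template values work out as:
#   check_interval 1440  ->  1440 x 60 s = 86400 s = 24 hours
#   retry_interval 180   ->   180 x 60 s = 10800 s =  3 hours
```

If interval_length has been changed from 60, the same template values would mean different wall-clock intervals.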
So after Nagios was rebooted, I manually went in and forced a scheduled check of the service(s) on their respective hosts at the specific times set above.
If Nagios stays up then this works fine: Server 1 gets checked at 9am daily, Server 2 at 9.05am, etc.
But if I restart the Nagios server or restart the Nagios service, it loses those times.
So say I reboot Nagios at 6pm one evening; when it comes back up it might check Server 1 at 7.24pm, Server 2 at 9.43pm, etc.
I then have to go back in and manually force a reschedule of the checks for each service at the times listed above.
Now, I understand this is normal, as when Nagios restarts it re-schedules all its checks depending on its load from the moment it comes back up.
My question is: can I make these particular service checks on the 6 servers run at the times set above persistently, through reboots and restarts, without having to go in and manually reschedule the checks whenever the server/service gets restarted?
Thanks
Re: Check service at same time every day - persistently
Posted: Mon Jan 27, 2014 11:59 am
by sreinhardt
The hardest part would be getting it at those exact moments. My first suggestion would be to define a timeperiod for check_period to use that restricts checks to 9-10, or maybe 9-9:30, and use that for all of these service checks. The issue with this is that the checks could get jumbled and not run in the right order. In that case you might need to define a 5-minute timeperiod for each service check, and use that to tell the Nagios engine specifically when to check it. Does that make more sense to you? I am happy to give an example if needed.
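For reference, a rough sketch of the per-service variant could look something like the following. The timeperiod names (schedtask_server1, schedtask_server2) are made up for illustration, only two of the six servers are shown, and I have not tested this against your setup. The idea is that each service overrides the template's check_period with its own 5-minute window.

```cfg
# Hypothetical 5-minute windows, one per server (names are illustrative only)
define timeperiod{
    timeperiod_name schedtask_server1
    alias           Scheduled task check window for Server 1
    sunday          09:00-09:05
    monday          09:00-09:05
    tuesday         09:00-09:05
    wednesday       09:00-09:05
    thursday        09:00-09:05
    friday          09:00-09:05
    saturday        09:00-09:05
    }

define timeperiod{
    timeperiod_name schedtask_server2
    alias           Scheduled task check window for Server 2
    sunday          09:05-09:10
    monday          09:05-09:10
    tuesday         09:05-09:10
    wednesday       09:05-09:10
    thursday        09:05-09:10
    friday          09:05-09:10
    saturday        09:05-09:10
    }

# The service keeps using the shared template but overrides check_period
define service{
    use                     service-schedtask
    host_name               SERVER1
    service_description     SCHEDULED TASK RESULT
    check_command           check_schedtasks
    check_period            schedtask_server1   ; overrides the template value
    }
```

The remaining servers would follow the same pattern with 09:10-09:15, 09:15-09:20 and so on. A check that comes due outside its window should be pushed to the next valid time in that window, so the order ought to hold, but I would confirm that on a test service first.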
Re: Check service at same time every day - persistently
Posted: Tue Jan 28, 2014 4:54 am
by FTL
Yes, good suggestion - didn't think of that.
The timeperiod would work, as I'm not really fussed which order they are checked in, as long as they are all checked between 9-9.30 each morning.
It's only between those hours so that it's first thing in the morning and the appropriate admin can sort any issues out before getting snowed under and forgetting about it!
Thank you Sreinhardt
Re: Check service at same time every day - persistently
Posted: Tue Jan 28, 2014 2:44 pm
by tmcdonald
We can leave this thread open until you test that time period setup out; otherwise, if you are satisfied it will work, we can close it now. Up to you.
Re: Check service at same time every day - persistently
Posted: Wed Jan 29, 2014 5:37 am
by FTL
Can't seem to get it working.
Apologies for the formatting of the code - that's Linux's way of telling me not to copy and paste into Windows first.
I have set the service check:
Code:
define service{
    use                     service-schedtask       ; See Service Template section below
    host_name               SERVER1
    service_description     SCHEDULED TASK RESULT
    check_command           check_schedtasks        ; See Commands section below
    }
The template that belongs to that service check:
Code:
define service{
    name                            service-schedtask       ; The name of this service template (used above in the checks)
    check_period                    server_schedtask        ; Service is monitored only between 9am and 9.30am daily
    check_interval                  1440                    ; Service is checked every 1 day when in OK state
    retry_interval                  180                     ; Service is checked every 3 hours if in problem state
    max_check_attempts              3                       ; Service is checked 3 times to determine its state
    notification_period             server_24x7             ; Emails and texts are sent out any time of day
    notification_interval           180                     ; Resend notifications every 3 hours
    notification_options            c,r                     ; Only send alerts for CRITICAL or RECOVERY state
    notifications_enabled           1                       ; Notifications are enabled
    contact_groups                  servers email, servers sms ; Alerts sent to contacts in these groups
    event_handler_enabled           1                       ; Service event handler is enabled
    process_perf_data               1                       ; Performance data is processed
    retain_status_information       1                       ; Status info is kept between server restarts
    retain_nonstatus_information    1                       ; Non-status info is kept between server restarts
    passive_checks_enabled          0                       ; Passive checks are disabled
    obsess_over_service             0                       ; We do not obsess over the server if in problem state
    check_freshness                 0                       ; We do not check this server for freshness
    flap_detection_enabled          0                       ; Flap detection is disabled
    failure_prediction_enabled      0                       ; We will wait for it to actually fail, thank you!!
    register                        0                       ; Template only, not a real service
    }
The time period that belongs to that template:
Code:
define timeperiod{
    timeperiod_name server_schedtask
    alias           Half Hour Period for scheduled task checks
    sunday          09:00-09:30
    monday          09:00-09:30
    tuesday         09:00-09:30
    wednesday       09:00-09:30
    thursday        09:00-09:30
    friday          09:00-09:30
    saturday        09:00-09:30
    }
However, even after restarting the service this morning, and even restarting the server, it doesn't schedule the next check within this timeperiod.
One of the servers shows: Next Scheduled Check: 01-29-2014 23:31:06
Another shows: Next Scheduled Check: 01-29-2014 17:40:09
Another shows: Next Scheduled Check: 01-29-2014 17:39:58
I can't see what I have missed.
Is it this line from the template?
retain_nonstatus_information 1 ; Non-Status information is kept between server restarts
Should this be 0?
Re: Check service at same time every day - persistently
Posted: Thu Jan 30, 2014 12:39 pm
by lmiltchev
Hm-m, the timeperiod looks fine. Try disabling the retention of non-status information and see if that helps. Setting "retain_nonstatus_information 0" will cause Nagios to take the initial values from the configs, rather than from the state retention file, when it restarts. Hope this helps.
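In config terms, the change would look roughly like the sketch below. Only the relevant directive is shown and the rest of the template stays as posted. The second part is my own assumption about what might also matter here: use_retained_scheduling_info is a main config file option that controls whether next-check times are restored from the retention file on restart. I have not confirmed that it is what is holding the old check times in this case.

```cfg
# Service template: object definitions take "directive value", with no "=" sign
define service{
    name                            service-schedtask
    # ... other directives unchanged ...
    retain_nonstatus_information    0   ; read non-status settings from the config files on restart
    register                        0
    }

# nagios.cfg (main configuration file); main config options do use "="
# Assumption: worth checking, not confirmed as the cause in this thread.
# 0 = do not restore next-check times from the retention file on restart
use_retained_scheduling_info=0
```

Nagios needs a restart after either change for it to take effect.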
Re: Check service at same time every day - persistently
Posted: Fri Jan 31, 2014 4:47 am
by FTL
It appears it needed to run the final check it had already scheduled at those wrong times before picking up the new timeperiod check times.
All 6 servers have now re-scheduled themselves and are being checked at 9am. I would like them separated, but as long as they are checked I'm happy with this.
Thanks for all your help guys.
Re: Check service at same time every day - persistently
Posted: Fri Jan 31, 2014 10:11 am
by tmcdonald
Alright, well I'll go ahead and lock this up as Solved. If you have any problems in the future feel free to start a new topic.