Check service at same time every day - persistently

Support forum for Nagios Core, Nagios Plugins, NCPA, NRPE, NSCA, NDOUtils and more. Engage with the community of users including those using the open source solutions.
FTL
Posts: 72
Joined: Fri Oct 21, 2011 7:23 am

Post by FTL »

Hi All,

I have a few service checks that check the job status of Scheduled Tasks on Windows servers.

They run once a day, between 9:00 and 9:30am, on each of the servers.

However, if I restart the Nagios service or reboot the server, I lose the set check times and have to go in manually and reschedule the next check back to the required time of day for each service.

Is it possible to make a service check run persistently at a set time of day, even through reboots and service restarts?

Thank you
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Post by slansing »

Okay, so you are manually setting a check reschedule time? You are not setting these services up on a time period that only exists between 9-9:30am in their service definition?
FTL
Posts: 72
Joined: Fri Oct 21, 2011 7:23 am

Post by FTL »

Hi slansing,

Sorry, it appears I wasn't clear enough in my description.

I have 6 servers on which I run a service check to check the status of the scheduled tasks that are running.

Server 1 - 9.00am
Server 2 - 9.05am
Server 3 - 9.10am
Server 4 - 9.15am
Server 5 - 9.20am
Server 6 - 9.25am

My service check for one server, as an example:

Code: Select all


define service{
    use                    service-schedtask    ; See Service Template section below
    host_name              SERVER 1
    service_description    SCHEDULED TASK RESULT
    check_command          check_schedtasks     ; See Commands section below
    }
The checks run once every 24 hours as defined in my service-schedtask template

Code: Select all

define service{
    name                            service-schedtask   ; The name of this service template (used above in the checks)
    check_period                    server_24x7         ; Service is monitored at all times
    check_interval                  1440                ; Service is checked every 1 day when in OK state
    retry_interval                  180                 ; Service is checked every 3 hours if in problem state
    max_check_attempts              3                   ; Service is checked 3 times to determine its state
    notification_period             server_24x7         ; Emails and texts are sent out any time of day
    notification_interval           180                 ; Resend notifications every 3 hours
    notification_options            c,r                 ; Only send alerts for CRITICAL or RECOVERY state
    notifications_enabled           1                   ; Notifications are enabled
    contact_groups                  servers email, servers sms    ; Alerts sent to contacts in these groups
    event_handler_enabled           1                   ; Service event handler is enabled
    process_perf_data               1                   ; Performance data is processed
    retain_status_information       1                   ; Status info is kept between restarts
    retain_nonstatus_information    1                   ; Non-status information is kept between restarts
    passive_checks_enabled          0                   ; Passive checks are disabled
    obsess_over_service             0                   ; We do not obsess over the service if in problem state
    check_freshness                 0                   ; We do not check this service for freshness
    flap_detection_enabled          0                   ; Flap detection is disabled
    failure_prediction_enabled      0                   ; We will wait for it to actually fail thankyou!!
    register                        0
    }
So after Nagios is restarted, I manually go in and force a scheduled check of the service on each host at the specific times listed above.

If Nagios stays up then this works fine: Server 1 gets checked at 9am daily, Server 2 at 9.05am, and so on.

But if I reboot the Nagios server or restart the Nagios service, it loses those times.
So say I reboot Nagios at 6pm one evening: when it comes back up it might check Server 1 at 7.24pm, Server 2 at 9.43pm, and so on.

I then have to go back in and manually force-reschedule the checks to run again for each service at the times listed above.

Now, I understand this is normal, as when Nagios restarts it reschedules all its checks depending on its load from the moment it comes back up.

My question is: can I make these particular service checks on the 6 servers run at the times listed above persistently, through reboots and restarts, without having to manually reschedule each check if the server or service gets restarted?

Thanks
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Post by sreinhardt »

The hardest part would be getting the checks to run at those exact moments. My first suggestion would be to define a timeperiod for check_period to use that restricts checks to 9-10 or so, maybe 9-9:30, and use that for all of these service checks. The issue with this is that the checks could get jumbled and not run in the right order. In that case you might need to define a separate 5-minute timeperiod for each service check, and use that to tell the Nagios engine specifically when to check it. Does that make more sense to you? I am happy to give an example if needed.
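
A rough sketch of the per-service variant described above, for anyone wanting the checks kept apart. The timeperiod names schedtask_0900 and schedtask_0905 are made up for illustration and do not come from this thread; the host, service, command and template names are the ones used in the posts here. It assumes a check_period set on a service overrides the one inherited from its template.

```
; Hypothetical 5-minute window for Server 1 (name is an assumption)
define timeperiod{
    timeperiod_name    schedtask_0900
    alias              Scheduled task check window 09:00-09:05
    sunday             09:00-09:05
    monday             09:00-09:05
    tuesday            09:00-09:05
    wednesday          09:00-09:05
    thursday           09:00-09:05
    friday             09:00-09:05
    saturday           09:00-09:05
    }

; Hypothetical 5-minute window for Server 2 (name is an assumption)
define timeperiod{
    timeperiod_name    schedtask_0905
    alias              Scheduled task check window 09:05-09:10
    sunday             09:05-09:10
    monday             09:05-09:10
    tuesday            09:05-09:10
    wednesday          09:05-09:10
    thursday           09:05-09:10
    friday             09:05-09:10
    saturday           09:05-09:10
    }

define service{
    use                    service-schedtask
    host_name              SERVER1
    service_description    SCHEDULED TASK RESULT
    check_command          check_schedtasks
    check_period           schedtask_0900    ; overrides the template's check_period for this service only
    }
```

The remaining servers would follow the same pattern, each with its own window and its own check_period line.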
FTL
Posts: 72
Joined: Fri Oct 21, 2011 7:23 am

Post by FTL »

Yes, good suggestion - didn't think of that.

The timeperiod would work, as I'm not really fussed which order they are checked in - as long as they are all checked between 9 and 9.30 each morning.
It's only between those times so that it's first thing in the morning and the appropriate admin can sort any issues out before getting snowed under and forgetting about it!

Thank you, sreinhardt
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Post by tmcdonald »

We can leave this thread open until you test that time period setup out; otherwise, if you are satisfied it will work, we can close it now. Up to you.
Former Nagios employee
FTL
Posts: 72
Joined: Fri Oct 21, 2011 7:23 am

Post by FTL »

Can't seem to get it working :(

Apologies for the formatting of the code - that's Linux's way of telling me not to copy and paste into Windows first :)

I have set the service check:

Code: Select all

define service{
    use                    service-schedtask    ; See Service Template section below
    host_name              SERVER1
    service_description    SCHEDULED TASK RESULT
    check_command          check_schedtasks     ; See Commands section below
    }
The template that belongs to that service check:

Code: Select all

define service{
    name                            service-schedtask   ; The name of this service template (used above in the checks)
    check_period                    server_schedtask    ; Service is monitored only between 9am and 9.30am daily
    check_interval                  1440                ; Service is checked every 1 day when in OK state
    retry_interval                  180                 ; Service is checked every 3 hours if in problem state
    max_check_attempts              3                   ; Service is checked 3 times to determine its state
    notification_period             server_24x7         ; Emails and texts are sent out any time of day
    notification_interval           180                 ; Resend notifications every 3 hours
    notification_options            c,r                 ; Only send alerts for CRITICAL or RECOVERY state
    notifications_enabled           1                   ; Notifications are enabled
    contact_groups                  servers email, servers sms    ; Alerts sent to contacts in these groups
    event_handler_enabled           1                   ; Service event handler is enabled
    process_perf_data               1                   ; Performance data is processed
    retain_status_information       1                   ; Status info is kept between restarts
    retain_nonstatus_information    1                   ; Non-status information is kept between restarts
    passive_checks_enabled          0                   ; Passive checks are disabled
    obsess_over_service             0                   ; We do not obsess over the service if in problem state
    check_freshness                 0                   ; We do not check this service for freshness
    flap_detection_enabled          0                   ; Flap detection is disabled
    failure_prediction_enabled      0                   ; We will wait for it to actually fail thankyou!!
    register                        0
    }
The time period that belongs to that template:

Code: Select all

define timeperiod{
    timeperiod_name    server_schedtask
    alias              Half Hour Period for scheduled task checks
    sunday             09:00-09:30
    monday             09:00-09:30
    tuesday            09:00-09:30
    wednesday          09:00-09:30
    thursday           09:00-09:30
    friday             09:00-09:30
    saturday           09:00-09:30
    }

However, even after restarting the service this morning, and even restarting the server, it doesn't schedule the next check to fall within this timeperiod.

One of the servers shows: Next Scheduled Check: 01-29-2014 23:31:06
Another shows: Next Scheduled Check: 01-29-2014 17:40:09
Another shows: Next Scheduled Check: 01-29-2014 17:39:58

I can't see what I have missed.

Is it this line from the template?
retain_nonstatus_information 1 ; Non-Status information is kept between server restarts

Should this be 0?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Post by lmiltchev »

Hm-m, the timeperiod looks fine. Try disabling the retention of non-status information and see if that helps. Setting retain_nonstatus_information to 0 will cause Nagios to take the initial values from the configs, rather than from the state retention file, when it restarts. Hope this helps.
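
For reference, a minimal sketch of the related settings in the main configuration file. These are not mentioned anywhere in this thread and the poster's nagios.cfg is not shown, so whether they play a part here is an assumption. In Nagios Core, use_retained_scheduling_info controls whether next-check times are carried over from the retention file when Nagios restarts.

```
# nagios.cfg (main configuration file); values shown are illustrative

# Save state to the retention file on shutdown and read it back on startup
retain_state_information=1

# 1 = reuse the retained next-check times after a restart
# 0 = ignore them and recalculate the check schedule from scratch
use_retained_scheduling_info=1
```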
FTL
Posts: 72
Joined: Fri Oct 21, 2011 7:23 am

Post by FTL »

It appears it needed to run the last check it had already scheduled at those wrong times before picking up the new timeperiod check times.

All 6 servers have now rescheduled themselves and are being checked at 9am. I would like them separated, but as long as they are checked I'm happy with this.

Thanks for all your help, guys.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Post by tmcdonald »

Alright, well I'll go ahead and lock this up as Solved. If you have any problems in the future feel free to start a new topic.
Former Nagios employee