Dynamic Service Monitoring - Info Request

Posted: Wed Oct 09, 2013 1:46 pm
by mrochelle
We have a table with hundreds of entries holding various information about files that we deliver to our customers (file name, time the file is due, minimum and maximum allowable file size, etc.). Our current process checks each file at its applicable time and issues an alert if the file doesn’t exist or doesn’t meet the minimum or maximum allowable file size requirements. We would like to adjust the process to integrate it into Nagios. We would like a separate alert for each file incident, but do not want to have to create a separate Nagios service for each of the hundreds of files.
Is there a way to dynamically create a service to do this? I'm already aware of one major concern, which would be the restart of Nagios with each config modification if this were a possibility.
Any thoughts or comments are appreciated.

Re: Dynamic Service Monitoring - Info Request

Posted: Wed Oct 09, 2013 1:52 pm
by BanditBBS
How often does the list of files get updated? You could write a script that could go through the list and write the hundreds of service definitions as static configuration files and then issue an "Apply Configuration" when completed. You could manually run this script anytime the list is updated or have it run daily by cron, or whatever.
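A rough sketch of what such a generator script could look like. The table format (one `file_path,due_time,min_bytes,max_bytes` line per file), the `generic-service` template, and the `check_file_size` command are all assumptions made for illustration, not something from the original setup; the due time is read but not used here.

```shell
#!/bin/bash
# Sketch: turn a file table into Nagios service definitions.
# Assumed (hypothetical) input, one file per line:
#   file_path,due_time,min_bytes,max_bytes
# "generic-service" and "check_file_size" are placeholder names.

generate_services() {
    local table="$1" host="$2"
    local path due min max
    # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
    while IFS=, read -r path due min max || [ -n "$path" ]; do
        # Skip blank lines and comment lines.
        case "$path" in ''|'#'*) continue ;; esac
        printf 'define service {\n'
        printf '    use                  generic-service\n'
        printf '    host_name            %s\n' "$host"
        # The base name of the file doubles as the service description.
        printf '    service_description  File %s\n' "${path##*/}"
        printf '    check_command        check_file_size!%s!%s!%s\n' "$path" "$min" "$max"
        printf '}\n\n'
    done < "$table"
}

# When called with a table and a host name, print the definitions to stdout.
if [ "$#" -ge 2 ]; then
    generate_services "$1" "$2"
fi
```

The output would be redirected into a config file that Nagios reads, and the configuration applied afterwards. Two files with the same base name would produce duplicate service descriptions, so a real script would need a unique naming scheme.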

Re: Dynamic Service Monitoring - Info Request

Posted: Wed Oct 09, 2013 2:05 pm
by mrochelle
The list gets updated at least every 10 to 15 minutes.

Re: Dynamic Service Monitoring - Info Request

Posted: Wed Oct 09, 2013 2:09 pm
by tmcdonald
This won't create a separate alert for each file, but might come close to what you want:

Create a single check against your host system. The check should give Nagios a 0/OK if no files are old/small/big, and give 2/Critical if there is even 1 file that needs attention. In your script, in addition to exiting with that 0 or 2, you can echo out additional information which will be shown in XI. Here's an example plugin I wrote that accomplishes this:

Code: Select all

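#!/bin/bash

# Text before the pipe is the status message; text after it is performance
# data, written as space-separated label=value pairs.
echo "UNKNOWN - Have you checked me? | Dead=0 Alive=0"

# Exit code 3 tells Nagios the state is UNKNOWN.
exit 3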
I call it check_schrodinger since it always exits with 3/UNKNOWN. The echoed message appears in XI when I click the service as follows:

Image

Everything before the pipe is shown as the status text; everything after it is treated as performance data. It would be trivial to make your script output an informative list of the files that need to be addressed.
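To make that concrete, here is a minimal sketch of a single aggregated check. It assumes a hypothetical table export with one `file_path,min_bytes,max_bytes` line per file and that the table only lists files that are already due; the function names are made up for this example.

```shell
#!/bin/bash
# Sketch of one aggregated check over a file table.
# Assumed (hypothetical) input, one file per line: file_path,min_bytes,max_bytes

# Print one line per file that is missing, too small, or too big.
list_problems() {
    local table="$1" path min max size
    while IFS=, read -r path min max || [ -n "$path" ]; do
        case "$path" in ''|'#'*) continue ;; esac
        if [ ! -f "$path" ]; then
            printf '%s (missing)\n' "$path"
            continue
        fi
        # Arithmetic expansion strips the padding some wc versions add.
        size=$(( $(wc -c < "$path") ))
        if [ "$size" -lt "$min" ]; then
            printf '%s (too small: %s < %s bytes)\n' "$path" "$size" "$min"
        elif [ "$size" -gt "$max" ]; then
            printf '%s (too big: %s > %s bytes)\n' "$path" "$size" "$max"
        fi
    done < "$table"
}

# Print plugin-style output and return 0 (OK), 2 (CRITICAL) or 3 (UNKNOWN).
check_files() {
    local problems count
    if [ ! -r "$1" ]; then
        echo "UNKNOWN - cannot read table $1"
        return 3
    fi
    problems=$(list_problems "$1")
    if [ -z "$problems" ]; then
        echo "OK - all files present and within size limits | problems=0"
        return 0
    fi
    count=$(printf '%s\n' "$problems" | grep -c .)
    # First line is the summary; the following lines list each problem file.
    echo "CRITICAL - $count file(s) need attention | problems=$count"
    printf '%s\n' "$problems"
    return 2
}

# When called with a table path, behave like a plugin.
if [ "$#" -ge 1 ]; then
    check_files "$1"
    exit $?
fi
```

The per-file details go on the lines after the summary, so one service still tells you exactly which files need attention.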

The only challenge would be if the table is hosted on a remote server, in which case you might need NRPE or something.

Re: Dynamic Service Monitoring - Info Request

Posted: Wed Oct 09, 2013 2:10 pm
by BanditBBS
That'd be real ugly having to restart that often and is definitely not something I'd do.

I'm not sure how I'd handle it then. I'd perhaps write a script to read the list and send an alert if anything is wrong. Then, maybe have the service reset to OK on its own after 1 minute, so that the next time the check runs and finds another error, it goes critical again and sends a new alert. You could also make the check return a list of all the files that had issues in the one alert. I can't think of any way to handle it like you originally described, with a separate alert for each file.
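One possible way to do the "reset to OK" part is to submit a passive check result through the Nagios external command file. This is only a sketch, and it assumes the service accepts passive checks; the host name, service description, and command file path are whatever applies to your install and are placeholders here.

```shell
#!/bin/bash
# Sketch: build a passive check result line in external-command format.
# Format: [epoch] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;plugin_output

passive_result() {
    local host="$1" service="$2" code="$3" output="$4"
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$host" "$service" "$code" "$output"
}

# Usage: script <command_file> <host> <service> <return_code> <output>
# The command file path is an assumption and depends on the installation.
if [ "$#" -ge 5 ]; then
    passive_result "$2" "$3" "$4" "$5" >> "$1"
fi
```

Calling this with return code 0 about a minute after an alert would put the service back to OK, so the next failing check raises a fresh alert.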

EDIT: tmcdonald beat me to it, I basically say what he said. Mine just wasn't written as nice, LOL

Re: Dynamic Service Monitoring - Info Request

Posted: Wed Oct 09, 2013 2:13 pm
by tmcdonald
BanditBBS wrote:EDIT: tmcdonald beat me to it, I basically say what he said. Mine just wasn't written as nice, LOL
I'm a dangerous man, Bandit.

Re: Dynamic Service Monitoring - Info Request

Posted: Wed Oct 09, 2013 2:36 pm
by mrochelle
Thanks for the info. I believe I'm leaning toward a single script check and passing the results back to Nagios for appropriate alerts. You can lock this down.