File check assistance

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
jkinning
Posts: 747
Joined: Wed Oct 09, 2013 2:54 pm

File check assistance

Post by jkinning »

I am having some difficulties in getting these 2 requests figured out.

1.) Check that files pamalm.txt, pamhold.txt, and pamsec.txt are available in directory e:\eas\server\batch\import\fwia\ on specific servers at 11:30 pm M – F. Send notification to Support only once.

I was thinking of using something like this:

Code: Select all

./check_nrpe -H eas1t -t 30 -c CheckFiles -a path="e:\\eas\server\batch\import\fwia\*.txt" max-dir-depth=0 MinCrit=0
but I wasn't sure how to create the check so that it only runs at 11:30pm. Is it better to check all the time and only send a notification at 11:30pm if no files are present? The directory is normally empty, but a job on a different server drops these 3 text files on this server after it completes. Today this is a manual task where people log in to check whether these files exist. We have Nagios, so I am trying to get Nagios to eliminate the manual checking. Makes sense, right?

2.) Check to make sure file e:\eas\server\log\alloc.log is not open for more than 5 minutes on server. Once that is tested they will up the time to around 2 hours.

I looked around for a "file open" plugin but didn't find anything that fit the bill. I didn't know if check_nrpe had anything that could help, or if anyone else is doing something similar. I am trying to make sure the application didn't hang, and my idea is to watch the log file, since the application writes to it while processing the files. This is a long process but shouldn't take longer than 2 hours; I've heard it has taken a little over an hour to process everything. So the service check would initially alert if the log file has been in use (open or being written to) for 5 minutes, and later I would change that to 120 minutes or a little less, sending out the notification if the file is still in use at that point.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: File check assistance

Post by jdalrymple »

Part 1) Scheduled recurring downtime?

Part 2) The only 'lsof' I'm aware of for Windows is Handle. Maybe write a PowerShell script that matches the string "No matching handles found."?

Code: Select all

D:\Users\jdalrymple\Downloads\Handle>Handle.exe c:\users\jdalrymple\NTUSER.DAT

Handle v4.0
Copyright (C) 1997-2014 Mark Russinovich
Sysinternals - www.sysinternals.com

System             pid: 4      type: File           304: C:\Users\jdalrymple\NTUSER.DAT{77a2c7ec-26f0-11e5-80da-e41d2d741090}.TxR.blf
System             pid: 4      type: File           524: C:\Users\jdalrymple\NTUSER.DAT
System             pid: 4      type: File           590: C:\Users\jdalrymple\NTUSER.DAT{77a2c7ed-26f0-11e5-80da-e41d2d741090}.TM.blf
System             pid: 4      type: File          11B4: C:\Users\jdalrymple\ntuser.dat.LOG2
System             pid: 4      type: File          1498: C:\Users\jdalrymple\NTUSER.DAT{77a2c7ec-26f0-11e5-80da-e41d2d741090}.TxR.0.regtrans-ms
System             pid: 4      type: File          1600: C:\Users\jdalrymple\NTUSER.DAT{77a2c7ed-26f0-11e5-80da-e41d2d741090}.TMContainer00000000000000000002.regtrans-ms
System             pid: 4      type: File          164C: C:\Users\jdalrymple\NTUSER.DAT{77a2c7ec-26f0-11e5-80da-e41d2d741090}.TxR.1.regtrans-ms
System             pid: 4      type: File          1688: C:\Users\jdalrymple\NTUSER.DAT{77a2c7ec-26f0-11e5-80da-e41d2d741090}.TxR.2.regtrans-ms
System             pid: 4      type: File          16C0: C:\Users\jdalrymple\ntuser.dat.LOG1
System             pid: 4      type: File          1788: C:\Users\jdalrymple\NTUSER.DAT{77a2c7ed-26f0-11e5-80da-e41d2d741090}.TMContainer00000000000000000001.regtrans-ms

D:\Users\jdalrymple\Downloads\Handle>Handle.exe "c:\users\jdalrymple\desktop\Clipboard01.jpg"

Handle v4.0
Copyright (C) 1997-2014 Mark Russinovich
Sysinternals - www.sysinternals.com

No matching handles found.
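
As a rough illustration of the string-matching idea, here is a minimal sketch in Python rather than PowerShell (a deliberate swap, not what was suggested above). It assumes handle.exe is on the PATH of the monitored server and that the script is exposed through the agent as an external script; the function names and the choice of CRITICAL for an open file are hypothetical.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# The string Handle prints when nothing has the file open (see output above).
NO_HANDLES_MARKER = "No matching handles found."


def evaluate(handle_output, path):
    """Map Handle's text output to a (exit_code, message) pair."""
    if not handle_output.strip():
        # No output at all: Handle probably did not run, so do not guess.
        return UNKNOWN, f"UNKNOWN - no output from Handle for {path}"
    if NO_HANDLES_MARKER in handle_output:
        return OK, f"OK - {path} is not open"
    # Anything else is treated as "some process has the file open".
    return CRITICAL, f"CRITICAL - {path} is open"


def check_open_file(path, handle_exe="handle.exe"):
    """Run Handle against path; handle_exe location is an assumption."""
    try:
        result = subprocess.run(
            [handle_exe, path], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return UNKNOWN, f"UNKNOWN - could not run {handle_exe}: {exc}"
    return evaluate(result.stdout, path)
```

A wrapper would print the message and exit with the code. The "more than 5 minutes" part could be left to Nagios itself: with a 1-minute retry interval and 5 max check attempts, the service only reaches a hard state after the file has been seen open on five consecutive checks.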
jkinning
Posts: 747
Joined: Wed Oct 09, 2013 2:54 pm

Re: File check assistance

Post by jkinning »

1.) I just received some additional information on this one. These 3 files need to exist, and they are wondering if Nagios can check for them and send out a notification if they are not in the directory by 11:30pm. I was using *.txt, but there will be several .txt files in that directory. These 3 are the critical ones, and if they are not there then notifications need to be sent.

Does that help?
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: File check assistance

Post by jdalrymple »

If it's "around 11:30" I'd still run with a scheduled downtime.
If it's "exactly 11:30" you'll likely have to do a passive check.

This:
https://exchange.nagios.org/directory/P ... nt/details
And this:
http://docs.nsclient.org/tutorial/nagios/nsca.html

Incidentally, if it was Linux I'd use this instead of a full blown agent:
https://exchange.nagios.org/directory/A ... er/details
I've never seen something similar for Windows.
jkinning
Posts: 747
Joined: Wed Oct 09, 2013 2:54 pm

Re: File check assistance

Post by jkinning »

I am not really following you, at least I don't think so.

Are you saying to use the checks

Code: Select all

./check_nrpe -H eas1t -t 30 -c CheckFiles -a path="e:\\eas\server\batch\import\fwia\pamalm.txt" max-dir-depth=0 MinCrit=0
./check_nrpe -H eas1t -t 30 -c CheckFiles -a path="e:\\eas\server\batch\import\fwia\pamhold.txt" max-dir-depth=0 MinCrit=0
./check_nrpe -H eas1t -t 30 -c CheckFiles -a path="e:\\eas\server\batch\import\fwia\pamsec.txt" max-dir-depth=0 MinCrit=0
and just have the notifications enabled using a timeperiod of 11:30pm?
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: File check assistance

Post by tmcdonald »

An active check may or may not fall on a specific schedule. If it has a check interval of 5 minutes, that might turn into 5 minutes and 15 seconds, or 4 minutes and 45 seconds, depending on how loaded the system is. This compounds over time, so even though it is running *about* every 5 minutes, it might not run every time the minute hand on your wall clock is over a number.

Alternatively, a passive check runs only when you tell it to (or when it is triggered, in the case of an SNMP trap). This is a better choice if you need it to run exactly at 11:30.
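
To make the passive route a bit more concrete, here is a small sketch of what a submitted result looks like. It builds a PROCESS_SERVICE_CHECK_RESULT external command line, which is the format Nagios accepts for passive service results. The service description, the helper names, and the command file path are assumptions for illustration; something scheduled for exactly 11:30 (cron, Task Scheduler) would have to produce the result, and how it reaches the Nagios server (for example NSCA, as linked earlier in the thread) depends on the setup.

```python
import time


def passive_result_command(host, service, return_code, output, now=None):
    """Format one passive service check result as an external command line.

    Layout: [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}"
    )


def submit(command_line, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the command to the Nagios command file (path is an assumption)."""
    with open(command_file, "w") as handle:
        handle.write(command_line + "\n")


# Example: report a missing file for a hypothetical service on host eas1t.
line = passive_result_command(
    "eas1t", "FWIA import files", 2, "CRITICAL - pamsec.txt missing"
)
```

The host and service named in the command must already exist in the configuration with passive checks enabled, otherwise Nagios ignores the result.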
Former Nagios employee
jkinning
Posts: 747
Joined: Wed Oct 09, 2013 2:54 pm

Re: File check assistance

Post by jkinning »

I have had some additional discussions and meetings, and the reason the 11:30 time came about is that this is when the files usually are, or should be, there. If not, there is a problem.

So, I am now wondering if there is a way to check that all three files exist within one check and create a timeperiod for 11:30pm - 11:50pm.

Code: Select all

./check_nrpe -H eas1t -t 30 -c CheckFiles -a path="e:\\eas\server\batch\import\fwia\pamalm.txt" max-dir-depth=0 MinCrit=0
./check_nrpe -H eas1t -t 30 -c CheckFiles -a path="e:\\eas\server\batch\import\fwia\pamhold.txt" max-dir-depth=0 MinCrit=0
./check_nrpe -H eas1t -t 30 -c CheckFiles -a path="e:\\eas\server\batch\import\fwia\pamsec.txt" max-dir-depth=0 MinCrit=0
When all three files are there per the service check the notification will be sent out. This would eliminate the manual task they are currently doing by logging into the VPN each night and checking to see if the 3 files are present or not.
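
One way to fold the three checks into a single one would be a small custom script that the agent runs on the server. The sketch below is only an illustration of that idea, not an existing plugin: the function name is hypothetical, it assumes Python is available on the monitored server, and it follows the earlier requirement of alerting when any of the three files is missing.

```python
import os

# The three critical files named in the request.
REQUIRED_FILES = ("pamalm.txt", "pamhold.txt", "pamsec.txt")


def check_required_files(directory, required=REQUIRED_FILES):
    """Return a Nagios-style (exit_code, message) for the required files."""
    missing = [
        name
        for name in required
        if not os.path.isfile(os.path.join(directory, name))
    ]
    if missing:
        # Exit code 2 is CRITICAL in the Nagios plugin convention.
        return 2, "CRITICAL - missing: " + ", ".join(missing)
    return 0, f"OK - all {len(required)} files present"
```

Other .txt files in the directory are ignored, since only the three listed names are tested. A wrapper would call it with the import directory, print the message, and exit with the returned code.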
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: File check assistance

Post by lmiltchev »

You can easily create a custom timeperiod under the CCM and add it to your services. As for the three checks - you could create a new BPI group, add the three checks to it, and set your warning & critical health thresholds to 100%. If any of the checks fails, the group's health would change to "Critical". Then, you could run the BPI wizard against the BPI group, and start monitoring the group's health. Hope this helps.
Be sure to check out our Knowledgebase for helpful articles and solutions!
jkinning
Posts: 747
Joined: Wed Oct 09, 2013 2:54 pm

Re: File check assistance

Post by jkinning »

I am giving the BPI method a try since I was unable to figure out anything easier. Maybe this is the easier way and I am just unfamiliar with it.

So, I created the BPI group with the three services and set the thresholds to 100% like you mentioned, and I did get a notification. There are a couple of items I'm not sure about. The IP is 127.0.0.1: should I change that to the actual host on which the files should be present, or is that normal for the BPI group? Also, since the three services have a notification time of 23:29-23:31 M-F, do I also need to use that timeperiod for the BPI group?

Again, I am just trying to get a notification from Nagios around 11:30pm to let me know that these three files are present, eliminating today's manual process where people log in and verify.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: File check assistance

Post by lmiltchev »

The host's IP should be 127.0.0.1 as this is a "dummy" host, so no, you don't need to change it. If you need to be notified only during a custom timeperiod, you will need to modify the notification period on your service (CCM->Services-><your service>->Alert Settings). Can you show us the service's config?

CCM->Services->select the "BPI config" from the "Filter by Config Name" drop-down menu, click on the "View Text Config" (the diskette icon), and copy/paste the config.