"Cannot find file" error on host
Re: "Cannot find file" error on host
What do you actually have for "PATH TO LIBRARY" in the perl code? That error output makes me think Nagios can't actually exec your plugin, or that one of the things the plugin relies on cannot be found.
Former Nagios employee
Re: "Cannot find file" error on host
I looked at the Perl and it looks fine, but I just realized something:
This MAY be trying to write to a place you're not expecting, since you only set $getFile to be "ftptest" without being in a specific directory. I would recommend "cd /tmp" before getting the file from FTP, to make sure that your running process has the ability to write to the current directory.
Code: Select all
$ftp->get($getFile);
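One way to act on this suggestion is to give the download an explicit local destination, so the result no longer depends on whatever directory the process starts in. The following is a minimal sketch, not the poster's code: `local_target` is a hypothetical helper, and `/tmp` is only an assumed scratch directory. Net::FTP's `get` accepts an optional second argument naming the local file.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: build an absolute local path for a download so the
# result does not depend on the process's current working directory.
sub local_target {
    my ($dir, $name) = @_;
    return File::Spec->catfile($dir, $name);
}

my $getFile = "ftptest";                       # remote file name from the thread
my $local   = local_target("/tmp", $getFile);  # /tmp is an assumed scratch directory

print "Would download $getFile to $local\n";

# With a connected Net::FTP object this would become:
#   $ftp->get($getFile, $local) or die "get failed: ", $ftp->message;
# or, following the suggestion literally:
#   chdir "/tmp" or die "Cannot chdir to /tmp: $!\n";
#   $ftp->get($getFile) or die "get failed: ", $ftp->message;
```

Either form removes the guesswork about where the file ends up; the explicit path is arguably easier to read back later in the script.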
Re: "Cannot find file" error on host
sreinhardt wrote: Do you output a message to standard out in your plugin? (Sorry, I didn't read through it all, as I am by no means a perl guy.) It seems to me that the error is one of two things:
more file permissions issues with the ftp script (not likely, as you can run it via su - nagios)
the plugin does not output to stdout, which nagios expects, or it will give a very similar message.
It's possible I don't. I'm not sure, to be honest. In my if statements, I've written "exit 0;" for OK and "exit 2;" for critical.
Here is the part of my script that should be picked up by Nagios:
Code: Select all
#Nagios logic
if ($fileOut == $date) {
print "OK - FTP Services Working\n";
exit 0; #Nagios OK return code
}
else {
print "CRITICAL - FTP services degraded\n";
exit 2; #Nagios CRITICAL return code
}
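For reference, a plugin is expected to print one status line to stdout and then exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The sketch below shows that pairing in one place. `plugin_result` is a hypothetical helper invented for illustration, and the demo deliberately does not call `exit` so it can be run and inspected safely.

```perl
use strict;
use warnings;

# Standard Nagios plugin return codes.
my %CODE = (OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3);

# Hypothetical helper: returns the exit code and the single status line
# that Nagios reads from stdout. Unrecognized states fall back to UNKNOWN.
sub plugin_result {
    my ($state, $message) = @_;
    $state = 'UNKNOWN' unless exists $CODE{$state};
    return ($CODE{$state}, "$state - $message\n");
}

my ($code, $line) = plugin_result('CRITICAL', 'FTP services degraded');
print $line;                         # goes to stdout, where Nagios looks for it
print "exit code would be $code\n";
# A real plugin would finish with: exit $code;
```

Keeping the text and the code together makes it harder to print "OK" while exiting with 2, or the reverse.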
tmcdonald wrote: What do you actually have for "PATH TO LIBRARY" in the perl code? That error output makes me think Nagios can't actually exec your plugin, or that one of the things the plugin relies on cannot be found.
It's a path to the Try::Tiny module. It's not actually used; it was left over from another way I was trying to get this to work earlier.
eloyd wrote: I looked at the Perl and it looks fine, but I just realized something:
Code: Select all
$ftp->get($getFile);
This MAY be trying to write to a place you're not expecting, since you only set $getFile to be "ftptest" without being in a specific directory. I would recommend "cd /tmp" before getting the file from FTP, to make sure that your running process has the ability to write to the current directory.
Apparently when you don't specify a directory to write the getfile to, it just drops it into the root directory. So it's placing ftproot in /ftproot.
Re: "Cannot find file" error on host
I really think you're looking at a problem with trying to drop the file in a place that you don't have privileges to do so. Remember - you're doing your testing as the root user but your script runs as the nagios user. You're assuming where the file will be written, and it's possible that it can't be written by nagios.
Try changing your script so that the FTP GET operation is performed AFTER cd'ing to /tmp and see if that makes a difference.
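To see the root-versus-nagios difference directly, a small script can report which user it is running as and whether that user can write to the directories in question. This is only a sketch: `can_write_dir` is a hypothetical helper and the directory list is an assumption taken from paths mentioned in the thread. It is only informative when run as the nagios user, because root passes the write test almost everywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $dir exists and the effective user can write to it.
sub can_write_dir {
    my ($dir) = @_;
    return (-d $dir && -w $dir) ? 1 : 0;
}

# Name of the effective user, falling back to the numeric uid.
my $who = getpwuid($>) // $>;

# Assumed candidate directories; adjust to wherever the script writes.
for my $dir ("/", "/tmp", "/usr/local/nagios/Misc/test") {
    printf "%-30s %s for %s\n", $dir,
        can_write_dir($dir) ? "writable" : "NOT writable", $who;
}
```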
Re: "Cannot find file" error on host
eloyd wrote: I really think you're looking at a problem with trying to drop the file in a place that you don't have privileges to do so. Remember - you're doing your testing as the root user but your script runs as the nagios user. You're assuming where the file will be written, and it's possible that it can't be written by nagios.
Try changing your script so that the FTP GET operation is performed AFTER cd'ing to /tmp and see if that makes a difference.
This is what I've done now to specify the directory to look for the received file:
Code: Select all
$ftp->get($getFile, "/usr/local/nagios/Misc/test/ftptest");
Code: Select all
open FILE, "/usr/local/nagios/Misc/test/ftptest" or die $!;
Someone mentioned that I may not be outputting anything to stdout, and that if not, you get an error very similar to this. Does that sound like it could be an issue based on my script?
Re: "Cannot find file" error on host
Not to me, it doesn't. At this point, I think you need to turn on Nagios log file debugging and see what the exact command and output are that show up in the Nagios log.
Re: "Cannot find file" error on host
eloyd wrote: Not to me, it doesn't. At this point, I think you need to turn on Nagios log file debugging and see what the exact command and output are that show up in the Nagios log.
Okay, so I enabled debug logging with the debug level set to -1 and the verbosity (detail) set to 2, and this is what I get for my FTP check:
Code: Select all
EVENT_HOST_CHECK, Run Time: Fri Aug 22 15:25:00 2014
[1408746301.669831] [008.0] [pid=2966] ** Host Check Event ==> Host: 'FTP Health Check', Options: 1, Latency: 0.000057 sec
[1408746301.669841] [001.0] [pid=2966] run_scheduled_host_check()
[1408746301.669845] [016.0] [pid=2966] Attempting to run scheduled check of host 'FTP Health Check': check options=1, latency=0.000057
[1408746301.669850] [001.0] [pid=2966] run_async_host_check(FTP Health Check ...)
[1408746301.669855] [016.0] [pid=2966] ** Running async check of host 'FTP Health Check'...
[1408746301.669859] [016.0] [pid=2966] Host 'FTP Health Check' passed first hurdle (caching/execution)
[1408746301.669872] [001.0] [pid=2966] check_host_check_viability()
[1408746301.669878] [064.1] [pid=2966] Making callbacks (type 7)...
[1408746301.669883] [016.0] [pid=2966] Checking host 'FTP Health Check'...
[1408746301.669889] [001.0] [pid=2966] adjust_host_check_attempt()
[1408746301.669893] [016.2] [pid=2966] Adjusting check attempt number for host 'FTP Health Check': current attempt=2/2, state=1, state type=1
[1408746301.669897] [016.2] [pid=2966] New check attempt number = 1
[1408746301.669904] [001.0] [pid=2966] get_raw_command_line_r()
[1408746301.669908] [2320.2] [pid=2966] Raw Command Input: $USER1$/ftp.pl
[1408746301.669913] [2320.2] [pid=2966] Expanded Command Output: $USER1$/ftp.pl
[1408746301.669918] [001.0] [pid=2966] process_macros_r()
[1408746301.669922] [2048.1] [pid=2966] **** BEGIN MACRO PROCESSING ***********
[1408746301.669926] [2048.1] [pid=2966] Processing: '$USER1$/ftp.pl'
[1408746301.669931] [2048.2] [pid=2966] Processing part: ''
[1408746301.669935] [2048.2] [pid=2966] Not currently in macro. Running output (0): ''
[1408746301.669940] [2048.2] [pid=2966] Processing part: 'USER1'
[1408746301.669945] [2048.2] [pid=2966] Processed 'USER1', Free: 0
[1408746301.669949] [2048.2] [pid=2966] Processed 'USER1', Free: 0, Cleaning options: 3
[1408746301.669954] [2048.2] [pid=2966] Uncleaned macro. Running output (25): '/usr/local/nagios/libexec'
[1408746301.669958] [2048.2] [pid=2966] Just finished macro. Running output (25): '/usr/local/nagios/libexec'
[1408746301.669962] [2048.2] [pid=2966] Processing part: '/ftp.pl'
[1408746301.669967] [2048.2] [pid=2966] Not currently in macro. Running output (32): '/usr/local/nagios/libexec/ftp.pl'
[1408746301.669971] [2048.1] [pid=2966] Done. Final output: '/usr/local/nagios/libexec/ftp.pl'
[1408746301.669975] [2048.1] [pid=2966] **** END MACRO PROCESSING *************
[1408746301.669981] [064.1] [pid=2966] Making callbacks (type 7)...
[1408746301.669989] [001.0] [pid=2966] macros_to_kvv()
[1408746301.670000] [001.0] [pid=2966] clear_volatile_macros_r()
[1408746301.670006] [001.0] [pid=2966] handle_timed_event() end
[1408746301.670010] [064.1] [pid=2966] Making callbacks (type 1)...
[1408746301.670015] [008.1] [pid=2966] ** Event Check Loop
[1408746301.670023] [008.1] [pid=2966] Next Event Time: Fri Aug 22 15:25:02 2014
[1408746301.670027] [008.1] [pid=2966] Current/Max Service Checks: 0/0 (-nan% saturation)
[1408746301.670033] [12288.1] [pid=2966] ## Polling 1020ms; sockets=6; events=209; iobs=0x1b73140
[1408746301.670426] [016.2] [pid=2966] Processing check result for host 'FTP Health Check'
[1408746301.670436] [001.0] [pid=2966] handle_async_host_check_result(FTP Health Check ...)
[1408746301.670441] [016.1] [pid=2966] ** Handling async check result for host 'FTP Health Check' from 'Core Worker 2970'...
[1408746301.670445] [016.2] [pid=2966] Check Type: Active
[1408746301.670449] [016.2] [pid=2966] Check Options: 1
[1408746301.670453] [016.2] [pid=2966] Scheduled Check?: Yes
[1408746301.670457] [016.2] [pid=2966] Reschedule Check?: Yes
[1408746301.670461] [016.2] [pid=2966] Exited OK?: Yes
[1408746301.670465] [016.2] [pid=2966] Exec Time: 0.000
[1408746301.670471] [016.2] [pid=2966] Latency: 0.000
[1408746301.670476] [016.2] [pid=2966] Return Status: 2
[1408746301.670480] [016.2] [pid=2966] Output: (No output on stdout) stderr: execvp(/usr/local/nagios/libexec/ftp.pl, ...) failed. errno is 2: No such file or directory
[1408746301.670487] [016.2] [pid=2966] Parsing check output...
[1408746301.670491] [016.2] [pid=2966] Short Output: (No output on stdout) stderr: execvp(/usr/local/nagios/libexec/ftp.pl, ...) failed. errno is 2: No such file or directory
[1408746301.670495] [016.2] [pid=2966] Long Output: NULL
[1408746301.670500] [016.2] [pid=2966] Perf Data: NULL
[1408746301.670503] [001.0] [pid=2966] get_host_check_return_code()
[1408746301.670508] [001.0] [pid=2966] process_host_check_result()
[1408746301.670511] [016.1] [pid=2966] HOST: FTP Health Check, ATTEMPT=1/2, CHECK TYPE=ACTIVE, STATE TYPE=HARD, OLD STATE=1, NEW STATE=1
[1408746301.670520] [016.1] [pid=2966] Host was DOWN.
[1408746301.670525] [016.1] [pid=2966] Host is still DOWN.
[1408746301.670530] [001.0] [pid=2966] determine_host_reachability(host=FTP Health Check)
[1408746301.670534] [016.2] [pid=2966] Determining state of host 'FTP Health Check': current state=1 (DOWN)
[1408746301.670538] [016.2] [pid=2966] Host has no parents, so it is DOWN.
[1408746301.670542] [016.1] [pid=2966] Pre-handle_host_state() Host: FTP Health Check, Attempt=1/2, Type=HARD, Final State=1 (DOWN)
[1408746301.670547] [001.0] [pid=2966] handle_host_state()
[1408746301.670553] [001.0] [pid=2966] obsessive_compulsive_host_check_processor()
[1408746301.670563] [032.0] [pid=2966] ** Host Notification Attempt ** Host: 'FTP Health Check', Type: NORMAL, Options: 0, Current State: 1, Last Notification: Wed Aug 20 17:06:56 2014
[1408746301.670570] [001.0] [pid=2966] check_host_notification_viability()
[1408746301.670576] [001.0] [pid=2966] check_time_against_period()
[1408746301.670582] [001.0] [pid=2966] _get_matching_timerange()
[1408746301.670588] [032.1] [pid=2966] Notifications are temporarily disabled for this host, so we won't send one out.
[1408746301.670594] [032.0] [pid=2966] Notification viability test failed. No notification will be sent out.
[1408746301.670599] [016.1] [pid=2966] Post-handle_host_state() Host: FTP Health Check, Attempt=1/2, Type=HARD, Final State=1 (DOWN)
[1408746301.670603] [001.0] [pid=2966] check_for_host_flapping()
[1408746301.670607] [016.1] [pid=2966] Checking host 'FTP Health Check' for flapping...
[1408746301.670612] [016.2] [pid=2966] LFT=5.00, HFT=20.00, CPC=0.00, PSC=0.00%
[1408746301.670619] [016.1] [pid=2966] Host is not flapping (0.00% state change).
[1408746301.670626] [016.1] [pid=2966] Rescheduling next check of host at Fri Aug 22 15:26:01 2014
[1408746301.670631] [001.0] [pid=2966] get_next_valid_time()
[1408746301.670636] [001.0] [pid=2966] _get_matching_timerange()
[1408746301.670643] [001.0] [pid=2966] schedule_host_check()
[1408746301.670649] [016.0] [pid=2966] Scheduling a non-forced, active check of host 'FTP Health Check' @ Fri Aug 22 15:26:01 2014
[1408746301.670654] [016.2] [pid=2966] Scheduling new host check event.
[1408746301.670658] [001.0] [pid=2966] add_event()
[1408746301.670664] [064.1] [pid=2966] Making callbacks (type 12)...
[1408746301.670669] [064.1] [pid=2966] Making callbacks (type 12)...
[1408746301.670674] [016.1] [pid=2966] ** Async check result for host 'FTP Health Check' handled: new state=1
Re: "Cannot find file" error on host
So you're getting a critical (Return Status: 2) but not the output. Here's my suggestion:
Get to the core of what you're trying to determine is working or not working by running the check, and then figure out a different way to do it. You may want to do a shell script instead of a perl script, for instance, just to make sure you're not running into perl errors.
I'm honestly out of ideas here.
Re: "Cannot find file" error on host
I'll take a swing since I have an unhealthy relationship with perl...
The perl script has a few issues:
Code: Select all
#!usr/bin/perl
Missing leading slash there. Should probably be:
Code: Select all
#!/usr/bin/perl
The \n here probably should be removed:
Code: Select all
open (FILE, ">$putFile\n");
I'd suggest adding an 'or die' too:
Code: Select all
open (FILE,">$putFile") or die "I dont wanna make a file called $putFile\n";
Code: Select all
$ftp->cwd($dir) or die "Can't connect to $dir\n";
I doubt this is the issue, but you may want to explicitly define the full path on the ftp server (and change the error from 'cant connect to $dir' to 'cannot change dir to $dir on remote ftp' or similar).
I will assume the dir 'Test' exists in the home directory your ftp server dumps you in when logging in as $user.
This:
Code: Select all
if ($fileOut == $date) {
I'm not sure this is entirely proper. You are using == for a string comparison; it will probably appear to work, because perl quietly converts both strings to numbers before comparing them. You should either use 'eq' as your operator or convert the date strings back to times (I like to just convert to epoch seconds so I can use numeric comparison). I doubt this is related to your current issue, since it is probably working anyway.
Mostly you need to be handling the gets, puts and changes of working directory on the FTP side with 'or die' bits so you can get some output when it craters. I tested with the script supplied at the end of this post as the nagios user and I'm getting your expected results. You may want to temporarily enable the nagios user's shell so you can just be nagios when testing:
Code: Select all
chsh nagios
Set it to /bin/bash (assuming you like /bin/bash); it'll look like this:
Code: Select all
[root@yourserver somedirectory]# chsh nagios
Changing shell for nagios.
New shell [/sbin/nologin]: /bin/bash
Shell changed.
When you're done testing, put it back the same way, but set the shell to /sbin/nologin (really, put it back to nologin):
Code: Select all
[root@yourserver somedirectory]# chsh nagios
Changing shell for nagios.
New shell [/bin/bash]: /sbin/nologin
Shell changed.
Changing the shell for the nagios user would let you then just
Code: Select all
su nagios
to become the nagios user to test your scripts and write permissions. (You get the same info using the 'su -c' method, which is safer; this just might make a permission/ownership issue easier to spot when testing.)
I used this to headcheck the script run, confirm that the date was changing, and the modtime/ownerships of the file were correct:
Code: Select all
/usr/local/nagios/libexec/ftp.pl;cat /usr/local/nagios/Misc/test/ftptest;ls -aslht /usr/local/nagios/Misc/test/ftptest
Code: Select all
bash-4.1$ /usr/local/nagios/libexec/ftp.pl;cat /usr/local/nagios/Misc/test/ftptest;ls -aslht /usr/local/nagios/Misc/test/ftptest
OK - FTP Services Working
08/24/2014 02:09
4.0K -rw-r--r-- 1 nagios nagios 17 Aug 24 02:09 /usr/local/nagios/Misc/test/ftptest
(Note the 'nagios nagios' in the ownership - if your file has 'root' there, you've likely got a file from one of the test runs you did as 'root' earlier.) You may want to try removing /usr/local/nagios/Misc/test/ftptest and then running the script as the nagios user to see if the file gets written.
Consider not using 'check_ftp' as the name for your custom ftp command (it's one of the samples I see in my conf files). I'd expect you'd be seeing a conflict with the commands when you do your nagios config check if this is your issue. Just for reference, this is the nagios command for check_ftp that I usually see defined:
Code: Select all
# 'check_ftp' command definition
define command{
command_name check_ftp
command_line $USER1$/check_ftp -H $HOSTADDRESS$ $ARG1$
}
Copy of your script with a few updates that worked for me (and a few suggestions in line):
Code: Select all
#!/usr/bin/perl
use Net::FTP;
use Time::Piece;
$host = "myhost";
$user = "myuser";
$pw = "cpass";
$dir = "Test"; #ftpserver remotedir
$getFile = "ftptest";
$getFilelocal = "/usr/local/nagios/Misc/test/ftptest";
$putFile = "/usr/local/nagios/Misc/ftptest";
$date = localtime->strftime('%m/%d/%Y %H:%M');
#Writes current date to file for nagios checking
open (FILE, ">$putFile") or die "I dont wanna make a file called $putFile\n";
print FILE "$date\n"; #if you kill the \n here, you don't need to chomp when you read the file later
close (FILE);
#Connects to FTP directory
$ftp = Net::FTP->new($host) or die "Can't open $host\n";
$ftp->login($user, $pw) or die "Can't login with $user\n";
$ftp->cwd($dir) or die "Can't changedir to $dir on $host\n"; #make this more accurate
#Sends to directory, gets from directory
$ftp->put($putFile) or die "Cannot put file $putFile on $host ",$ftp->message; # homework - change the 'die' messages to 'warn' and modify your exit to '3' so nagios goes unknown
$ftp->get($getFile, "$getFilelocal") or die "Cannot get file $getFile on $host to $getFilelocal", $ftp->message;
#Reads date from file to make sure it matches $date
open FILE, "$getFilelocal" or die $!;
while(<FILE>){
chomp;
$fileOut = $_;
}
#Nagios logic
# - this comparison should probably be 'eq' since this is a string comparison as currently written (better would be to convert the date to epoch)
if ($fileOut == $date) {
print "OK - FTP Services Working\n";
exit 0; #Nagios OK return code
}
else {
print "CRITICAL - FTP services degraded\n";
exit 2; #Nagios CRITICAL return code
}
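The comparison noted in the script's comment can be checked in isolation. In the sketch below the two timestamps are made-up sample values that differ by 66 minutes. With ==, Perl converts each string to a number using only its leading digits (here the month), so the two compare as equal; 'eq' and an epoch comparison both see the difference. `to_epoch` is a hypothetical helper built on Time::Piece's `strptime`.

```perl
use strict;
use warnings;
use Time::Piece;

my $written = "08/24/2014 02:09";
my $fetched = "08/24/2014 03:15";   # deliberately different, same month

# Numeric comparison: both strings are converted to the number 8.
my $numeric_equal = do { no warnings 'numeric'; $written == $fetched };

# String comparison: compares the text itself.
my $string_equal = $written eq $fetched;

# Epoch comparison: parse both timestamps, then compare seconds.
sub to_epoch {
    my ($stamp) = @_;
    return Time::Piece->strptime($stamp, '%m/%d/%Y %H:%M')->epoch;
}
my $diff = to_epoch($fetched) - to_epoch($written);

printf "== says equal: %s\n", $numeric_equal ? "yes" : "no";
printf "eq says equal: %s\n", $string_equal  ? "yes" : "no";
printf "epoch difference: %d seconds\n", $diff;
```

In practice this means the == version would report OK for any file written in the same month, which defeats the point of the round-trip check.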
Edit1: Believe it or not, some rambling removed.
Edit2: A few typos in varnames
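One possible take on the "homework" comment in the script: a plain die writes its message to stderr and exits with a code that is not one of the four Nagios states, so the message never reaches the status line. The sketch below wraps the check in eval and turns any die into an UNKNOWN result on stdout. `run_guarded` is a hypothetical helper, and the failing check is a stand-in for the real FTP steps, not the poster's code.

```perl
use strict;
use warnings;

# Hypothetical wrapper: run a check and convert any die() into an UNKNOWN
# result (code 3) with the message destined for stdout instead of stderr.
sub run_guarded {
    my ($check) = @_;
    my @result = eval { $check->() };
    if ($@) {
        my $error = $@;
        chomp $error;
        return (3, "UNKNOWN - $error\n");
    }
    return @result;
}

# Stand-in for the FTP steps; the die mimics a failed put.
my $failing_check = sub { die "Cannot put file ftptest on myhost\n" };

my ($code, $line) = run_guarded($failing_check);
print $line;
print "exit code would be $code\n";
# A real plugin would finish with: exit $code;
```

The check itself is expected to return a code and a status line on success, so the caller handles both outcomes the same way.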
Re: "Cannot find file" error on host
millisa wrote: I'll take a swing since I have an unhealthy relationship with perl...
The perl script has a few issues:
Code: Select all
#!usr/bin/perl
Missing leading slash there. Should probably be:
Code: Select all
#!/usr/bin/perl
This would account for the error from the log:
Code: Select all
execvp(/usr/local/nagios/libexec/ftp.pl, ...) failed. errno is 2: No such file or directory
(Edit: Also thanks for the thorough posting millisa.)
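The reason a bad first line produces "No such file or directory" for the script itself is that the kernel reports the missing interpreter as a failure of the exec call. A relative interpreter path such as usr/bin/perl is resolved against the current directory, which may be why the script appeared to work when run by hand from /; that part is a guess, not something the thread confirms. The sketch below reads a script's #! line and flags the problem. `shebang_interpreter` is a hypothetical helper, and the temporary file merely reproduces the typo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the interpreter named on a script's #! line,
# or undef if the first line is not a shebang.
sub shebang_interpreter {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    my $first = <$fh>;
    close $fh;
    return undef unless defined $first && $first =~ /^#!\s*(\S+)/;
    return $1;
}

# Demonstrate with a throwaway file that has the same typo as the thread's script.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "#!usr/bin/perl\nprint \"hello\\n\";\n";
close $tmp_fh;

my $interp = shebang_interpreter($tmp_name);
if ($interp !~ m{^/}) {
    print "Suspicious: '$interp' is a relative path\n";
} elsif (!-x $interp) {
    print "Interpreter '$interp' is missing or not executable\n";
} else {
    print "Interpreter '$interp' looks fine\n";
}
```

Pointing the helper at /usr/local/nagios/libexec/ftp.pl instead of the temporary file would run the same check against the real plugin.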