"Cannot find file" error on host

tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: "Cannot find file" error on host

Post by tmcdonald »

What do you actually have for "PATH TO LIBRARY" in the perl code? That error output makes me think Nagios can't actually exec your plugin, or that one of the things the plugin relies on cannot be found.
Former Nagios employee
eloyd
Cool Title Here
Posts: 2129
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: "Cannot find file" error on host

Post by eloyd »

I looked at the Perl and it looks fine, but I just realized something:

Code: Select all

$ftp->get($getFile);
This MAY be trying to write to a place you're not expecting, since you only set $getFile to be "ftptest" without being in a specific directory. I would recommend "cd /tmp" before getting the file from FTP, to make sure that your running process has the ability to write to the current directory.
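One way to act on this inside the script, rather than with a shell "cd", is to choose the download directory explicitly and confirm it is writable before calling get. This is only a minimal sketch: pick_download_dir is a hypothetical helper (not part of Net::FTP), and the choice of candidate directories is an assumption.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;

# Hypothetical helper: return the first candidate directory that exists
# and is writable by the user running the script, or undef if none is.
sub pick_download_dir {
    my @candidates = @_;
    for my $dir (@candidates) {
        return $dir if defined $dir && -d $dir && -w _;
    }
    return;
}

# Prefer the current directory, fall back to the system temp directory.
my $dir = pick_download_dir(getcwd(), File::Spec->tmpdir);
if (defined $dir) {
    my $local = File::Spec->catfile($dir, 'ftptest');
    print "would download to $local\n";
    # With a live connection this would be: $ftp->get('ftptest', $local)
}
else {
    print "no writable directory found\n";
}
```

Running this as the nagios user (for example through "su nagios -c") shows where the file would land for that user, which may differ from what root sees.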
Eric Loyd • http://everwatch.global • 844.240.EVER • @EricLoyd • I'm a Nagios Fanatic!
logic_bomb421
Posts: 43
Joined: Tue Jul 15, 2014 6:58 pm

Re: "Cannot find file" error on host

Post by logic_bomb421 »

sreinhardt wrote: Do you output a message to standard out in your plugin? (Sorry, I didn't read through it all, as I am by no means a Perl guy.) It seems to me that the error is one of two things:

- more file permission issues with the FTP script (not likely, as you can run it via su - nagios)
- the plugin does not output to stdout, which Nagios expects, or it will give a very similar message

It's possible I don't. I'm not sure, to be honest. In my if statements, I've written "exit 0;" for OK and "exit 2;" for critical.

Here is the part of my script that should be picked up by Nagios:

Code: Select all

#Nagios logic
if ($fileOut == $date) {
   print "OK - FTP Services Working\n";
   exit 0; #Nagios OK return code
}
else {
   print "CRITICAL - FTP services degraded\n";
   exit 2; #Nagios CRITICAL return code
}
When I run this via the terminal, I do see "OK - FTP Services Working" right after I hit enter.
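For reference, the pairing of a one-line message on stdout with an exit code can be sketched as below. The numeric codes are the standard Nagios plugin values; status_line and nagios_exit are hypothetical helper names used only for illustration, not something from the original script.

```perl
use strict;
use warnings;

# Standard Nagios plugin exit codes.
my %STATUS = (OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3);

# Hypothetical helper: build the single status line Nagios reads from STDOUT.
sub status_line {
    my ($state, $message) = @_;
    die "unknown state '$state'\n" unless exists $STATUS{$state};
    return "$state - $message\n";
}

# Hypothetical helper: print the status line and exit with the matching code.
sub nagios_exit {
    my ($state, $message) = @_;
    print status_line($state, $message);
    exit $STATUS{$state};
}

# Show what would be reported, without exiting.
print status_line('OK', 'FTP Services Working');
print "exit code for CRITICAL: $STATUS{CRITICAL}\n";
```

In a real plugin the last two lines would be replaced by a single call such as nagios_exit('OK', 'FTP Services Working').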
tmcdonald wrote:What do you actually have for "PATH TO LIBRARY" in the perl code? That error output makes me think Nagios can't actually exec your plugin, or that one of the things the plugin relies on cannot be found.
It's a path to the Try::Tiny module. It isn't actually used; it's left over from another approach I was trying earlier.
eloyd wrote:I looked at the Perl and it looks fine, but I just realized something:

Code: Select all

$ftp->get($getFile);
This MAY be trying to write to a place you're not expecting, since you only set $getFile to be "ftptest" without being in a specific directory. I would recommend "cd /tmp" before getting the file from FTP, to make sure that your running process has the ability to write to the current directory.
Apparently, when you don't specify a local path for the retrieved file, it just gets dropped into the root directory. So it's placing ftptest at /ftptest.
eloyd
Cool Title Here
Posts: 2129
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: "Cannot find file" error on host

Post by eloyd »

I really think you're looking at a problem with trying to drop the file in a place that you don't have privileges to do so. Remember - you're doing your testing as the root user but your script runs as the nagios user. You're assuming where the file will be written, and it's possible that it can't be written by nagios.

Try changing your script so that the FTP GET operation is performed AFTER cd'ing to /tmp and see if that makes a difference.
logic_bomb421
Posts: 43
Joined: Tue Jul 15, 2014 6:58 pm

Re: "Cannot find file" error on host

Post by logic_bomb421 »

eloyd wrote:I really think you're looking at a problem with trying to drop the file in a place that you don't have privileges to do so. Remember - you're doing your testing as the root user but your script runs as the nagios user. You're assuming where the file will be written, and it's possible that it can't be written by nagios.

Try changing your script so that the FTP GET operation is performed AFTER cd'ing to /tmp and see if that makes a difference.
This is what I've done now to specify where the received file is written:

Code: Select all

$ftp->get($getFile, "/usr/local/nagios/Misc/test/ftptest");
and for the reading part:

Code: Select all

open FILE, "/usr/local/nagios/Misc/test/ftptest" or die $!;
Even after specifying the full paths, I still get the same error. Again, running this with "su nagios -c "[...]"" returns "OK - FTP Services Working", and writes and reads the files just fine.

Someone mentioned that I may not be outputting anything to stdout, and that if not, you get an error very similar to this. Does that sound like it could be an issue based on my script?
eloyd
Cool Title Here
Posts: 2129
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: "Cannot find file" error on host

Post by eloyd »

Not to me, it doesn't. At this point, I think you need to turn on Nagios log file debugging and see what the exact command and output are that show up in the Nagios log.
logic_bomb421
Posts: 43
Joined: Tue Jul 15, 2014 6:58 pm

Re: "Cannot find file" error on host

Post by logic_bomb421 »

eloyd wrote:Not to me, it doesn't. At this point, I think you need to turn on Nagios log file debugging and see what the exact command and output are that show up in the Nagios log.
Okay, so I enabled debugging with a debug level of -1 and a verbosity of 2, and this is what I get for my FTP check:

Code: Select all

EVENT_HOST_CHECK, Run Time: Fri Aug 22 15:25:00 2014
[1408746301.669831] [008.0] [pid=2966] ** Host Check Event ==> Host: 'FTP Health Check', Options: 1, Latency: 0.000057 sec
[1408746301.669841] [001.0] [pid=2966] run_scheduled_host_check()
[1408746301.669845] [016.0] [pid=2966] Attempting to run scheduled check of host 'FTP Health Check': check options=1, latency=0.000057
[1408746301.669850] [001.0] [pid=2966] run_async_host_check(FTP Health Check ...)
[1408746301.669855] [016.0] [pid=2966] ** Running async check of host 'FTP Health Check'...
[1408746301.669859] [016.0] [pid=2966] Host 'FTP Health Check' passed first hurdle (caching/execution)
[1408746301.669872] [001.0] [pid=2966] check_host_check_viability()
[1408746301.669878] [064.1] [pid=2966] Making callbacks (type 7)...
[1408746301.669883] [016.0] [pid=2966] Checking host 'FTP Health Check'...
[1408746301.669889] [001.0] [pid=2966] adjust_host_check_attempt()
[1408746301.669893] [016.2] [pid=2966] Adjusting check attempt number for host 'FTP Health Check': current attempt=2/2, state=1, state type=1
[1408746301.669897] [016.2] [pid=2966] New check attempt number = 1
[1408746301.669904] [001.0] [pid=2966] get_raw_command_line_r()
[1408746301.669908] [2320.2] [pid=2966] Raw Command Input: $USER1$/ftp.pl
[1408746301.669913] [2320.2] [pid=2966] Expanded Command Output: $USER1$/ftp.pl
[1408746301.669918] [001.0] [pid=2966] process_macros_r()
[1408746301.669922] [2048.1] [pid=2966] **** BEGIN MACRO PROCESSING ***********
[1408746301.669926] [2048.1] [pid=2966] Processing: '$USER1$/ftp.pl'
[1408746301.669931] [2048.2] [pid=2966]   Processing part: ''
[1408746301.669935] [2048.2] [pid=2966]   Not currently in macro.  Running output (0): ''
[1408746301.669940] [2048.2] [pid=2966]   Processing part: 'USER1'
[1408746301.669945] [2048.2] [pid=2966]   Processed 'USER1', Free: 0
[1408746301.669949] [2048.2] [pid=2966]   Processed 'USER1', Free: 0,  Cleaning options: 3
[1408746301.669954] [2048.2] [pid=2966]   Uncleaned macro.  Running output (25): '/usr/local/nagios/libexec'
[1408746301.669958] [2048.2] [pid=2966]   Just finished macro.  Running output (25): '/usr/local/nagios/libexec'
[1408746301.669962] [2048.2] [pid=2966]   Processing part: '/ftp.pl'
[1408746301.669967] [2048.2] [pid=2966]   Not currently in macro.  Running output (32): '/usr/local/nagios/libexec/ftp.pl'
[1408746301.669971] [2048.1] [pid=2966]   Done.  Final output: '/usr/local/nagios/libexec/ftp.pl'
[1408746301.669975] [2048.1] [pid=2966] **** END MACRO PROCESSING *************
[1408746301.669981] [064.1] [pid=2966] Making callbacks (type 7)...
[1408746301.669989] [001.0] [pid=2966] macros_to_kvv()
[1408746301.670000] [001.0] [pid=2966] clear_volatile_macros_r()
[1408746301.670006] [001.0] [pid=2966] handle_timed_event() end
[1408746301.670010] [064.1] [pid=2966] Making callbacks (type 1)...
[1408746301.670015] [008.1] [pid=2966] ** Event Check Loop
[1408746301.670023] [008.1] [pid=2966] Next Event Time: Fri Aug 22 15:25:02 2014
[1408746301.670027] [008.1] [pid=2966] Current/Max Service Checks: 0/0 (-nan% saturation)
[1408746301.670033] [12288.1] [pid=2966] ## Polling 1020ms; sockets=6; events=209; iobs=0x1b73140
[1408746301.670426] [016.2] [pid=2966] Processing check result for host 'FTP Health Check'
[1408746301.670436] [001.0] [pid=2966] handle_async_host_check_result(FTP Health Check ...)
[1408746301.670441] [016.1] [pid=2966] ** Handling async check result for host 'FTP Health Check' from 'Core Worker 2970'...
[1408746301.670445] [016.2] [pid=2966] 	Check Type:         Active
[1408746301.670449] [016.2] [pid=2966] 	Check Options:      1
[1408746301.670453] [016.2] [pid=2966] 	Scheduled Check?:   Yes
[1408746301.670457] [016.2] [pid=2966] 	Reschedule Check?:  Yes
[1408746301.670461] [016.2] [pid=2966] 	Exited OK?:         Yes
[1408746301.670465] [016.2] [pid=2966] 	Exec Time:          0.000
[1408746301.670471] [016.2] [pid=2966] 	Latency:            0.000
[1408746301.670476] [016.2] [pid=2966] 	Return Status:      2
[1408746301.670480] [016.2] [pid=2966] 	Output:             (No output on stdout) stderr: execvp(/usr/local/nagios/libexec/ftp.pl, ...) failed. errno is 2: No such file or directory

[1408746301.670487] [016.2] [pid=2966] Parsing check output...
[1408746301.670491] [016.2] [pid=2966] Short Output: (No output on stdout) stderr: execvp(/usr/local/nagios/libexec/ftp.pl, ...) failed. errno is 2: No such file or directory
[1408746301.670495] [016.2] [pid=2966] Long Output:  NULL
[1408746301.670500] [016.2] [pid=2966] Perf Data:    NULL
[1408746301.670503] [001.0] [pid=2966] get_host_check_return_code()
[1408746301.670508] [001.0] [pid=2966] process_host_check_result()
[1408746301.670511] [016.1] [pid=2966] HOST: FTP Health Check, ATTEMPT=1/2, CHECK TYPE=ACTIVE, STATE TYPE=HARD, OLD STATE=1, NEW STATE=1
[1408746301.670520] [016.1] [pid=2966] Host was DOWN.
[1408746301.670525] [016.1] [pid=2966] Host is still DOWN.
[1408746301.670530] [001.0] [pid=2966] determine_host_reachability(host=FTP Health Check)
[1408746301.670534] [016.2] [pid=2966] Determining state of host 'FTP Health Check': current state=1 (DOWN)
[1408746301.670538] [016.2] [pid=2966] Host has no parents, so it is DOWN.
[1408746301.670542] [016.1] [pid=2966] Pre-handle_host_state() Host: FTP Health Check, Attempt=1/2, Type=HARD, Final State=1 (DOWN)
[1408746301.670547] [001.0] [pid=2966] handle_host_state()
[1408746301.670553] [001.0] [pid=2966] obsessive_compulsive_host_check_processor()
[1408746301.670563] [032.0] [pid=2966] ** Host Notification Attempt ** Host: 'FTP Health Check', Type: NORMAL, Options: 0, Current State: 1, Last Notification: Wed Aug 20 17:06:56 2014
[1408746301.670570] [001.0] [pid=2966] check_host_notification_viability()
[1408746301.670576] [001.0] [pid=2966] check_time_against_period()
[1408746301.670582] [001.0] [pid=2966] _get_matching_timerange()
[1408746301.670588] [032.1] [pid=2966] Notifications are temporarily disabled for this host, so we won't send one out.
[1408746301.670594] [032.0] [pid=2966] Notification viability test failed.  No notification will be sent out.
[1408746301.670599] [016.1] [pid=2966] Post-handle_host_state() Host: FTP Health Check, Attempt=1/2, Type=HARD, Final State=1 (DOWN)
[1408746301.670603] [001.0] [pid=2966] check_for_host_flapping()
[1408746301.670607] [016.1] [pid=2966] Checking host 'FTP Health Check' for flapping...
[1408746301.670612] [016.2] [pid=2966] LFT=5.00, HFT=20.00, CPC=0.00, PSC=0.00%
[1408746301.670619] [016.1] [pid=2966] Host is not flapping (0.00% state change).
[1408746301.670626] [016.1] [pid=2966] Rescheduling next check of host at Fri Aug 22 15:26:01 2014
[1408746301.670631] [001.0] [pid=2966] get_next_valid_time()
[1408746301.670636] [001.0] [pid=2966] _get_matching_timerange()
[1408746301.670643] [001.0] [pid=2966] schedule_host_check()
[1408746301.670649] [016.0] [pid=2966] Scheduling a non-forced, active check of host 'FTP Health Check' @ Fri Aug 22 15:26:01 2014
[1408746301.670654] [016.2] [pid=2966] Scheduling new host check event.
[1408746301.670658] [001.0] [pid=2966] add_event()
[1408746301.670664] [064.1] [pid=2966] Making callbacks (type 12)...
[1408746301.670669] [064.1] [pid=2966] Making callbacks (type 12)...
[1408746301.670674] [016.1] [pid=2966] ** Async check result for host 'FTP Health Check' handled: new state=1
I don't understand why it doesn't work. It seems like, from reading this, it should?
eloyd
Cool Title Here
Posts: 2129
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: "Cannot find file" error on host

Post by eloyd »

So you're getting a critical (Return Status: 2) but not the output. Here's my suggestion:

Get to the core of what the check is supposed to verify, and then figure out a different way to test it. You may want to write a shell script instead of a Perl script, for instance, just to make sure you're not running into Perl errors.

I'm honestly out of ideas here.
millisa
Posts: 69
Joined: Thu Jan 16, 2014 11:13 pm
Location: Austin, TX

Re: "Cannot find file" error on host

Post by millisa »

I'll take a swing since I have an unhealthy relationship with perl...

The perl script has a few issues:

Code: Select all

#!usr/bin/perl
Missing leading slash there. Should probably be:

Code: Select all

#!/usr/bin/perl
The \n here probably should be removed:

Code: Select all

open (FILE, ">$putFile\n");
I'd suggest adding an 'or die' too:

Code: Select all

open (FILE,">$putFile") or die "I dont wanna make a file called $putFile\n";

Code: Select all

$ftp->cwd($dir) or die "Can't connect to $dir\n";
I doubt this is the issue, but you may want to explicitly define the full path on the FTP server (and change the error from 'Can't connect to $dir' to 'cannot change dir to $dir on remote ftp' or similar).
I will assume the dir 'Test' exists in the home directory your ftp server dumps you in when logging in as $user.

This:

Code: Select all

if ($fileOut == $date) {
This isn't quite right. You are using == for a string comparison; Perl won't complain unless warnings are on, but == compares numerically, so only the leading digits of each date string (the month) actually get compared, and the test passes even when the rest of the timestamp differs. You should either use 'eq' as your operator or convert the date strings back (I like to just convert to epoch seconds so I can use numeric comparison). I doubt this is related to your current issue, since the comparison still evaluates as true.
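To make the difference concrete, here is a small sketch comparing two timestamps in the script's format. The two sample values are made up for illustration; Time::Piece is the same core module the script already loads.

```perl
use strict;
use warnings;
use Time::Piece;

my $written  = '08/24/2014 02:09';   # what the script wrote
my $returned = '08/25/2014 11:30';   # a different timestamp, same month

# == compares numerically: both strings numify to 8 (the leading "08").
my $numeric_equal = do {
    no warnings 'numeric';           # silence "isn't numeric" warnings for the demo
    $written == $returned ? 1 : 0;
};

# eq compares the strings character by character.
my $string_equal = $written eq $returned ? 1 : 0;

# Alternative: parse both into epoch seconds and compare numerically.
my $fmt = '%m/%d/%Y %H:%M';
my $epoch_equal = Time::Piece->strptime($written, $fmt)->epoch
               == Time::Piece->strptime($returned, $fmt)->epoch ? 1 : 0;

print "==    says equal: $numeric_equal\n";
print "eq    says equal: $string_equal\n";
print "epoch says equal: $epoch_equal\n";
```

Only the == line reports the two timestamps as equal, which means a check written that way could report OK even when the FTP server returned a stale file from earlier in the same month.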


Mostly you need to be handling the get, puts and change of working directories on the FTP side with 'or die' bits so you can get some output when it craters. I tested with the script supplied at the end of this post as the nagios user and I'm getting your expected results. You may want to temporarily enable the nagios user's shell so you can just be nagios when testing:

Code: Select all

chsh nagios
Set it to /bin/bash (assuming you like /bin/bash), it'll look like this:

Code: Select all

[root@yourserver somedirectory]# chsh nagios
Changing shell for nagios.
New shell [/sbin/nologin]: /bin/bash
Shell changed.
When you're done testing, put it back the same way, but set the shell to /sbin/nologin (really, put it back to nologin):

Code: Select all

[root@yourserver somedirectory]# chsh nagios
Changing shell for nagios.
New shell [/bin/bash]: /sbin/nologin
Shell changed.
Changing the shell for the nagios user would let you then just

Code: Select all

su nagios
to become the nagios user and test your scripts and write permissions. (You get the same information using the 'su -c' method, which is safer; this just might make a permission/ownership issue easier to spot when testing.)

I used this to headcheck the script run, confirm that the date was changing, and the modtime/ownerships of the file were correct:

Code: Select all

/usr/local/nagios/libexec/ftp.pl;cat /usr/local/nagios/Misc/test/ftptest;ls -aslht /usr/local/nagios/Misc/test/ftptest

Code: Select all

bash-4.1$ /usr/local/nagios/libexec/ftp.pl;cat /usr/local/nagios/Misc/test/ftptest;ls -aslht /usr/local/nagios/Misc/test/ftptest
OK - FTP Services Working
08/24/2014 02:09
4.0K -rw-r--r-- 1 nagios nagios 17 Aug 24 02:09 /usr/local/nagios/Misc/test/ftptest
Note the 'nagios nagios' in the ownership: if your file has 'root' there, you've likely got a file left over from one of the test runs you did as 'root' earlier. You may want to try removing /usr/local/nagios/Misc/test/ftptest and then try running the script as the nagios user to see if the file gets written.
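The same ownership check can be done from Perl if you would rather have the script notice a leftover file itself. A minimal sketch, assuming a Unix-like host; file_owner is a hypothetical helper, and the scratch file stands in for the real download path.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the login name that owns a file,
# falling back to the numeric uid if the name cannot be resolved.
# Returns undef if the file cannot be stat'ed.
sub file_owner {
    my ($path) = @_;
    my @st = stat($path) or return;
    my $uid  = $st[4];
    my $name = getpwuid($uid);
    return defined $name ? $name : $uid;
}

# Demonstrate on a scratch file; on the Nagios host you would pass the
# path of the downloaded file instead.
my ($fh, $scratch) = tempfile(UNLINK => 1);
my $owner = file_owner($scratch);
my $me    = getpwuid($>) // $>;

print "file owner: $owner, running as: $me\n";
print "ownership mismatch, the file may be left over from another user's run\n"
    if $owner ne $me;
```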

Consider not using 'check_ftp' as the name for your custom ftp command (it's one of the samples I see in my conf files). I'd expect you'd be seeing a conflict with the commands when you do your nagios config check if this is your issue. Just for reference, this is the nagios command for check_ftp that I usually see defined:

Code: Select all

# 'check_ftp' command definition
define command{
        command_name    check_ftp
        command_line    $USER1$/check_ftp -H $HOSTADDRESS$ $ARG1$
        }



Copy of your script with a few updates that worked for me (and a few suggestions in line):

Code: Select all

#!/usr/bin/perl

use Net::FTP;
use Time::Piece;

$host = "myhost";
$user = "myuser";
$pw =  "cpass";
$dir = "Test"; #ftpserver remotedir
$getFile = "ftptest";
$getFilelocal = "/usr/local/nagios/Misc/test/ftptest";
$putFile = "/usr/local/nagios/Misc/ftptest";
$date = localtime->strftime('%m/%d/%Y %H:%M');

#Writes current date to file for nagios checking
open (FILE, ">$putFile") or die "I dont wanna make a file called $putFile\n";
print FILE "$date\n";  #if you kill the \n here, you don't need to chomp when you read the file later
close (FILE);

#Connects to FTP directory
$ftp = Net::FTP->new($host) or die "Can't open $host\n";
$ftp->login($user, $pw) or die "Can't login with $user\n";
$ftp->cwd($dir) or die "Can't changedir to $dir on $host\n"; #make this more accurate

#Sends to directory, gets from directory
$ftp->put($putFile) or die "Cannot put file $putFile on $host ",$ftp->message; # homework - change the 'die' messages to 'warn' and modify your exit to '3' so nagios goes unknown
$ftp->get($getFile, "$getFilelocal") or die "Cannot get file $getFile on $host to $getFilelocal ", $ftp->message;

#Reads date from file to make sure it matches $date
open FILE, "$getFilelocal" or die $!;
while(<FILE>){
   chomp;
   $fileOut = $_;
}

#Nagios logic  
# - uses 'eq' because this is a string comparison (converting both dates to epoch seconds and comparing numerically would also work)
if ($fileOut eq $date) {
   print "OK - FTP Services Working\n";
   exit 0; #Nagios OK return code
}
else {
   print "CRITICAL - FTP services degraded\n";
   exit 2; #Nagios CRITICAL return code
}
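
The "homework" comment in the script can be approached in more than one way. One possible sketch is to wrap each FTP step so that a die becomes an UNKNOWN status line on stdout instead of a message on stderr. try_step is a hypothetical helper, and the failing step below is a stand-in rather than a real FTP call.

```perl
use strict;
use warnings;

# Hypothetical helper: run one step; return undef on success, or an
# UNKNOWN status line built from the die message on failure.
sub try_step {
    my ($description, $code) = @_;
    return undef if eval { $code->(); 1 };
    my $err = $@ || 'unknown error';
    chomp $err;
    return "UNKNOWN - $description failed: $err\n";
}

# Stand-in for an FTP call that fails; a real step would wrap something
# like: $ftp->cwd($dir) or die "Can't changedir to $dir\n"
my $failure = try_step('change directory', sub { die "Can't changedir to Test\n" });
my $success = try_step('no-op', sub { 1 });

print $failure if defined $failure;
print "no-op step succeeded\n" unless defined $success;
# In the plugin, a defined result would be printed and followed by "exit 3".
```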


Edit1: Believe it or not, some rambling removed.
Edit2: A few typos in varnames
emislivec
Posts: 52
Joined: Tue Feb 25, 2014 10:06 am

Re: "Cannot find file" error on host

Post by emislivec »

millisa wrote:I'll take a swing since I have an unhealthy relationship with perl...

The perl script has a few issues:

Code: Select all

#!usr/bin/perl
Missing leading slash there. Should probably be:

Code: Select all

#!/usr/bin/perl
This would account for the error from the log:

Code: Select all

execvp(/usr/local/nagios/libexec/ftp.pl, ...) failed. errno is 2: No such file or directory
This would happen if /usr/local/nagios/libexec/ftp.pl doesn't exist, or the interpreter given in the #! line doesn't exist.
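A quick way to check the second possibility is to read the #! line and look at the interpreter path it names. The sketch below does that against a scratch file carrying the broken line from this thread. shebang_interpreter and interpreter_ok are hypothetical helpers, and the note about relative paths being resolved against the current directory is an assumption about typical Linux behaviour.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the interpreter path from a script's #! line,
# or undef if the file cannot be read or has no #! line.
sub shebang_interpreter {
    my ($path) = @_;
    open my $fh, '<', $path or return;
    my $first = <$fh>;
    close $fh;
    return unless defined $first && $first =~ m{^#!\s*(\S+)};
    return $1;
}

# Hypothetical helper: true if the named interpreter is an executable file
# as seen from the current directory (a relative path such as
# "usr/bin/perl" is assumed to be resolved against it).
sub interpreter_ok {
    my ($interp) = @_;
    return defined $interp && -f $interp && -x _ ? 1 : 0;
}

# Scratch script that carries the broken line from the thread.
my ($fh, $script) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} "#!usr/bin/perl\nprint \"OK\\n\";\n";
close $fh;

my $interp = shebang_interpreter($script);
print "interpreter: $interp\n";
print "starts with a slash: ", ($interp =~ m{^/} ? "yes" : "no"), "\n";
print "usable from this directory: ", (interpreter_ok($interp) ? "yes" : "no"), "\n";
```

Pointing shebang_interpreter at /usr/local/nagios/libexec/ftp.pl on the Nagios host would show whether the path it names is absolute.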

(Edit: Also thanks for the thorough posting millisa.)