Unix Log File Monitoring using Nagios XI
Posted: Sun Nov 03, 2019 8:59 pm
by Srinija544
Moderator Note: Content removed per request
Re: Unix Log File Monitoring using Nagios XI
Posted: Mon Nov 04, 2019 8:30 am
by scottwilkerson
We do have a whole program dedicated to this functionality, Nagios Log Server:
https://www.nagios.com/products/nagios-log-server/
However, if you are having a problem using the plugin you mentioned, can you give an example of the command you are using and the problem you are seeing?
Re: Unix Log File Monitoring using Nagios XI
Posted: Mon Nov 04, 2019 8:37 pm
by Srinija544
Hi scottwilkerson,
Thank you for your response.
Please find the command which we are using to get this done:
command[check_file_content_dc1uat163]=/usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -i "ERROR|WARN|FATAL"
Output:
OK for /apps/wildfly-11.0.0/standalone/log/server.log.2019-11-04 (3581 found)
This is not appropriate for our requirement.
We need to get a critical alert when we see "ERROR|WARN|FATAL" in our Log File.
Please let me know if any further details are required in solving this issue.
Regards,
Srinija.
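For reference, the backtick expression in the command above is expanded by the shell before the plugin starts, so -f receives a path ending in yesterday's date. A minimal sketch of that expansion, assuming GNU date (the --date option is not available in every date implementation); the variable names are illustrative only:

```shell
#!/bin/sh
# Build the path of yesterday's rotated WildFly log, the same way the NRPE command does.
# Assumes GNU date. LOG_DIR, YESTERDAY and LOG_FILE are illustrative names.
LOG_DIR=/apps/wildfly-11.0.0/standalone/log
YESTERDAY=$(date --date='yesterday' +"%Y-%m-%d")
LOG_FILE="$LOG_DIR/server.log.$YESTERDAY"
echo "$LOG_FILE"
```

Running this by hand on the monitored host is a quick way to confirm that the file the check will read actually exists.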
Re: Unix Log File Monitoring using Nagios XI
Posted: Mon Nov 04, 2019 8:44 pm
by scottwilkerson
Srinija544 wrote:We need to get a critical alert when we see "ERROR|WARN|FATAL" in our Log File.
Then I believe you want to change the -i to -e
command[check_file_content_dc1uat163]=/usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -e "ERROR|WARN|FATAL"
Then restart NRPE so the change takes effect.
Re: Unix Log File Monitoring using Nagios XI
Posted: Wed Nov 06, 2019 9:14 pm
by Srinija544
Hi scottwilkerson,
I have replaced the command as mentioned.
Command:
command[check_file_content_dc1uat163]=/usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -e "ERROR|WARN|FATAL"
We are getting this alert:
UNKNOWN:
Usage : check_file_content.pl -f file -i include -e exclude -n lines_number [-h]
Options :
-f
Full path to file to analyze
-n
Number of lines to find (default is 1)
-i
Include pattern (can add multiple include)
-e
Exclude pattern (can add multiple include)
-h, --help
Print this help screen
Example : check_file_content.pl -f /etc/passwd -i 0 -e root -n 5
Please check and let me know for any further information.
Regards,
Srinija.
Re: Unix Log File Monitoring using Nagios XI
Posted: Thu Nov 07, 2019 7:35 am
by scottwilkerson
Can you run this from the CLI on the remote machine?
/usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -e "ERROR|WARN|FATAL"
If that does not work, let's try:
/usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -e "ERROR" -e "WARN" -e "FATAL"
Re: Unix Log File Monitoring using Nagios XI
Posted: Thu Nov 07, 2019 8:41 pm
by Srinija544
Please see the output after running two commands:
=====================================================================================================================
[root@dc1uat163 ~]# /usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -e "ERROR|WARN|FATAL"
Usage : check_file_content.pl -f file -i include -e exclude -n lines_number [-h]
Options :
-f
Full path to file to analyze
-n
Number of lines to find (default is 1)
-i
Include pattern (can add multiple include)
-e
Exclude pattern (can add multiple include)
-h, --help
Print this help screen
Example : check_file_content.pl -f /etc/passwd -i 0 -e root -n 5
[root@dc1uat163 ~]# /usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -e "ERROR" -e "WARN" -e "FATAL"
Usage : check_file_content.pl -f file -i include -e exclude -n lines_number [-h]
Options :
-f
Full path to file to analyze
-n
Number of lines to find (default is 1)
-i
Include pattern (can add multiple include)
-e
Exclude pattern (can add multiple include)
-h, --help
Print this help screen
Example : check_file_content.pl -f /etc/passwd -i 0 -e root -n 5
[root@dc1uat163 ~]#
Re: Unix Log File Monitoring using Nagios XI
Posted: Fri Nov 08, 2019 9:41 am
by scottwilkerson
I just read the code, and it doesn't look like it does what you want. However, if you make the following change to the plugin, it should give you the desired results.
change this line
to this
then run it as you had it originally:
/usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -i "ERROR|WARN|FATAL"
Re: Unix Log File Monitoring using Nagios XI
Posted: Sun Nov 10, 2019 8:17 pm
by Srinija544
I have changed the script as mentioned, but we have a problem when I run this command:
Command:
===============================
/usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -i "ERROR|WARN|FATAL"
======================================
We are getting this output:
====================================
[root@dc1uat163 log]# /usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -i "ERROR|WARN|FATAL"
FAILED on /apps/wildfly-11.0.0/standalone/log/server.log.2019-11-10. Found only 270 on 1
====================================================
This means 270 lines matched ERROR, FATAL, or WARN combined. When I check each pattern separately, the alert should be OK if that pattern is not found in the file, but this is not happening. Please see the outputs below:
===================================================
If I use only WARN as the pattern:
output:
[root@dc1uat163 log]# /usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -i "WARN"
FAILED on /apps/wildfly-11.0.0/standalone/log/server.log.2019-11-10. Found only 270 on 1
But for the other two patterns:
[root@dc1uat163 log]# /usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -i "FATAL"
FAILED on /apps/wildfly-11.0.0/standalone/log/server.log.2019-11-10
==============================================================
[root@dc1uat163 log]# /usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -i "ERROR"
FAILED on /apps/wildfly-11.0.0/standalone/log/server.log.2019-11-10
======================================================================================
If the mentioned patterns are not found in the file, the alert should be in the OK state, but we are getting CRITICAL.
Please help me in correcting this issue.
Regards,
Srinija.
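The behaviour asked for here (OK when none of the patterns occurs, CRITICAL when at least one does) can be sketched independently of the plugin. The following is only an illustration using grep, not the code of check_file_content.pl. The function name check_log_patterns is made up for this example, and the exit codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN):

```shell
#!/bin/sh
# Hypothetical helper: print a Nagios-style status line and return the matching exit code.
# $1 = log file, $2 = extended regular expression to look for
check_log_patterns() {
    file=$1
    pattern=$2
    if [ ! -r "$file" ]; then
        echo "UNKNOWN: cannot read $file"
        return 3
    fi
    # grep -c prints the number of matching lines; it exits 1 when that number is 0,
    # so ignore its exit status and rely on the printed count instead.
    count=$(grep -cE -- "$pattern" "$file" || true)
    if [ "$count" -gt 0 ]; then
        echo "CRITICAL: $count matching lines in $file"
        return 2
    fi
    echo "OK: no matching lines in $file"
    return 0
}
```

It would be called as check_log_patterns /path/to/server.log 'ERROR|WARN|FATAL', with the path replaced by the real log file.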
Re: Unix Log File Monitoring using Nagios XI
Posted: Mon Nov 11, 2019 8:25 am
by scottwilkerson
This is not our plugin, so help from the author may be better, but to try to assist I rewrote a portion of the file. You can try this:
#!/usr/bin/perl
#===============================================================================
#
# FILE: check_file_content.pl
#
# USAGE: ./check_file_content.pl
#
# DESCRIPTION: Nagios plugin to check file content
#
# OPTIONS: ---
# REQUIREMENTS: ---
# BUGS: ---
# NOTES: ---
# AUTHOR: Pierre Mavro (), [email protected]
# COMPANY:
# VERSION: 0.1
# CREATED: 10/05/2010 09:25:56
# REVISION: ---
#===============================================================================
use warnings;
use strict;
use Getopt::Long;

# Nagios plugin exit codes
my %RETCODES = ('OK' => 0, 'WARNING' => 1, 'CRITICAL' => 2, 'UNKNOWN' => 3);

# Help
sub help
{
    print "Usage : check_file_content.pl -f file -i include -e exclude -n lines_number [-h]\n\n";
    print "Options :\n";
    print " -f\n\tFull path to file to analyze\n";
    print " -n\n\tNumber of lines to find (default is 1)\n";
    print " -i\n\tInclude pattern (can add multiple include)\n";
    print " -e\n\tExclude pattern (can add multiple include)\n";
    print " -h, --help\n\tPrint this help screen\n";
    print "\nExample : check_file_content.pl -f /etc/passwd -i 0 -e root -n 5\n";
    exit $RETCODES{"UNKNOWN"};
}

sub check_args
{
    # Newer Perl versions reject defined(@ARGV) as a fatal error, so test the array itself
    help() unless @ARGV;

    my ($file,@include,@exclude);
    my $num=1;

    # Set options
    GetOptions( "help|h" => \&help,
                "f=s"    => \$file,
                "i=s"    => \@include,
                "e=s"    => \@exclude,
                "n=i"    => \$num);

    # -f and at least one -i are required; -e on its own prints the usage text
    unless (($file) and (@include))
    {
        help();
    }
    else
    {
        check_soft($file,$num,\@include,\@exclude);
    }
}

sub check_soft
{
    my ($file, $num, $ref_include, $ref_exclude) = @_;
    my @include = @$ref_include;
    my @exclude = @$ref_exclude;
    my $i=0;

    # Three-argument open with a lexical file handle
    my $fh;
    if (!open($fh, '<', $file))
    {
        print "Can't open $file: $!\n";
        exit $RETCODES{"CRITICAL"};
    }

    while (my $line = <$fh>)
    {
        chomp($line);
        my $found=0;

        # Should match
        foreach my $pattern (@include)
        {
            if ($line =~ /$pattern/)
            {
                $found=1;
                last;
            }
        }

        # Shouldn't match
        foreach my $pattern (@exclude)
        {
            if ($line =~ /$pattern/)
            {
                $found=0;
                last;
            }
        }

        # Count lines that match an include pattern and no exclude pattern
        $i++ if ($found == 1);
    }
    close($fh);

    if ($i > 0)
    {
        # With the default -n 1, any matching line ends up in the CRITICAL branch
        if ($i < $num)
        {
            print "OK for $file ($i found)\n";
            exit $RETCODES{"OK"};
        }
        else
        {
            print "FAILED on $file. Found only $i on $num\n";
            exit $RETCODES{"CRITICAL"};
        }
    }
    else
    {
        # No matching lines at all
        print "OK on $file\n";
        exit $RETCODES{"OK"};
    }
}

check_args;
And I believe that with multiple patterns you need to run it like this:
/usr/local/nagios/libexec/check_file_content.pl -f /apps/wildfly-11.0.0/standalone/log/server.log.`date --date='yesterday' +"%Y-%m-%d"` -i "ERROR" -i "WARN" -i "FATAL"
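Because the plugin code posted above tests each line against every -i value as a regular expression, a single alternation and several separate patterns should select the same lines. One quick way to check that idea is to use grep as a stand-in for the plugin's matching. This assumes that grep -E alternation behaves like Perl alternation for these simple literal patterns. The sample log lines are made up:

```shell
#!/bin/sh
# Compare one alternation pattern with three separate patterns on a sample log.
sample=$(mktemp)
printf '%s\n' \
    '2019-11-10 10:00:01 INFO  started' \
    '2019-11-10 10:00:02 WARN  slow response' \
    '2019-11-10 10:00:03 ERROR request failed' \
    '2019-11-10 10:00:04 FATAL shutting down' > "$sample"

# One extended regular expression with alternation
single=$(grep -cE 'ERROR|WARN|FATAL' "$sample")
# Three separate patterns; a line counts if it matches any of them
multiple=$(grep -c -e 'ERROR' -e 'WARN' -e 'FATAL' "$sample")

echo "single pattern: $single, separate patterns: $multiple"
rm -f "$sample"
```

Both counts come out as 3 for this sample, which is consistent with the earlier run in this thread where "ERROR|WARN|FATAL" and "WARN" alone each reported 270 lines.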