Plugin that detects network overload
Any ideas for a network interface traffic monitoring plugin that could detect when a NIC is overloaded,
that is, when too many network-heavy programs are running? I guess that implies a known time window over which the traffic
is averaged, so that short, irregular peaks don't cause notifications.
I've been checking out some plugins, but none of them mention time periods.
If a graph template can be found too, that would be really cool: after an alarm, you could have a look at the load statistics.
-fno-stack-protector
Re: Plugin that detects network overload
I don't think anything like this currently exists. You might check out bischeck, as it does store data from repeat checks and compares the differences between them, warning on too high of a delta between checks. Otherwise I think this is something that is a great idea, but would likely need to be developed.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Re: Plugin that detects network overload
There is a small problem with using SNMP for bandwidth monitoring: you only get an average of the bandwidth used over the polling interval, since you typically poll the device only once every 5 minutes.
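To illustrate the averaging problem: an SNMP-based check can only derive a rate from the difference between two counter readings, so a short burst is spread over the whole polling interval. The following is a minimal sketch, assuming 32-bit ifInOctets-style counters; the function name avg_bps and the sample numbers are made up for illustration.

```shell
#!/bin/bash
# Average rate in bits/s between two octet-counter samples.
# Args: previous counter value, current counter value, seconds between the polls.
# Assumption: a 32-bit counter that wrapped at most once between the polls.
avg_bps() {
    local prev=$1 cur=$2 secs=$3
    local delta=$(( cur - prev ))
    # a smaller current value is treated as a single 32-bit counter wrap
    if (( delta < 0 )); then
        delta=$(( delta + 4294967296 ))
    fi
    echo $(( delta * 8 / secs ))
}

# A 10-second burst at 1 Gbit/s (1250000000 bytes) inside a 300-second poll
# shows up as only about 33 Mbit/s, roughly 3% of a 1G link.
avg_bps 0 1250000000 300   # prints 33333333
```

With a polling interval this long, a link that was saturated for a few seconds still looks almost idle, which is why a collector with a shorter step is attractive here.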
If your switches/devices support it, it would be better to use sFlow to collect your bandwidth statistics. Normally this is done with an sFlow collector such as sflowtool (http://www.inmon.com/bin/sflowtool-3.32.tar.gz), a daemon that collects sFlow bandwidth data.
With the following script you can put this data into RRD files:
Code: Select all
#! /usr/bin/perl
# Copyright (c) 2001 InMon Corp. Licensed under the terms of the InMon sFlow licence:
# http://www.inmon.com/technology/sflowlicense.txt
#makes things work when run without install
use lib qw( ../perl-shared/blib/lib ../perl-shared/blib/arch );
#makes program work AFTER install
use lib qw( /usr/local/rrdtool-1.0.33/lib/perl ../lib/perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/auto/RRDs );
use vars qw(@ISA $loaded);
use RRDs;
$dataDir="/var/spool/sflow/rrd";
$dataDir2="/var/spool/sflow/discards";
while(<>) {
($key, $value) = split /[\t =]+/, $_;
chomp $key;
chomp $value;
if($key eq "unixSecondsUTC") {$t = $value;}
if($key eq "agent") {$agent = $value;}
if($key eq "ifIndex") {$ifIndex = $value;}
if($key eq "ifSpeed") {$ifSpeed = $value;}
if($key eq "ifInOctets") {$ifInOctets = $value;}
if($key eq "ifOutOctets") {
$ifOutOctets = $value;
$rrd = "$dataDir/$agent-$ifIndex.rrd";
if(! -e $rrd) {
RRDs::create ($rrd, "--start",$t-1, "--step",20,
"DS:bytesIn:COUNTER:120:0:10000000000",
"DS:bytesOut:COUNTER:120:0:10000000000",
"RRA:MIN:0.5:1:44640",
"RRA:MAX:0.5:1:44640",
"RRA:AVERAGE:0.5:3:525600");
$ERROR = RRDs::error;
die "$0: unable to create `$rrd': $ERROR\n" if $ERROR;
}
# string comparison: != would compare the addresses numerically
if ($agent ne "0.0.0.0"){
RRDs::update $rrd, "$t:$ifInOctets:$ifOutOctets";
}
if ($ERROR = RRDs::error) {
die "$0: unable to update `$rrd': $ERROR\n";
}
}
if($key eq "unixSecondsUTC") {$t = $value;}
if($key eq "agent") {$agent = $value;}
if($key eq "ifIndex") {$ifIndex = $value;}
if($key eq "ifInErrors") {$ifInErrors = $value;}
if($key eq "ifOutErrors") {$ifOutErrors = $value;}
if($key eq "ifInDiscards") {$ifInDiscards = $value;}
if($key eq "ifOutDiscards") {
$ifOutDiscards = $value;
$rrd2 = "$dataDir2/$agent-$ifIndex-discards.rrd";
if(! -e $rrd2) {
RRDs::create ($rrd2, "--start",$t-1, "--step",20,
"DS:DicardsIn:COUNTER:120:0:10000000",
"DS:ErrorsIn:COUNTER:120:0:10000000",
"DS:DicardsOut:COUNTER:120:0:10000000",
"DS:ErrorsOut:COUNTER:120:0:10000000",
"RRA:AVERAGE:0.5:3:576");
$ERROR = RRDs::error;
die "$0: unable to create `$rrd2': $ERROR\n" if $ERROR;
}
# string comparison: != would compare the addresses numerically
if ($agent ne "0.0.0.0"){
RRDs::update $rrd2, "$t:$ifInDiscards:$ifInErrors:$ifOutDiscards:$ifOutErrors";
}
if ($ERROR = RRDs::error) {
die "$0: unable to update `$rrd2': $ERROR\n";
}
}
}
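To turn the RRD files written by the script above into a graph, something along these lines might work. This is only a sketch: it assumes the data source names bytesIn and bytesOut created above, and the function name graph_args, the colours, and the example paths are hypothetical.

```shell
#!/bin/bash
# Print the arguments for `rrdtool graph`, one per line, for a single interface RRD.
# Args: RRD file, output PNG, time window (default 1d).
# Assumption: the RRD has the DS names bytesIn/bytesOut, stored in bytes per second.
graph_args() {
    local rrd=$1 png=$2 window=${3:-1d}
    printf '%s\n' \
        "$png" \
        --start "-$window" \
        --vertical-label "bits/s" \
        "DEF:in=$rrd:bytesIn:AVERAGE" \
        "DEF:out=$rrd:bytesOut:AVERAGE" \
        "CDEF:inbits=in,8,*" \
        "CDEF:outbits=out,8,*" \
        "LINE1:inbits#00AA00:RX" \
        "LINE1:outbits#0000CC:TX"
}

# Usage (hypothetical agent address and ifIndex):
#   mapfile -t args < <(graph_args /var/spool/sflow/rrd/10.0.0.1-3.rrd /tmp/10.0.0.1-3.png 6h)
#   rrdtool graph "${args[@]}"
```

Building the argument list separately keeps the bytes-to-bits conversion (the CDEF lines) in one place and makes it easy to inspect before rrdtool is run.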
Then you would have your graphs.
The following script can be used to find the utilization of the interfaces.
Code: Select all
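#!/bin/bash
# let a failing rrdtool fetch be detected through the pipe to tail
set -o pipefail
# Nagios plugin exit codes
OK=0
WARNING=1
CRITICAL=2
UNKNOWN=3
# usage errors are reported as UNKNOWN
ERROR=$UNKNOWN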
ip=$1
port=$2
lw=$3
c=$4
w=$5
# the thresholds must be read before they are checked: critical requires warning
if [ -z "$ip" -o -z "$port" ] || [ -n "$c" -a -z "$w" ]
then
echo usage: "$0 <ipaddress> <port number> [<linewidth> [<critical> <warning>]]"
exit $ERROR
fi
c=${c:-30}
w=${w:-25}
lw=${lw:=1G}
dir=/var/spool/sflow/rrd
case $lw in
1M)
lw=1000000
;;
10M)
lw=10000000
;;
100M)
lw=100000000
;;
1G)
lw=1000000000
;;
10G)
lw=10000000000
;;
100G)
lw=100000000000
;;
*)
echo "linewidth value incorrect, must be 1M, 10M, 100M, 1G, 10G or 100G"
exit $ERROR
;;
esac
file=$dir/$ip-$port.rrd
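# fetch the 1-minute averages between 6 and 2 minutes ago and keep the last 5 rows
if ! data=$(rrdtool fetch "$file" AVERAGE -s -360 -e -120 | tail -5)
then
echo "Could not fetch rrd data from $file"
# UNKNOWN is 3 in the Nagios plugin convention
exit ${UNKNOWN:-3}
fi
set -- $data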
while [ -n "$1" ]
do
t=$1
in=$2
out=$3
shift 3
# l=${l#*: }
# in=${l%% *}
ine=${in##*e+}
inm=${in%%e*}
inm=$(echo "$inm * 10 ^ $ine" | bc)
in=${inm%%.*}
(( sin += in ))
oute=${out##*e+}
outm=${out%%e*}
outm=$(echo "$outm * 10 ^ $oute" | bc)
out=${outm%%.*}
(( sout += out ))
#echo $sin $sout
done
#echo $sin $sout
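# average over the 5 samples, converted from bytes/s to bits/s
((ain = sin * 8 / 5 ))
((aout = sout * 8 / 5 ))
# utilization as a percentage of the line speed
((pin = ain * 100 / lw))
((pout = aout * 100 / lw))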
if [ $pin -ge $c -o $pout -ge $c ]
then
echo "CRITICAL: RX: $pin% TX: $pout%"
exit $CRITICAL
fi
if [ $pin -ge $w -o $pout -ge $w ]
then
echo "WARNING: RX: $pin% TX: $pout%"
exit $WARNING
fi
echo "OK: RX: $pin% TX: $pout%"
exit $OK
In this script you can adjust the time window over which the averages are calculated.
I think I could turn this into a nice project on http://exchange.nagios.org/
Rob Hassing
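The averaging step of the utilization script could also be written with awk, which reads the exponent notation printed by rrdtool fetch directly and can skip rows that contain nan. This is a sketch under assumptions, not a drop-in replacement: the function name utilization is hypothetical, and the input is assumed to be rrdtool fetch output with bytesIn and bytesOut as the two value columns.

```shell
#!/bin/bash
# Read `rrdtool fetch` output on stdin and print "RX% TX%" for a link speed in bits/s.
# Rows without data (nan or -nan) are skipped instead of breaking the arithmetic.
# Returns non-zero when no usable row was found.
utilization() {
    local speed_bps=$1
    awk -v speed="$speed_bps" '
        # data rows look like "1400000000: 1.25e+07 5.0e+07"
        /^[0-9]+:/ && tolower($2) !~ /nan/ && tolower($3) !~ /nan/ {
            rx += $2; tx += $3; n++
        }
        END {
            if (n == 0) exit 1
            # mean bytes/s -> bits/s -> percentage of the line speed
            printf "%d %d\n", rx / n * 8 * 100 / speed, tx / n * 8 * 100 / speed
        }'
}

# Usage with an adjustable window (the variable name window is an assumption):
#   window=600
#   rrdtool fetch "$file" AVERAGE -s "-$((window + 120))" -e -120 | utilization 1000000000
```

Because the mean is taken over however many valid rows arrive, the window can be changed without touching the divisor.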
Re: Plugin that detects network overload
@turboscrew: I think rhassing has the best suggestion so far. Let us know what you think and we can move forward with more specific assistance.
Former Nagios employee