Plugin that detects network overload
Posted: Fri Aug 29, 2014 11:42 pm
by turboscrew
Any ideas for a network interface traffic monitoring plugin that could detect if a NIC is overloaded?
That is, too many heavy networking programs are running. I guess that implies some known time period over which the traffic
is averaged, so that short irregular peaks don't cause notifications.
I've been checking out some plugins, but none of them mention time periods.
If a graph template could be found too, that would be really cool: after an alarm, you could have a look at the load statistics.
Re: Plugin that detects network overload
Posted: Tue Sep 02, 2014 4:16 pm
by sreinhardt
I don't think anything like this currently exists. You might check out bischeck, as it does store data from repeat checks and compares the differences between them, warning on too high of a delta between checks. Otherwise I think this is something that is a great idea, but would likely need to be developed.
Re: Plugin that detects network overload
Posted: Tue Sep 02, 2014 4:19 pm
by tmcdonald
Re: Plugin that detects network overload
Posted: Thu Nov 06, 2014 10:44 am
by rhassing
There is a small problem with using SNMP for bandwidth monitoring: you only get an average of the bandwidth used, because you typically poll the device only once every 5 minutes.
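To illustrate why a polled counter only yields an average: the rate is the counter delta divided by the poll interval, so any burst inside the interval is smoothed away. A minimal shell sketch, where the helper name avg_bits_per_sec and the counter values are made up for illustration:

```shell
# avg_bits_per_sec PREV_OCTETS CURR_OCTETS INTERVAL_SECONDS
# Hypothetical helper: average rate in bits per second between two
# readings of an octet counter. Counter wrap is not handled.
avg_bits_per_sec() {
    echo $(( ($2 - $1) * 8 / $3 ))
}

# Made-up example: 3750000000 octets transferred during one 300 s poll.
avg_bits_per_sec 0 3750000000 300
```

This prints 100000000, i.e. 100 Mbit/s. A 1G link that was fully saturated for 30 seconds and idle for the other 270 would produce exactly this reading, so the overload would go unnoticed.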
If your switches/devices support it, it would be better to use sFlow to collect your bandwidth statistics. Normally this is done with an sFlow collector such as sflowtool (http://www.inmon.com/bin/sflowtool-3.32.tar.gz).
This is a daemon which collects sFlow bandwidth data.
With the following script you could put this info in RRD files:
#! /usr/bin/perl
# Copyright (c) 2001 InMon Corp. Licensed under the terms of the InMon sFlow licence:
# http://www.inmon.com/technology/sflowlicense.txt
#makes things work when run without install
use lib qw( ../perl-shared/blib/lib ../perl-shared/blib/arch );
#makes program work AFTER install
use lib qw( /usr/local/rrdtool-1.0.33/lib/perl ../lib/perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/auto/RRDs );
use vars qw(@ISA $loaded);
use RRDs;
$dataDir="/var/spool/sflow/rrd";
$dataDir2="/var/spool/sflow/discards";
while(<>) {
($key, $value) = split /[\t =]+/, $_;
chomp $key;
chomp $value;
if($key eq "unixSecondsUTC") {$t = $value;}
if($key eq "agent") {$agent = $value;}
if($key eq "ifIndex") {$ifIndex = $value;}
if($key eq "ifSpeed") {$ifSpeed = $value;}
if($key eq "ifInOctets") {$ifInOctets = $value;}
if($key eq "ifOutOctets") {
$ifOutOctets = $value;
$rrd = "$dataDir/$agent-$ifIndex.rrd";
if(! -e $rrd) {
RRDs::create ($rrd, "--start",$t-1, "--step",20,
"DS:bytesIn:COUNTER:120:0:10000000000",
"DS:bytesOut:COUNTER:120:0:10000000000",
"RRA:MIN:0.5:1:44640",
"RRA:MAX:0.5:1:44640",
"RRA:AVERAGE:0.5:3:525600");
$ERROR = RRDs::error;
die "$0: unable to create `$rrd': $ERROR\n" if $ERROR;
}
# string comparison: != would compare the addresses numerically
if ($agent ne "0.0.0.0"){
RRDs::update $rrd, "$t:$ifInOctets:$ifOutOctets";
}
if ($ERROR = RRDs::error) {
die "$0: unable to update `$rrd': $ERROR\n";
}
}
if($key eq "unixSecondsUTC") {$t = $value;}
if($key eq "agent") {$agent = $value;}
if($key eq "ifIndex") {$ifIndex = $value;}
if($key eq "ifInErrors") {$ifInErrors = $value;}
if($key eq "ifOutErrors") {$ifOutErrors = $value;}
if($key eq "ifInDiscards") {$ifInDiscards = $value;}
if($key eq "ifOutDiscards") {
$ifOutDiscards = $value;
$rrd2 = "$dataDir2/$agent-$ifIndex-discards.rrd";
if(! -e $rrd2) {
RRDs::create ($rrd2, "--start",$t-1, "--step",20,
"DS:DicardsIn:COUNTER:120:0:10000000",
"DS:ErrorsIn:COUNTER:120:0:10000000",
"DS:DicardsOut:COUNTER:120:0:10000000",
"DS:ErrorsOut:COUNTER:120:0:10000000",
"RRA:AVERAGE:0.5:3:576");
$ERROR = RRDs::error;
die "$0: unable to create `$rrd2': $ERROR\n" if $ERROR;
}
# string comparison: != would compare the addresses numerically
if ($agent ne "0.0.0.0"){
RRDs::update $rrd2, "$t:$ifInDiscards:$ifInErrors:$ifOutDiscards:$ifOutErrors";
}
if ($ERROR = RRDs::error) {
die "$0: unable to update `$rrd2': $ERROR\n";
}
}
}
Then you would have your graphs.
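A graph can be drawn from one of those RRD files with rrdtool graph. A minimal sketch, assuming the bytesIn and bytesOut data sources created by the script above; the helper name graph_traffic, the colours and the one-hour window are illustrative choices, not part of the original setup:

```shell
# graph_traffic RRD_FILE PNG_FILE
# Hypothetical helper: draws the last hour of traffic, converting the
# stored bytes/s averages to bits/s.
graph_traffic() {
    rrdtool graph "$2" \
        --start -1h \
        --vertical-label "bits/s" \
        "DEF:in=$1:bytesIn:AVERAGE" \
        "DEF:out=$1:bytesOut:AVERAGE" \
        "CDEF:inbits=in,8,*" \
        "CDEF:outbits=out,8,*" \
        "LINE1:inbits#00AA00:RX" \
        "LINE1:outbits#0000FF:TX"
}
```

It could be called as graph_traffic /var/spool/sflow/rrd/AGENT-IFINDEX.rrd traffic.png, with AGENT and IFINDEX replaced by the values sflowtool reports for the interface.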
The following script could be used to find out your utilization for the interfaces.
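#!/bin/bash
# Nagios plugin exit codes
OK=0
WARNING=1
CRITICAL=2
UNKNOWN=3
# Nagios only accepts exit codes 0-3, so usage errors are reported as UNKNOWN
ERROR=3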
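ip=$1
port=$2
lw=$3
c=$4
w=$5
# Read all arguments before validating them: a critical threshold
# without a warning threshold is a usage error
if [ -z "$ip" -o -z "$port" ] || [ -n "$c" -a -z "$w" ]
then
echo usage: "$0 <ipaddress> <port number> [<linewidth> [<critical> <warning>]]"
exit $ERROR
fi
c=${c:-30}
w=${w:-25}
lw=${lw:-1G}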
dir=/var/spool/sflow/rrd
case $lw in
1M)
lw=1000000
;;
10M)
lw=10000000
;;
100M)
lw=100000000
;;
1G)
lw=1000000000
;;
10G)
lw=10000000000
;;
100G)
lw=100000000000
;;
*)
echo "linewidth value incorrect, must be 1M, 10M, 100M, 1G, 10G or 100G"
exit $ERROR
;;
esac
file=$dir/$ip-$port.rrd
# pipefail makes a failing rrdtool fetch visible through the pipe to tail
set -o pipefail
if ! data=$(rrdtool fetch "$file" AVERAGE -s -360 -e -120 | tail -5)
then
echo Could not fetch rrd data from $file
exit $UNKNOWN
fi
set $data
while [ -n "$1" ]
do
t=$1
in=$2
out=$3
shift 3
# l=${l#*: }
# in=${l%% *}
ine=${in##*e+}
inm=${in%%e*}
inm=$(echo "$inm * 10 ^ $ine" | bc)
in=${inm%%.*}
(( sin += in ))
oute=${out##*e+}
outm=${out%%e*}
outm=$(echo "$outm * 10 ^ $oute" | bc)
out=${outm%%.*}
(( sout += out ))
#echo $sin $sout
done
#echo $sin $sout
((ain = sin * 8 / 5 ))
((aout = sout * 8 / 5 ))
((pin = ain * 100 / lw))
((pout = aout * 100 / lw))
if [ $pin -ge $c -o $pout -ge $c ]
then
echo "CRITICAL: RX: $pin% TX: $pout%"
exit $CRITICAL
fi
if [ $pin -ge $w -o $pout -ge $w ]
then
echo "WARNING: RX: $pin% TX: $pout%"
exit $WARNING
fi
echo "OK: RX: $pin% TX: $pout%"
exit $OK
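In this script you can adjust the time window over which the averages are calculated (the -s and -e arguments to rrdtool fetch, together with the number of rows kept by tail and the divisor 5).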
I think I could make this into a nice project on http://exchange.nagios.org/
Re: Plugin that detects network overload
Posted: Thu Nov 06, 2014 5:50 pm
by tmcdonald
@turboscrew: I think rhassing has the best suggestion so far. Let us know what you think and we can move forward with more specific assistance.