Bandwidth charts show 0/0 for us as well
This follows very closely from this topic:
https://support.nagios.com/forum/viewto ... 16&t=51709
Ping and status graphs are working; however, bandwidth graphs are not. The behaviour is inconsistent. It seems strongly related to running MRTG as the nagios user via command-line arguments, but the issue can also occur as root, especially if a check is missed.
I do not want to run MRTG as root due to the security risks.
General Info:
- SNMPv3 queries
- Currently only running one device for testing
- Running MRTG as root seems to return interface traffic counters
- Running MRTG as nagios user seems to return interface traffic counters
- These do not always get written into the RRD file for reasons I cannot ascertain, despite significant investigations as to why
- Running MRTG as root with the --user=nagios --group=nagios arguments fails to connect to SNMP devices to retrieve data, for reasons I cannot ascertain
- Results in Bandwidth charts showing 0/0 or sometimes data for about 15 minutes before dropping off
- snmpwalk returns the requested values without issues. I can see nagios gets them also when run as root. Even when these values are sent to RRD, sometimes RRD just does not store them... I don't know why. My gut feeling is that it has something to do with the 64-bit values in SNMPv3. Not sure why it is intermittent though.
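One rough way to confirm whether updates are actually reaching the RRD files is to list the ones that have not been modified recently. This is only a minimal sketch: the function name stale_rrds is made up for illustration, the 10-minute default assumes the usual 5-minute MRTG polling interval (not stated in this thread), and file modification time is only a proxy for a successful RRD update.

```shell
# List .rrd files under a directory that have NOT been modified in the
# last N minutes. Default is 10 minutes, i.e. two missed polls if MRTG
# runs every 5 minutes (an assumption, not taken from this thread).
stale_rrds() {
    dir="$1"
    minutes="${2:-10}"
    # -mmin +N matches files whose data was last modified more than N minutes ago.
    find "$dir" -type f -name '*.rrd' -mmin +"$minutes" -print
}
```

For example, stale_rrds /var/lib/mrtg 10 would print any RRD file that has gone more than ten minutes without a write. If rrdtool is on the path, rrdtool lastupdate on one of the listed files should show the timestamp and values of the last stored update.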
Actions performed:
https://support.nagios.com/kb/print-29.html
(Documentation issue - sections of this such as setting permissions on /var/lib/mrtg actually break things in running environments. Please review.)
chown "apache:nagios" /etc/mrtg -R
chmod 775 /etc/mrtg -R
chown "apache:nagios" /var/lib/mrtg -R
chmod 775 /var/lib/mrtg -R
[root@nagiosxi etc]# yum list installed | grep -i rrd
rrdtool.x86_64 1.3.8-10.el6 @cr
rrdtool-perl.x86_64 1.3.8-10.el6 @cr
rrdtool-python.x86_64 1.3.8-10.el6 @cr
[root@nagiosxi etc]# yum list installed | grep -i snmp
net-snmp.x86_64 1:5.5-60.el6 @cr
net-snmp-devel.x86_64 1:5.5-60.el6 @cr
net-snmp-libs.x86_64 1:5.5-60.el6 @cr
net-snmp-perl.x86_64 1:5.5-60.el6 @cr
net-snmp-utils.x86_64 1:5.5-60.el6 @cr
perl-Net-SNMP.noarch 5.2.0-4.el6 @epel
perl-SNMP_Session.noarch 1.12-4.el6 @base
php-snmp.x86_64 5.3.3-49.el6 @cr
snmptt.noarch 1.4-0.9.beta2.el6 @epel
[root@nagiosxi etc]# yum list installed | grep -i mrtg
mrtg-libs.x86_64 2.16.2-9.el6 @base
[root@nagiosxi ~]# cpan -l | grep -i rrd
Unknown option: l
Nothing to install!
[root@nagiosxi ~]# cpan -l | grep -i snmp
Unknown option: l
Nothing to install!
cd /tmp
rm -rf nagiosxi xi*.tar.gz
wget http://assets.nagios.com/downloads/nagi ... est.tar.gz
tar xzf xi-latest.tar.gz
cd /tmp/nagiosxi/subcomponents/mrtg/
tar xzf mrtg*.tar.gz
cd mrtg*
./configure --prefix='/usr'
make all
make install
[root@nagiosxi ~]# LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg -debug=cfg,base,log &> /tmp/mrtg.txt
[root@nagiosxi ~]# LANG=C LC_ALL=C /usr/bin/mrtg &>> /tmp/mrtg.txt
[root@nagiosxi ~]# LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg -debug=cfg,base,log --user=nagios --group=nagios &> /tmp/mrtg.txt
[root@nagiosxi ~]# chown nagios /tmp/mrtg.txt
[root@nagiosxi ~]# su - nagios
[nagios@nagiosxi ~]$ LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg -debug=cfg,base,log &>> /tmp/mrtg.txt
(the command with --user=nagios --group=nagios takes notably longer to complete)
I find this line very interesting:
--log: got: ???/???
Often, when it works, I get actual values here in the output.
Also, in the second run, you can see that when specifying --user=nagios and --group=nagios, we get undef! values for the counters; something is broken with SNMP when using these arguments. However, in the last command, running as the nagios user, the counters are collected again. This is very repeatable behaviour.
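To compare runs without reading the whole debug log by eye, the "got:" lines can be counted. The following is a hedged sketch: summarize_got_lines is a hypothetical helper name, and only the "--log: got: ???/???" form is taken from the output above. It assumes that a successful poll prints something other than ???/??? after "got:".

```shell
# Count "got:" lines in an MRTG debug log and split them into
# placeholder lines (???/???) and everything else.
summarize_got_lines() {
    logfile="$1"
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    missing=$(grep -c -F 'got: ???/???' "$logfile" || true)
    total=$(grep -c -F 'got: ' "$logfile" || true)
    echo "missing=$missing ok=$((total - missing))"
}
```

Running summarize_got_lines /tmp/mrtg.txt after each of the invocations above (writing each run to its own file rather than appending) would give one line per run that is easy to compare.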
I have removed all config files and added a new config with only one interface. No improvement.
I have deleted the .rrd files for the only remaining host. No improvement.
I have tried increasing the default port speed to a very large number. No improvement.
I am struggling to know what to do next...
Re: Bandwidth charts show 0/0 for us as well
When using SNMPv3, MRTG loads the file Net_SNMP_util.pm so it can poll the remote device.
If that file, or any of the folders it is in, cannot be accessed by the nagios user account, that could be why polling SNMPv3 devices fails under that account.
Search the drive for that file and check that the permissions allow the nagios user account to read it.
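The check on the file and its parent folders could be scripted along these lines. This is a sketch only: check_path_access is a made-up helper name, and it tests access for whichever user runs it, so it would need to be run from a shell as the nagios user.

```shell
# Report whether the current user can reach and read a file:
# every parent directory needs search (x) permission, the file needs read (r).
check_path_access() {
    target="$1"
    blocked=""
    dir=$(dirname "$target")
    while :; do
        # Remember the highest directory that cannot be searched.
        [ -x "$dir" ] || blocked="$dir"
        case "$dir" in
            /|.) break ;;
        esac
        dir=$(dirname "$dir")
    done
    if [ -n "$blocked" ]; then
        echo "no search permission: $blocked"
        return 1
    fi
    if [ ! -r "$target" ]; then
        echo "not readable: $target"
        return 1
    fi
    echo "ok: $target"
}
```

Called with the full path to each copy of Net_SNMP_util.pm found on the drive, it prints either "ok:" or the first thing that blocks access.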
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Bandwidth charts show 0/0 for us as well
Thanks for the update. It does seem to have read access:
[root@nagiosxi ~]# ls -la /usr/lib64/mrtg2/Net_SNMP_util.pm
-rw-r--r-- 1 root root 65075 May 11 2016 /usr/lib64/mrtg2/Net_SNMP_util.pm
[root@nagiosxi ~]# ls -la /usr/lib/mrtg2/Net_SNMP_util.pm
-rw-r--r-- 1 root root 66466 Feb 7 16:14 /usr/lib/mrtg2/Net_SNMP_util.pm
Re: Bandwidth charts show 0/0 for us as well
I actually find this weird as well, because the issue does not occur when MRTG is run natively as the nagios user/group, only when the user and group are passed as arguments to MRTG.
Re: Bandwidth charts show 0/0 for us as well
Try running this command again and post the full output here. Hopefully it will show something.
time LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg -debug=cfg,base,log --user=nagios --group=nagios
Re: Bandwidth charts show 0/0 for us as well
I did already do this in the previous file attachment but without the time data. Here it is again though, this time with the...uh...time.
Keep in mind running MRTG as root or with su to nagios, the command is basically instantly completed.
Re: Bandwidth charts show 0/0 for us as well
The 10 minutes it takes to run the MRTG command is causing the issue.
I think it may be a version issue in the Net_SNMP_util.pm scripts.
Open up these 2 files and, at the top, search for the version number. Whichever of the 2 files is the older version, rename it to Net_SNMP_util.pm.bak.
/usr/lib64/mrtg2/Net_SNMP_util.pm
/usr/lib/mrtg2/Net_SNMP_util.pm
Then run the MRTG command with time and see if it runs quicker and actually finishes the polling of the devices.
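The version comparison can also be done from the shell instead of opening each file. A minimal sketch, assuming the version is declared on a line of the form "our $VERSION = v1.0.15;" (the exact form in your copies may differ); pm_version is a hypothetical helper name.

```shell
# Print the value assigned to $VERSION in a Perl module, e.g. "v1.0.15".
# Only the first matching line is used.
pm_version() {
    sed -n 's/^.*\$VERSION[[:space:]]*=[[:space:]]*\([^;[:space:]]*\).*$/\1/p' "$1" | head -n 1
}

# The two copies named in this thread; missing files are skipped silently.
for f in /usr/lib64/mrtg2/Net_SNMP_util.pm /usr/lib/mrtg2/Net_SNMP_util.pm; do
    if [ -r "$f" ]; then
        echo "$f: $(pm_version "$f")"
    fi
done
```

This only reports what each file declares; it does not show which copy MRTG actually loads at run time.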
Re: Bandwidth charts show 0/0 for us as well
It doesn't take 10 minutes, it takes 10 seconds.
The versions of the two files are indeed different, the older being the lib64 copy, which has our $VERSION = v1.0.15. Renaming this file made no difference to the running of the command.
For the record, I set that back and then tried renaming the lib version as well. It has our $VERSION = v1.0.20. Renaming this file broke SNMPv3 functionality, and MRTG tried to fall back to SNMPv1, which obviously failed as well.
Re: Bandwidth charts show 0/0 for us as well
Yep, 10 seconds is correct. Need to get my eyes checked. 8)
The Net_SNMP_util.pm file that is $VERSION = v1.0.15, just rename it so it does not get loaded by mistake by MRTG.
I am thinking that there is an incompatible Perl module on the system and it is not allowing SNMPv3 polling to function.
When you ran the cpan -l command, it failed on your system, so let's see if we can get it to install a newer version.
Run this command to get into the cpan shell:
perl -MCPAN -e shell
Then in cpan, run the following:
install CPAN
reload cpan
Exit out of cpan and run the following to get the versions of the SNMP Perl modules, and post the output:
cpan -l | grep -i snmp
Also, run the following 2 commands and post the output:
cpan --help
ls -l /var/lib
Thanks
Re: Bandwidth charts show 0/0 for us as well
Thanks for the help thus far...
Are we ignoring that MRTG can pull the SNMP requests when run either as root or as the nagios user? It only fails when the nagios user is passed as an argument to MRTG. I would have thought that implies the subsequent components are okay...
When I first ran the commands you requested, updates were done, but it didn't change anything.
[root@nagiosxi ~]# perl -MCPAN -e shell
Terminal does not support AddHistory.
To fix enter> install Term::ReadLine::Perl
cpan shell -- CPAN exploration and modules installation (v2.22)
Enter 'h' for help.
cpan[1]> install CPAN
CPAN: Storable loaded ok (v2.20)
Reading '/root/.cpan/Metadata'
Database was generated on Thu, 14 Feb 2019 23:17:06 GMT
CPAN is up to date (2.22).
cpan[2]> reload cpan
(CPAN__unchanged__v2.22)(CPAN::Author__unchanged__v5.5002)(CPAN::CacheMgr__unchanged__v5.5002)(CPAN::Complete__unchanged__v5.5001)(CPAN::Debug__unchanged__v5.5001)(CPAN::DeferredCode__unchanged__v5.50)(CPAN::Distribution__unchanged__v2.22)(CPAN::Distroprefs__unchanged__v6.0001)(CPAN::Distrostatus__unchanged__v5.5)(CPAN::Exception::RecursiveDependency.....v5.5001)(CPAN::Exception::yaml_not_installed..v5.5)(CPAN::FTP__unchanged__v5.5011)(CPAN::FTP::netrc__unchanged__v1.01)(CPAN::HandleConfig__unchanged__v5.5008)(CPAN::Index__unchanged__v2.12)(CPAN::InfoObj__unchanged__v5.5)(CPAN::LWP::UserAgent....v1.9601)(CPAN::Module__unchanged__v5.5003)(CPAN::Prompt__unchanged__v5.5)(CPAN::Queue__unchanged__v5.5002)(CPAN::Shell__unchanged__v5.5008)(CPAN::Tarzip__unchanged__v5.5012)(CPAN::Version__unchanged__v5.5003)
11 subroutines redefined
cpan[4]> exit
Terminal does not support GetHistory.
Lockfile removed.
[root@nagiosxi ~]# cpan -l
Unknown option: l
Nothing to install!
[root@nagiosxi ~]# cpan --help
/usr/bin/cpan version [unknown] calling Getopt::Std::getopts (version 1.06 [paranoid]),
running under Perl version 5.10.1.
Usage: cpan [-OPTIONS [-MORE_OPTIONS]] [--] [PROGRAM_ARG1 ...]
The following single-character options are accepted:
Boolean (without arguments): -h -v -C -A -D -O -L -a -r -c -f -i -m -t
Options may be merged together. -- stops processing of options.
For more details run
perldoc -F /usr/bin/cpan
[Now continuing due to backward compatibility and excessive paranoia.
See ``perldoc Getopt::Std'' about $Getopt::Std::STANDARD_HELP_VERSION.]
Nothing to install!
[root@nagiosxi ~]# ls -l /var/lib
total 188
drwxr-xr-x. 2 root root 4096 Jul 4 2018 alternatives
drwx------. 3 root root 4096 Mar 31 2015 authconfig
drwx------ 2 apache apache 4096 Jun 20 2018 dav
drwxr-xr-x 2 root root 4096 Jun 20 2018 dbus
drwxr-xr-x. 2 root root 4096 Jul 13 2018 dhclient
drwxr-xr-x. 2 root root 4096 Sep 23 2011 games
-rw-r--r-- 1 root root 1534 Feb 15 03:12 logrotate.status
drwxr-xr-x. 2 root root 4096 Sep 23 2011 misc
drwxr-x--- 2 root slocate 4096 Feb 15 03:12 mlocate
drwxrwxr-x 4 apache nagios 90112 Feb 5 14:06 mrtg
drwxr-xr-x 6 mysql mysql 4096 Feb 1 11:37 mysql
drwxr-xr-x 3 nagios nagios 4096 Feb 1 11:36 net-snmp
drwxr-xr-x 2 ntp ntp 4096 Feb 15 09:37 ntp
drwx------ 4 postgres postgres 4096 Nov 27 2017 pgsql
drwxr-xr-x 3 root root 4096 Mar 22 2017 php
drwxr-xr-x. 2 root root 4096 Mar 22 2017 plymouth
drwx------ 3 root root 4096 Mar 17 2015 polkit-1
drwx------. 2 postfix root 4096 Mar 24 2017 postfix
-rw-------. 1 root root 4096 Feb 1 11:37 random-seed
drwxr-xr-x. 2 root root 4096 Feb 1 11:40 rpm
drwx------. 2 root root 4096 Jun 20 2018 rsyslog
drwxr-x--- 2 shellinabox shellinabox 4096 Jul 10 2018 shellinabox
drwxr-xr-x. 4 root root 4096 Jun 20 2018 stateless
drwxr-xr-x. 3 root root 4096 Sep 7 2016 udev
drwxr-xr-x. 6 root root 4096 Aug 3 2018 yum
[root@nagiosxi ~]#