Bandwidth Utilization Data Collection Intermittent
-
matt.lilek
- Posts: 137
- Joined: Wed Aug 07, 2013 11:53 am
Bandwidth Utilization Data Collection Intermittent
Hello Team,
Really not sure how to start with this. About half of our links were showing 0 MB of data before the first reboot in over a year. After the reboot, almost all links now show 0. If I go into the performance graph, I can see that it sometimes collects data, but for the most part it does not. I have tried removing and reconfiguring the bandwidth check on a few hosts, but the issue remains the same. Please let me know what steps are needed to resolve this issue.
Thank you,
Matt
Last edited by tgriep on Thu Apr 04, 2019 8:32 am, edited 1 time in total.
Reason: Profile removed and shared with the other Techs
Re: Bandwidth Utilization Data Collection Intermittent
Can you run this command on the Nagios server and post the output here?
Code: Select all
time LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg
The MRTG process that gathers the bandwidth data needs to finish within 5 minutes every time it runs, and the command above will show any errors and how long it takes to run.
If you see the command checking devices that are no longer active, remove the config file for each one; that will help speed up the process.
The config files are in the following folder and should be named by the IP address of the device.
/etc/mrtg/conf.d
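As a rough illustration of the 5-minute rule above, the following sketch converts the `real` value printed by `time` into seconds and compares it with the 300-second polling interval. The function names `to_seconds` and `fits_interval` and the two sample durations are made up for this example; they are not part of MRTG or Nagios.

```shell
#!/bin/sh
# Hypothetical helpers, not part of MRTG or Nagios.

# to_seconds DURATION: convert a `time`-style value such as "7m30.000s"
# to whole seconds (the fractional part is dropped).
to_seconds() {
    mins=${1%%m*}        # text before the "m", e.g. "7"
    rest=${1#*m}         # text after the "m", e.g. "30.000s"
    secs=${rest%%.*}     # drop the fractional part
    secs=${secs%s}       # drop a trailing "s" if there was no fraction
    secs=${secs#0}       # avoid a leading zero being read as octal
    secs=${secs:-0}      # an empty result means zero seconds
    echo $((mins * 60 + secs))
}

# fits_interval DURATION: print "ok" if the run finished in under
# 300 seconds (the 5-minute cron interval), otherwise "too slow".
fits_interval() {
    if [ "$(to_seconds "$1")" -lt 300 ]; then
        echo "ok"
    else
        echo "too slow"
    fi
}

# Made-up sample durations, for illustration only.
fits_interval "7m30.000s"   # prints "too slow" (450 seconds)
fits_interval "2m45.120s"   # prints "ok" (165 seconds)
```

Paste the `real` value from your own `time` output in place of the sample durations.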
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
matt.lilek
- Posts: 137
- Joined: Wed Aug 07, 2013 11:53 am
Re: Bandwidth Utilization Data Collection Intermittent
Hey Tom,
Thanks for the reply. The output was quite long, but it is basically what I put in the two screenshots. Almost all of them look like this, so I need a bulk solution to get them logging again.
Thanks in advance!
Re: Bandwidth Utilization Data Collection Intermittent
One thing I needed is how long the command took to run. The time command at the beginning will tell you that information.
Can you post that information?
FYI, instead of screen capturing the data, most SSH terminals allow you to highlight the data and copy it as text; you can do that to save time.
The errors that you did display, are those devices still active on your network?
If not, remove the MRTG configuration files to speed this up.
You can try this.
Edit the following file
Code: Select all
/etc/cron.d/mrtg
Change this line from
Code: Select all
*/5 * * * * root LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg.lock --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
to
Code: Select all
*/5 * * * * root LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg.lock --confcache-file /var/lib/mrtg/mrtg.ok
Save the file and restart cron by running
Code: Select all
service crond restart
Let the system run for 15 minutes and see if the bandwidth starts to graph again.
-
matt.lilek
- Posts: 137
- Joined: Wed Aug 07, 2013 11:53 am
Re: Bandwidth Utilization Data Collection Intermittent
Hey Tom,
Sorry, I was away for a bit on this. Here are the times you were looking for:
real 9m22.629s
user 0m10.225s
sys 0m0.677s
So I changed the line and restarted, and it actually killed a bunch of the incoming data from these routers (the opposite of what we expected). Please let me know what the next step is. I'll be away for a bit, but I've got a couple of days now to get this sorted, so let me know.
Thanks.
Matt
Re: Bandwidth Utilization Data Collection Intermittent
Hello Matt,
What do you mean by "killed a bunch of the incoming data"?
The MRTG process has to finish within 5 minutes each time it runs so that it can gather the data the plugin needs to calculate the bandwidth.
You can increase the number of forks that MRTG can spawn, which will help speed up the process.
To do that, edit the /etc/mrtg/mrtg.cfg file and change this line from
Code: Select all
Forks: 4
to
Code: Select all
Forks: 20
Save it and that will allow 5 times more forks to gather the data.
And, as I suggested earlier, remove the MRTG config files from the /etc/mrtg/conf.d folder for devices that no longer exist; that will speed up the process as well.
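The Forks change described above can also be scripted instead of edited by hand. This is only a sketch: the helper name `set_forks` is made up for this example, and the demonstration runs on a scratch file rather than the live /etc/mrtg/mrtg.cfg.

```shell
#!/bin/sh
# Hypothetical helper, not part of MRTG.

# set_forks FILE N: rewrite the "Forks:" line in FILE to use N forks.
# sed keeps a backup of the original as FILE.bak.
set_forks() {
    sed -i.bak "s/^Forks:[[:space:]]*[0-9][0-9]*/Forks: $2/" "$1"
}

# Demonstration on a scratch file, not the real config.
demo=$(mktemp)
printf 'Forks: 4\n' > "$demo"
set_forks "$demo" 20
cat "$demo"                      # prints "Forks: 20"
rm -f "$demo" "$demo.bak"
```

To apply it for real you would run something like `set_forks /etc/mrtg/mrtg.cfg 20` as root and check the result before the next cron run.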
-
matt.lilek
- Posts: 137
- Joined: Wed Aug 07, 2013 11:53 am
Re: Bandwidth Utilization Data Collection Intermittent
Hey Tom,
Done that now. When I said it killed the data coming in, I meant that data stopped coming in for a whole smash of the instances that were OK just before I made that change. There are hundreds of configs in there; how can I easily determine and remove any nonexistent ones?
-
matt.lilek
- Posts: 137
- Joined: Wed Aug 07, 2013 11:53 am
Re: Bandwidth Utilization Data Collection Intermittent
real 3m28.139s
user 0m11.300s
sys 0m1.354s
are the new times btw
Re: Bandwidth Utilization Data Collection Intermittent
What could have happened is that before the change, the MRTG process ran as the nagios user and permissions kept some of the checks from running.
After the change, it runs as root, so the configs could be read, and the extended time it took to run caused the new issue.
There is not a quick way to determine which config file can be removed.
They are named by the IP Address of the device.
If you have a list of known devices, you could use that to determine which ones to remove.
Or you can Ping the IP address or do a quick snmpwalk of the device to see if it responds.
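The ping check suggested above could be looped over the whole folder, roughly as follows. This is a sketch under two assumptions: that the files are named `<IP address>.cfg` (the `.cfg` suffix is a guess; adjust it to match your files), and that the Linux `ping` options `-c` and `-W` are available. The function names are made up for this example. It only prints candidates and deletes nothing.

```shell
#!/bin/sh
# Hypothetical helpers, not part of MRTG or Nagios.

# ip_from_cfg PATH: strip the directory and the assumed ".cfg" suffix,
# leaving the IP address the file is named after.
ip_from_cfg() {
    basename "$1" .cfg
}

# list_dead_configs DIR: print each config file whose device does not
# answer a single ping within 2 seconds.
list_dead_configs() {
    for cfg in "$1"/*.cfg; do
        [ -e "$cfg" ] || continue      # no matching files in DIR
        ip=$(ip_from_cfg "$cfg")
        if ! ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
            echo "$cfg"
        fi
    done
}

# Report only; review the list before removing anything.
list_dead_configs "${1:-/etc/mrtg/conf.d}"
```

A device that blocks ICMP will show up in this list even though it is alive, so a follow-up snmpwalk against each listed address, as mentioned above, is a sensible second check before removing its config.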
-
matt.lilek
- Posts: 137
- Joined: Wed Aug 07, 2013 11:53 am
Re: Bandwidth Utilization Data Collection Intermittent
Hey Tom,
Things are looking much better now after increasing the forks. As for cleanup, maybe I can go through the hundreds that are in there one day (sooner if I have any more issues), but for now I think I am good, so thanks for that. You can go ahead and wrap this one up!