How to scan specific OID's
- kyle.parker
- Posts: 42
- Joined: Mon Feb 03, 2014 4:07 pm
How to scan specific OID's
Hello,
I have taken the Network Switches / Routers wizard and customized it for our company to do a few extra things, such as pulling the hostname and auto-filling the field, setting the community string to what we use by default, and setting the default warning and critical thresholds. This has worked great for us with Cisco devices, Juniper devices, etc. We now want to scan and enter our Calix E7-20 GPON shelves. This works fine for shelves that are new or have very few customers. We run into a problem when the GPON shelf is full with 500+ customers, as the wizard will scan each individual ONT and show it as a monitorable service. With this many nodes to scan, the wizard times out and fails.
What I want to do is create another custom wizard for scanning E7 devices. It would only scan the available OLT ports, which feed our 1x32 splitters. The ultimate goal is to monitor bandwidth for each splitter so we know when the OLT ports are getting congested; I don't want to monitor individual customer ONTs. If I know, for example, that card 1 port 1 is ifIndex.30101 and card 9 port 4 is ifIndex.30904, can I modify the network switch/router wizard to only use .30000 through .40000? I am assuming that somewhere in the wizard it just goes through .0 to .100000 for each OID, but I can't seem to find any line of code that confirms this theory.
Alternatively, is there a way to increase some settings to allow larger shelves to be scanned? Some of our E7 shelves have 500+ nodes that could theoretically be monitored.
Thanks,
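To make the card/port numbering in this post concrete, here is a minimal Python sketch. It assumes the scheme ifIndex = 30000 + card * 100 + port, which is only inferred from the two examples given (card 1 port 1 and card 9 port 4) and is not confirmed Calix behavior. The function names and the choice of slots 1 through 10 are hypothetical.

```python
# Assumption: OLT ports are numbered 30000 + card * 100 + port.
# This matches the two examples in the post (30101 and 30904) but is
# not taken from Calix documentation.
BASE = 30000


def olt_ifindex(card: int, port: int) -> int:
    """Return the assumed ifIndex for an OLT port on a given card."""
    return BASE + card * 100 + port


def olt_ifindexes(cards, ports_per_card: int = 8):
    """List the assumed ifIndexes for every OLT port on the given cards."""
    return [olt_ifindex(c, p) for c in cards for p in range(1, ports_per_card + 1)]


def is_olt_port(if_index: int, low: int = 30000, high: int = 40000) -> bool:
    """True if an ifIndex falls in the .30000 through .40000 range from the post."""
    return low <= if_index < high


if __name__ == "__main__":
    # Hypothetical shelf with GPON-8x cards in slots 1 through 10.
    wanted = olt_ifindexes(range(1, 11))
    print(len(wanted), "OLT ports, first", wanted[0], "last", wanted[-1])
```

A range check like `is_olt_port` is the kind of filter the question is asking for: every ifIndex outside the OLT range, such as the ONT ports, would be skipped.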
Re: How to scan specific OID's
This is a bit outside of our support scope (modification of wizards or other code), but I can give some general advice:
- It's important to note that this wizard is using MRTG under the hood, so you may need to look into how that does the scanning as well
- You can change the PHP settings to get around the timeout, which might in the end be easier (and would keep your server in-scope for support)
- Please make sure you are doing this testing on a backed-up server, preferably not your production machine - modification of the software renders it unsupportable according to our support agreement
Former Nagios employee
- kyle.parker
- Posts: 42
- Joined: Mon Feb 03, 2014 4:07 pm
Re: How to scan specific OID's
I would prefer to increase the timeout values, as I think that is the problem. Both wizards, the one provided by Nagios 2.3.9 and the one I customized, just get stuck, and I get a pop-up stating that the window isn't responding and asking whether I would like to continue waiting or close it. As I said, if I scan a newer shelf that only has 25 customers on it, it is fine. It is when the wizard scans the heavily utilized shelves that it times out after I click Next on step 1.
- kyle.parker
- Posts: 42
- Joined: Mon Feb 03, 2014 4:07 pm
Re: How to scan specific OID's
I found this post as a similar issue. https://support.nagios.com/forum/viewto ... 16&t=45048
I had our server admin check the values in php.ini and they are already set to
max_execution_time = 60
max_input_time = 120
memory_limit = 512M
He did say that max_input_vars=5000 was not in the file and that LimitRequestLine 100000 is not in /etc/httpd/conf/httpd.conf. I am not sure whether those would actually make a difference.
We did tail the log file while the wizard ran and we could see MRTG performing the walks, but they were taking 10-15 minutes to complete per OID.
The snippets below are from the log; my session actually timed out before it all finished. I don't understand why it is taking so long to get ifIndex or ifType, when a simple snmpwalk of those values takes about a minute.
As mentioned, this is an E7-20 that currently has 10 cards installed. Each GPON-8x card has 8 OLT ports. Each OLT port connects to a 1x32 splitter, and each splitter port connects to a 716GE ONT, which has 4 GE ports. If we say that roughly 75% of our splitter ports are utilized, this gives us 7680 interfaces that it is trying to scan. This is why I would prefer to have the wizard scan just at the OLT level and not go as far as the ONTs. But what is MRTG doing that is so different from a regular snmpwalk or get? Those run fine using the snmpwalk wizard (which I can't use, as I need bandwidth monitoring).
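The 7680 figure can be reproduced with a few lines of Python. This is only a worked version of the arithmetic in the paragraph above; the function and variable names are made up for illustration, and the 75% utilization is the rough estimate from the post.

```python
def estimate_interfaces(cards=10, olt_ports_per_card=8, splitter_ports=32,
                        utilization=0.75, ge_ports_per_ont=4):
    """Return (OLT ports, ONTs, ONT GE interfaces) for one shelf."""
    olt_ports = cards * olt_ports_per_card
    # Each utilized splitter port carries one ONT.
    onts = int(olt_ports * splitter_ports * utilization)
    ont_interfaces = onts * ge_ports_per_ont
    return olt_ports, onts, ont_interfaces


if __name__ == "__main__":
    olt_ports, onts, ont_interfaces = estimate_interfaces()
    print("OLT ports:", olt_ports)
    print("ONTs:", onts)
    print("ONT GE interfaces:", ont_interfaces)
    print("ONT interfaces per OLT port:", ont_interfaces // olt_ports)
```

Under these assumptions there are 80 OLT ports against 7680 ONT interfaces, so scanning at the OLT level would touch roughly one interface for every 96 that the full scan walks.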
Code: Select all
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
Use of uninitialized value $tempv in concatenation (.) or string at /usr/bin/../lib/mrtg2/SNMP_util.pm line 755.
--base: Get Interface Info
--base: Walking ifIndex
--snpd: [email protected]:161::::2 -> 123938 -> ifIndex = 123938
--snpd: [email protected]:161::::2 -> 123939 -> ifIndex = 123939
--snpd: [email protected]:161::::2 -> 123940 -> ifIndex = 123940
--snpd: [email protected]:161::::2 -> 123953 -> ifIndex = 123953
--snpd: [email protected]:161::::2 -> 123954 -> ifIndex = 123954
--snpd: [email protected]:161::::2 -> 123955 -> ifIndex = 123955
--snpd: [email protected]:161::::2 -> 123956 -> ifIndex = 123956
--snpd: [email protected]:161::::2 -> 123969 -> ifIndex = 123969
--snpd: [email protected]:161::::2 -> 123970 -> ifIndex = 123970
--snpd: [email protected]:161::::2 -> 123971 -> ifIndex = 123971
--snpd: [email protected]:161::::2 -> 123972 -> ifIndex = 123972
--base: Walking ifType
000 123811:1000000000 123812:1000000000 123825:1000000000 123826:1000000000 123827:1000000000 123828:1000000000 123841:1000000000 123842:1000000000 123843:1000000000 123844:1000000000 123857:1000000000 123858:1000000000 123859:1000000000 123860:1000000000 123873:1000000000 123874:1000000000 123875:1000000000 123876:1000000000 123889:1000000000 123890:1000000000 123891:1000000000 123892:1000000000 123905:1000000000 123906:1000000000 123907:1000000000 123908:1000000000 123921:1000000000 123922:1000000000 123923:1000000000 123924:1000000000 123937:1000000000 123938:1000000000 123939:1000000000 123940:1000000000 123953:1000000000 123954:1000000000 123955:1000000000 123956:1000000000 123969:1000000000 123970:1000000000 123971:1000000000 123972:1000000000
--base: Walking ifHighSpeed
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget [email protected]:161::::2:v4only for ifHighSpeed.123971 -> 1000 Mb/s
--base: snmpget [email protected]:161::::2:v4only for ifHCInOctets.123971 -> unknown
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget [email protected]:161::::2:v4only for ifHighSpeed.123972 -> 1000 Mb/s
--base: snmpget [email protected]:161::::2:v4only for ifHCInOctets.123972 -> unknown
--base: check for HighspeedCounters failed ... Dropping back to V1
Re: How to scan specific OID's
The Network Switch / Router wizard uses the cfgmaker command from the MRTG package. cfgmaker does a lot more than read the OIDs, as it is creating the configuration files for MRTG and the wizard, and that takes more time.
It also uses Perl to gather the information, which takes more time as well.
With that many ports on the device, I think you need to increase the settings in the php.ini and httpd.conf files.
I would use these examples in the php.ini file
Code: Select all
max_execution_time = 240
max_input_time = 480
memory_limit = 1024M
max_input_vars=50000
And in the httpd.conf file, change this from
Code: Select all
LimitRequestLine 100000
to
Code: Select all
LimitRequestLine 1000000
Save the files and restart Nagios and Apache by running
Code: Select all
service httpd restart
service nagios restart
Then see if the scanning of the interfaces works for those devices.
- kyle.parker
- Posts: 42
- Joined: Mon Feb 03, 2014 4:07 pm
Re: How to scan specific OID's
I set these values on a test server, and although it did work this time without timing out the session, it still takes 40+ minutes to complete. I have gone through the code for cfgmaker and found that even if you tell it not to scan admin-down or oper-down ports, it will still scan them and just comment those targets out so that MRTG won't poll those indexes on a check. To me that seems extremely inefficient, and it leads to problems like this one. 3 out of 4 ports are oper down on my 716GE ONTs, but the wizard will still scan them and collect all of their data anyway.
- dwhitfield
- Former Nagios Staff
- Posts: 4583
- Joined: Wed Sep 21, 2016 10:29 am
- Location: NoLo, Minneapolis, MN
Re: How to scan specific OID's
Although there are not many, it appears that issues with MRTG do get dealt with as they come in: https://github.com/oetiker/mrtg/issues? ... s%3Aclosed
I'm not sure if there's anything else we can do at this point, but I'm happy to put in a feature request to replace MRTG. If we were to do that, do you have any suggestions?
- kyle.parker
- Posts: 42
- Joined: Mon Feb 03, 2014 4:07 pm
Re: How to scan specific OID's
At this point I don't think the problem is with MRTG itself, but with the fact that cfgmaker needs to be updated. It hasn't been updated since at least 2012, as that is the last stable release of MRTG that I can find. If cfgmaker were updated to allow more operations, such as range values, or if it actually excluded interfaces from being scanned, rather than just commenting them out based on template files, it would be much more useful and efficient.
I understand this is out of your control, as MRTG and cfgmaker are open-source third-party software, unless somebody on your team wanted to go ahead and revamp MRTG/cfgmaker to be more efficient.
I guess the other option is to do away with MRTG like you state, but I don't know what else is out there.
As bigger and bigger switches and shelves come out, the approach of scanning everything first and parsing later won't work. As an ISP we also have some pretty large Juniper devices that take a while to scan, and I know we aren't the only ones that use them.
Honestly, I think the best solution is to update cfgmaker, but I went through that code almost all day and most of it makes no sense to me, so I wouldn't even know where to start. I figured there was a loop somewhere that I could have changed from "foreach" to "for 30000 - 34000" in my specific case, and then I could make a custom wizard just for E7 gear that would call the customized cfgmaker script, but alas I couldn't find anywhere that made sense to do that. I am sure someone proficient in Perl would understand it much better than I do and could even make it much more customizable from the CLI with options.
I think the basic path cfgmaker takes is: scan the shelf to see what interfaces are available, run walks with certain OIDs on each interface, then parse all that information to include or comment out what is needed or not needed.
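The "filter first, query later" ordering argued for in this post could look roughly like the Python sketch below. It is not cfgmaker's code and makes no SNMP calls. It assumes the ifIndex-to-ifOperStatus map has already been walked once, and `select_interfaces`, `collect`, and the `fetch` callback are hypothetical names. The only external fact used is that IF-MIB defines ifOperStatus 1 as up and 2 as down.

```python
from typing import Callable, Dict, List

OPER_UP = 1  # IF-MIB ifOperStatus: up(1), down(2)


def select_interfaces(oper_status: Dict[int, int], low: int, high: int) -> List[int]:
    """Keep only ifIndexes in [low, high) whose ifOperStatus is up."""
    return sorted(
        idx for idx, status in oper_status.items()
        if low <= idx < high and status == OPER_UP
    )


def collect(selected: List[int], fetch: Callable[[int], dict]) -> Dict[int, dict]:
    """Run the expensive per-interface query only for selected ifIndexes."""
    return {idx: fetch(idx) for idx in selected}


if __name__ == "__main__":
    # Made-up walk result: two OLT-range ports and two ONT-range ports.
    status = {30101: 1, 30102: 2, 123938: 1, 123939: 2}
    calls = []

    def fake_fetch(idx: int) -> dict:
        # Stand-in for the per-interface SNMP gets; records each call.
        calls.append(idx)
        return {"ifIndex": idx}

    result = collect(select_interfaces(status, 30000, 40000), fake_fetch)
    print(sorted(result), "queried with", len(calls), "fetch call(s)")
```

The point of the ordering is that the expensive per-interface work scales with the number of selected ports rather than with every interface on the shelf.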
- dwhitfield
- Former Nagios Staff
- Posts: 4583
- Joined: Wed Sep 21, 2016 10:29 am
- Location: NoLo, Minneapolis, MN
Re: How to scan specific OID's
kyle.parker wrote: I guess the other option is to do away with mrtg like you state, but I don't know what else is out there.
I didn't either, but I found http://torrus.org/
What do you think about that as an addition or a replacement?
- kyle.parker
- Posts: 42
- Joined: Mon Feb 03, 2014 4:07 pm
Re: How to scan specific OID's
It certainly sounds like an alternative, but without trying it I couldn't tell you for sure if it is more efficient in the way it scans larger shelves. If your team tests it and determines it is a good alternative I don't see an issue with it as an addition.
I think that since we require something right now, I will either have to use MRTG with the increased PHP timeout settings or completely customize a cfgmaker script for MRTG, used specifically for Calix E7-20 chassis.