[root@lisl-ngos-01-pv conf.d]# time LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
SNMP Error:
no response received
SNMPv2c_Session (remote host: "<xxx.xxx.xxx.xxx>" [<xxx.xxx.xxx.xxx>].161)
community: "<SNMP STRING>"
request ID: 1011561030
PDU bufsize: 8000 bytes
timeout: 2s
retries: 5
backoff: 1)
at /usr/bin/../lib/mrtg2/SNMP_util.pm line 497
SNMPGET Problem for ifInOctets.1 ifOutOctets.1 on <SNMP STRING>@<xxx.xxx.xxx.xxx>:::::2:v4only
at /usr/bin/mrtg line 2330
2015-01-05 10:29:15: WARNING: skipping because at least the query for ifInOctets.1 on <xxx.xxx.xxx.xxx> did not succeed
2015-01-05 10:29:15: WARNING: no data for ifInOctets&ifOutOctets:<SNMP STRING>@<xxx.xxx.xxx.xxx>. Skipping further queries for Host <xxx.xxx.xxx.xxx> in this round.
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_1][_IN_] ' $target->[2255]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_1][_OUT_] ' $target->[2255]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_2][_IN_] ' $target->[2256]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_2][_OUT_] ' $target->[2256]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_3][_IN_] ' $target->[2257]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_3][_OUT_] ' $target->[2257]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_4][_IN_] ' $target->[2258]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_4][_OUT_] ' $target->[2258]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_5][_IN_] ' $target->[2259]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_5][_OUT_] ' $target->[2259]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_6][_IN_] ' $target->[2260]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_6][_OUT_] ' $target->[2260]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_7][_IN_] ' $target->[2261]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_7][_OUT_] ' $target->[2261]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_8][_IN_] ' $target->[2262]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_8][_OUT_] ' $target->[2262]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_9][_IN_] ' $target->[2263]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_9][_OUT_] ' $target->[2263]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_10][_IN_] ' $target->[2264]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_10][_OUT_] ' $target->[2264]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_12][_IN_] ' $target->[2265]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_12][_OUT_] ' $target->[2265]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_13][_IN_] ' $target->[2266]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_13][_OUT_] ' $target->[2266]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_14][_IN_] ' $target->[2267]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_14][_OUT_] ' $target->[2267]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_15][_IN_] ' $target->[2268]{$mode} ' did not eval into defined data
2015-01-05 10:34:30: ERROR: Target[<xxx.xxx.xxx.xxx>_15][_OUT_] ' $target->[2268]{$mode} ' did not eval into defined data
real 5m20.462s
user 0m9.069s
sys 0m1.869s
[root@lisl-ngos-01-pv conf.d]#
It's taking a little over 5 minutes for this job to run, and there's only one device with errors (it's down at the moment, hence the errors). I can't for the life of me figure out why it's taking so long. I've seen this run with 2-3 devices down and still take less than 3-4 minutes. Any ideas on where else I can look? CPU usage/load is good right now (load is at ~6), and nothing else seems to be amiss. One thing I will point out is that the site where this server is located is currently experiencing pretty heavy bandwidth utilization, but that's on the outside link. Everything on the LAN there is pretty normal.
Not accounting for latency or anything like that, you are looking at roughly 4.7 minutes for that job to run through with all of those timeouts. The timeout is set to 2s for each check with 5 retries, which comes to roughly 10 seconds per failed query. Multiplied by the 28 "did not eval" lines in your output, that is 280 seconds, or roughly 4.7 minutes. That is just one-cup-of-coffee math, but I believe that is what you are seeing here. I'm looking into the timeout and retries definitions, but as far as I know they behave as I noted above.
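The back-of-the-envelope estimate above can be written out as a short Python sketch. It assumes that "retries: 5" means five attempts in total and that the timeout is multiplied by the backoff factor after each attempt. The function name `worst_case_wait` is made up for illustration and is not part of mrtg. Note that the log also says mrtg skips further queries for the host after the first failure, so this figure is best read as an upper bound rather than a measurement.

```python
def worst_case_wait(timeout_s, retries, backoff=1.0):
    """Seconds spent waiting on one unanswered SNMP request.

    Assumption: `retries` is the total number of attempts, and the
    timeout is multiplied by `backoff` after every attempt.
    """
    return sum(timeout_s * backoff ** attempt for attempt in range(retries))


# Values taken from the SNMP error output: timeout 2s, retries 5, backoff 1.
per_query = worst_case_wait(2, 5, 1.0)

# 14 interfaces x 2 directions (_IN_ and _OUT_) = 28 "did not eval" lines.
failed_lines = 28
total_s = per_query * failed_lines

print(f"per failed query: {per_query:.0f}s")
print(f"upper bound: {total_s:.0f}s ({total_s / 60:.2f} min)")
```

With a backoff of 1 the wait per attempt stays constant, so the only levers are the timeout and the number of attempts.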
I'm somewhat confused as to why this would be an issue today, though. This particular device has been down for roughly a month, probably a bit more and we haven't had this issue (at least not this consistently) until today. Also, as I mentioned, we've had situations where there have been 3-4 devices down (what if an entire site goes down, for example) and not had it take this long for the job to run, and all of our network devices are set to 5 retries. I assume the timeout is set by the plugin? If that's the case then that would be the same for all devices as well.
I should also note that I removed the offending cfg file and only saw about a 15 second improvement in the time. I removed the config (after making a backup, of course), ran that command, then restored the config from the backup file and ran the command again. The result: 2m23s without the config vs. 2m38s with it:
[root@lisl-ngos-01-pv conf.d]# time LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
real 2m23.411s
user 0m7.887s
sys 0m1.238s
[root@lisl-ngos-01-pv conf.d]#
[root@lisl-ngos-01-pv conf.d]#
[root@lisl-ngos-01-pv conf.d]# cp <IP ADDRESS>.cfg.bkp <IP ADDRESS>.cfg
[root@lisl-ngos-01-pv conf.d]# rm /var/lock/mrtg/mrtg_l -f
[root@lisl-ngos-01-pv conf.d]# time LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
SNMP Error:
no response received
SNMPv2c_Session (remote host: "<IP ADDRESS>" [<IP ADDRESS>].161)
community: "<SNMP COMMUNITY>"
request ID: 144962097
PDU bufsize: 8000 bytes
timeout: 2s
retries: 5
backoff: 1)
at /usr/bin/../lib/mrtg2/SNMP_util.pm line 497
SNMPGET Problem for ifInOctets.1 ifOutOctets.1 on <SNMP COMMUNITY>@<IP ADDRESS>:::::2:v4only
at /usr/bin/mrtg line 2330
2015-01-05 12:35:34: WARNING: skipping because at least the query for ifInOctets.1 on <IP ADDRESS> did not succeed
2015-01-05 12:35:34: WARNING: no data for ifInOctets&ifOutOctets:<SNMP COMMUNITY>@<IP ADDRESS>. Skipping further queries for Host <IP ADDRESS> in this round.
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_1][_IN_] ' $target->[2255]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_1][_OUT_] ' $target->[2255]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_2][_IN_] ' $target->[2256]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_2][_OUT_] ' $target->[2256]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_3][_IN_] ' $target->[2257]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_3][_OUT_] ' $target->[2257]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_4][_IN_] ' $target->[2258]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_4][_OUT_] ' $target->[2258]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_5][_IN_] ' $target->[2259]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_5][_OUT_] ' $target->[2259]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_6][_IN_] ' $target->[2260]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_6][_OUT_] ' $target->[2260]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_7][_IN_] ' $target->[2261]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_7][_OUT_] ' $target->[2261]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_8][_IN_] ' $target->[2262]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_8][_OUT_] ' $target->[2262]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_9][_IN_] ' $target->[2263]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_9][_OUT_] ' $target->[2263]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_10][_IN_] ' $target->[2264]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_10][_OUT_] ' $target->[2264]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_12][_IN_] ' $target->[2265]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_12][_OUT_] ' $target->[2265]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_13][_IN_] ' $target->[2266]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_13][_OUT_] ' $target->[2266]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_14][_IN_] ' $target->[2267]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_14][_OUT_] ' $target->[2267]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_15][_IN_] ' $target->[2268]{$mode} ' did not eval into defined data
2015-01-05 12:38:08: ERROR: Target[<IP ADDRESS>_15][_OUT_] ' $target->[2268]{$mode} ' did not eval into defined data
real 2m38.361s
user 0m8.240s
sys 0m1.219s
[root@lisl-ngos-01-pv conf.d]#
That looks quite a bit more normal, but I can honestly say that aside from network issues or really high load (you seem to keep that down pretty well), there isn't too much that alters mrtg behavior. It looks like this was an over-the-weekend/this-morning issue, correct? Do you know if anything has changed or was having issues throughout the network lately?
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
This was this morning starting from around 8 until about noonish. During that time we had nearly 100% BW usage to our main data center in Illinois. This data center also happens to be where the Nagios server is. I'm thinking that's probably the cause of it, but just wondering if there's anything I can do that would help if this happens again. Not sure if modifying the timeout time or any of that would help or not in this case.
I would tend to agree. mrtg is very network heavy, and that much usage would definitely mess with it a bit! Modifying the timeout or retries (retries preferably) would be a good idea if this is a recurring thing, as it would limit the overall delay that comes purely from retries. In this case, though, it does seem to be more related to high bandwidth usage and the resulting high latency, with queries possibly coming in just under the timeout.
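For anyone wanting to try lower retries, here is a minimal Python sketch of how the per-target SNMP options could be spelled out. It assumes mrtg's extended host syntax is `community@host:port:timeout:retries:backoff:version`, which matches the `<SNMP STRING>@<xxx.xxx.xxx.xxx>:::::2` string in the error output (all fields empty except version 2). The helper name `snmp_target`, the community `public`, and the address `192.0.2.10` are placeholders. Please check the field order against the mrtg reference for your version before relying on it.

```python
def snmp_target(community, host, port="", timeout="", retries="",
                backoff="", version=""):
    """Build the community@host part of an mrtg Target value.

    Assumed field order after the host: port, timeout, retries,
    backoff, version. Empty fields fall back to mrtg's defaults.
    """
    fields = [port, timeout, retries, backoff, version]
    return f"{community}@{host}:" + ":".join(str(f) for f in fields)


# Same shape as the string in the log: only the SNMP version is set.
print(snmp_target("public", "192.0.2.10", version=2))

# Hypothetical tuning: 2 attempts instead of the default 5.
print(snmp_target("public", "192.0.2.10", retries=2, version=2))
```

With 2 attempts at the default 2s timeout, an unreachable device would cost about 4 seconds per failed query instead of about 10, at the price of giving up sooner on devices that are merely slow to answer.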