check_rrdtraf

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
SDK
Posts: 45
Joined: Wed Mar 21, 2012 4:23 pm

Re: check_rrdtraf

Post by SDK »

abrist wrote:You may want to check your perfdata and npcd logs for load/timeout warnings:

Code:

tail -25 /usr/local/nagios/var/perfdata.log
tail -25 /usr/local/nagios/var/npcd.log

Hi Abrist,

There weren't any timeouts. I split up the mrtg.cfg and now run multiple instances of MRTG. Since then I haven't had these issues in any of my graphs.

Kind regards

Dominik
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: check_rrdtraf

Post by abrist »

Looks like it was a file lock issue then. This is good to know, as others with large installs may experience the same problem. There was some chat around the office about modernizing the MRTG approach in XI. We will see what happens.

It may be worthwhile to open up a bug report about this at: tracker.nagios.com
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
SDK
Posts: 45
Joined: Wed Mar 21, 2012 4:23 pm

Re: check_rrdtraf

Post by SDK »

Yeah, it definitely was the previous MRTG run not finishing within the 5-minute window, which left the following MRTG instance unable to run because of the lock file. We have a 4 vCPU setup with modern Xeon processors underneath it, so it's not the slowest hardware.

With the config file split up and multiple instances plus the "Forks:" option, it now finishes within 15-30 seconds.
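
For anyone wanting to try the same split, a wrapper along these lines might work. It is only a sketch under assumptions: the directory /etc/mrtg/conf.d, the variable names MRTG_BIN and CFG_DIR, and the function name run_all_mrtg are made up for illustration and are not part of a stock Nagios XI install. The only MRTG feature it relies on is the --lock-file command-line option.

```shell
#!/bin/sh
# Sketch only: the directory and the variable names are assumptions,
# not part of a stock Nagios XI install.
MRTG_BIN="${MRTG_BIN:-mrtg}"
CFG_DIR="${CFG_DIR:-/etc/mrtg/conf.d}"

# Start one MRTG run per split config file and wait for all of them.
run_all_mrtg() {
    for cfg in "$CFG_DIR"/*.cfg; do
        [ -f "$cfg" ] || continue   # skip if the glob matched nothing
        # A separate lock file per config means a slow run of one file
        # only delays that file, not every other instance.
        # LANG=C avoids MRTG's complaint about UTF-8 locales.
        env LANG=C "$MRTG_BIN" --lock-file "${cfg}.lock" "$cfg" &
    done
    wait
}
```

Calling run_all_mrtg from the five-minute cron job, in place of the single mrtg line, would run all split configs in parallel. How many files to split into depends on the hardware and the number of ports.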

I'd like to give some input regarding the bandwidth monitoring approach of Nagios XI.

1. When using the switch wizard, it creates an mrtg.cfg with all the ports it discovered. So even if you deselect ports in the web frontend because you don't want to monitor them, MRTG polls them nonetheless. This is suboptimal, especially for big environments.

2. More of a general problem: bandwidth monitoring doubles the I/O, because MRTG saves its results in its own RRD files, and then Nagios reads those files and saves the values in its own RRD files.

3. I have noticed that with the default Nagios XI install, MRTG is set up without the "Forks:" option in the mrtg.cfg. The right value may be difficult to tune depending on the hardware, but it speeds up polling in large environments because MRTG forks itself into several processes for the polling jobs.
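
To see whether a given mrtg.cfg already sets the option from point 3, a small check like the following could be used. It is a sketch: the function name forks_setting is made up for illustration, and it assumes the keyword is written as "Forks:" at the start of a line, followed by a number.

```shell
# Print the value of the global "Forks:" keyword in an MRTG config,
# or "unset" when the keyword is missing or commented out.
forks_setting() {
    value=$(sed -n 's/^Forks:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1)
    echo "${value:-unset}"
}
```

Run it as forks_setting /etc/mrtg/mrtg.cfg (the path is an assumption and may differ on your system). If it prints "unset", adding a line such as "Forks: 4" to the global section of the file enables forking; 4 is only an example value and would need tuning for the hardware.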

Kind regards

Dominik
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: check_rrdtraf

Post by sreinhardt »

1) I completely agree. If you choose not to monitor them, there is no reason for us to collect the data.
2) I believe there is a reason for this, although I don't recall it offhand. Possibly something similar to your issue in this thread with file locking.
3) Nice to know. I will be sure to note this for our developer looking into the MRTG changes.

Thanks for the response! I would agree that all of these are valid points. I know abrist has already sent an email, but we will be sure to bring it up in the development meetings. If you have anything more to add, we would love to hear it!
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
SDK
Posts: 45
Joined: Wed Mar 21, 2012 4:23 pm

Re: check_rrdtraf

Post by SDK »

Hello sreinhardt,

thank you for the feedback!

I am not so sure about opening a bug ticket. For me it is more of a scalability issue than a bug. I am quite happy that this topic is on the Nagios XI developers' radar and would love to see some innovations around it in the future.

Kind regards

Dominik
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: check_rrdtraf

Post by abrist »

I sent out an email with a few posts from this thread to our technical team. This will most likely not be a quick change, if it happens at all. But IMHO, your points are valid concerns for large installations.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: check_rrdtraf

Post by scottwilkerson »

Actually, most of this is in process and should all be addressed in the next major release.
Former Nagios employee