MRTG consumes 100% of system resources

TBT
Posts: 625
Joined: Wed May 18, 2011 1:26 pm

Re: MRTG consumes 100% of system resources

Post by TBT »

scottwilkerson wrote:
TBT wrote:Manually running without User and Group was successful. Timestamp on the files (/var/lib/mrtg) now reflects when ran. Also, the mrtg.lock file is present.

Additionally, we've modified the cron job, removing User and Group, allowing it to run as per schedule. Result was also successful as graphs are updating.

We still don't understand why this affects only 1 of the 9 XI Servers in our environment. Should we modify the cron on all servers and will the User/Group be removed from future XI releases?
Glad to hear that removing that resolved the issue, but frankly I don't know why it did. The user/group was added to the cron to address a security vulnerability, although upgrading the Wizard to the latest version may also mitigate that for future runs.

We will not be removing the user/group in the future. If the wizard is updated on all servers, I would say it is OK to change the cron on all of them.
So are you saying that if we remove the user/group from the cron it will re-introduce the security vulnerability?

I'd like to figure this issue out, as it will be reoccurring in future XI updates. Further suggestions?
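
To see at a glance which servers currently pass the user/group options in the MRTG cron, something like the following could be run on each host. This is only a sketch: the function name `cron_mrtg_flags` is made up, and it assumes the options appear as `--user` / `--group` on a non-comment line of the cron file (the exact cron line is not shown in this thread).

```shell
#!/bin/sh
# cron_mrtg_flags: report whether an MRTG cron file passes --user / --group.
# Hypothetical helper, not part of Nagios XI. Assumes the options are written
# as "--user=NAME" or "--user NAME" (likewise for --group).
cron_mrtg_flags() {
    file=$1
    # Ignore commented-out lines, keep only lines that mention mrtg.
    active=$(grep -v '^[[:space:]]*#' "$file" | grep 'mrtg')
    for opt in user group; do
        if printf '%s\n' "$active" | grep -q -- "--$opt[= ]"; then
            echo "$opt: set"
        else
            echo "$opt: not set"
        fi
    done
}

# Example:
#   cron_mrtg_flags /etc/cron.d/mrtg
```

The check is read-only, so it is safe to run before deciding whether to change anything.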
Nagios XI 2024R2.2.1 (8 Servers)
Nagios Fusion 2024R1.0.2
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

Let's look at a couple of other permissions on the server with the issue:

ls -l /etc/mrtg/mrtg.cfg
ls -l /usr/bin/mrtg
Also, could you verify the mrtg version on the problem server and on one of the good ones:

LANG=C LC_ALL=C /usr/bin/mrtg|head -4
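
Because the same checks are being compared between a problem host and a good host, a one-line-per-file summary may be easier to diff than raw `ls -l` output. A minimal sketch, assuming GNU `stat` is available; the helper name `perm_summary` is made up for illustration:

```shell
#!/bin/sh
# perm_summary: print "MODE OWNER:GROUP PATH" for each file given.
# Hypothetical helper, not part of Nagios XI or MRTG. Requires GNU stat.
perm_summary() {
    for f in "$@"; do
        if [ -e "$f" ]; then
            stat -c '%a %U:%G %n' "$f"
        else
            echo "missing $f"
        fi
    done
}

# Example (run on each host, then diff the two outputs):
#   perm_summary /etc/mrtg/mrtg.cfg /usr/bin/mrtg /etc/cron.d/mrtg
```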
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart

Post by TBT »

ls -l /etc/mrtg/mrtg.cfg
-rwxrwxr-x 1 apache nagios 830 Dec 5 08:48 /etc/mrtg/mrtg.cfg

ls -l /usr/bin/mrtg
-rwxr-xr-x 1 root root 108862 Nov 8 2016 /usr/bin/mrtg

LANG=C LC_ALL=C /usr/bin/mrtg|head -4
mrtg-2.17.4 - Multi Router Traffic Grapher

Post by scottwilkerson »

This all seems as expected.

Can I have you run the following on the troubled machine to attempt to re-apply the changes for mrtg from 5.5.7:

cd /tmp
wget https://assets.nagios.com/downloads/nagiosxi/xi-latest.tar.gz
tar xzf xi-latest.tar.gz
cd /tmp/nagiosxi/subcomponents/mrtg
./upgrade
This will re-add the user and group to /etc/cron.d/mrtg, but it should also make all the necessary permission changes.

Post by TBT »

Before I do that.... so it isn't related to the MRTG version, right?

We're excluding snmptt* mrtg* rrdtool* from Yum when applying OS updates, instead letting those packages update via the Nagios XI upgrade. This was done years ago across all 9 XI hosts, so, yet again, it wouldn't explain why only 1 is experiencing the issue and not the others.

Let me know and I'll proceed with running the upgrade as suggested.

Post by scottwilkerson »

TBT wrote: Before I do that.... so it isn't related to the MRTG version, right?
No, it isn't. You have the correct version provided by Nagios XI.

Post by TBT »

Yet another discovery.

I ran the MRTG upgrade as instructed, then allowed the cron to run. The high usage issue appeared to be resolved; however, the *.rrd files in /var/lib/mrtg were not updating.
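
One way to confirm whether the *.rrd files are being refreshed is to list any that have not been modified recently. A small sketch, where `stale_rrds` is a made-up helper name and the 10-minute default assumes MRTG is scheduled roughly every 5 minutes (the actual schedule is not stated here):

```shell
#!/bin/sh
# stale_rrds: list *.rrd files under DIR last modified more than MINUTES ago.
# Hypothetical helper. The default of 10 minutes is an assumption based on a
# 5-minute cron interval; adjust it to the real schedule.
stale_rrds() {
    dir=$1
    minutes=${2:-10}
    find "$dir" -type f -name '*.rrd' -mmin +"$minutes" -print
}

# Example:
#   stale_rrds /var/lib/mrtg 10
```

Empty output means every .rrd file was touched within the window.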

We then noticed Nagios alerting about a missing RRDtool path. Checking /etc/mrtg/mrtg.cfg, we found that the LibAdd: /opt/rrdtool-1.4.4/lib/perl/5.10.1 line had been removed by the upgrade. This line is present on our other XI hosts. Re-adding the line to the config resolved the path issue, of course. Perhaps you can evaluate why the upgrade script removed it?

Now that the path is back in the mrtg config, the high usage issue returns whenever cron runs. With the User/Group removed from the cron, the high usage does not occur.

We're back at square one. Please advise.
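
For anyone repeating this, a quick check of whether the LibAdd line survived an upgrade could look like the sketch below. `libadd_paths` is a made-up helper, not something shipped with MRTG or Nagios XI, and it only inspects lines that begin with the `LibAdd:` keyword.

```shell
#!/bin/sh
# libadd_paths: print every path named by a "LibAdd:" line in an MRTG config.
# Returns non-zero if the config has no LibAdd line at all.
# Hypothetical helper for illustration only.
libadd_paths() {
    cfg=$1
    # Keep lines starting with the keyword, then strip keyword and whitespace.
    paths=$(grep -i '^[[:space:]]*LibAdd:' "$cfg" \
            | sed 's/^[^:]*:[[:space:]]*//; s/[[:space:]]*$//')
    [ -n "$paths" ] || return 1
    printf '%s\n' "$paths"
}

# Example:
#   libadd_paths /etc/mrtg/mrtg.cfg || echo "no LibAdd line in mrtg.cfg"
```

Taking a copy of mrtg.cfg before running the upgrade and diffing it afterwards would show the same thing along with any other changes.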
Re: MRTG consumes 100% of system resources

Post by scottwilkerson »

I wonder if it could be a permissions issue with the RRDs.pm in the library path you are specifying.

Can you show the output of:

ls -al /opt/rrdtool-1.4.4/lib/perl/5.10.1

Post by TBT »

XI Host with issue.
total 20
drwxr-xr-x 3 root root 4096 Nov 20 2012 .
drwxr-xr-x 3 root root 4096 Nov 20 2012 ..
-r--r--r-- 1 root root 5497 Jul 5 2010 RRDp.pm
drwxr-xr-x 3 root root 4096 Nov 20 2012 x86_64-linux-thread-multi

XI Host without issue.
total 20
drwxr-xr-x 3 root root 4096 Nov 20 2012 .
drwxr-xr-x 3 root root 4096 Nov 20 2012 ..
-r--r--r-- 1 root root 5497 Jul 5 2010 RRDp.pm
drwxr-xr-x 3 root root 4096 Nov 20 2012 x86_64-linux-thread-multi
Re: MRTG consumes 100% of system resources

Post by scottwilkerson »

I've gone over this repeatedly and cannot re-create the issue. Until I can, I would recommend leaving the user/group off the command in the mrtg cron and removing write access for the apache user and nagios group on /etc/mrtg/mrtg.cfg, as this is another way to prevent possible exploitation of the vulnerability.

chmod ug-w /etc/mrtg/mrtg.cfg
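
For reference, with the mode reported earlier for this file (-rwxrwxr-x, i.e. 775), `ug-w` clears only the write bit for owner and group. The sketch below demonstrates that on a scratch file rather than on the real config; it assumes GNU `stat`, and `show_mode_change` is a made-up helper name.

```shell
#!/bin/sh
# show_mode_change: apply a chmod expression to FILE and print "before -> after"
# using octal modes. Hypothetical helper for illustration only.
show_mode_change() {
    mode_expr=$1
    file=$2
    before=$(stat -c '%a' "$file")
    chmod "$mode_expr" "$file"
    after=$(stat -c '%a' "$file")
    echo "$before -> $after"
}

# Demonstration on a throwaway file with the same starting mode as mrtg.cfg.
scratch=$(mktemp)
chmod 775 "$scratch"
show_mode_change ug-w "$scratch"   # prints: 775 -> 555
rm -f "$scratch"
```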
The only other thing I noted is that the file in your lib directory is RRDp.pm rather than the usual RRDs.pm (which is normally found in the default Perl path, so adding the LibAdd directive to the config is not necessary).
Can you run the following on both servers:

locate RRDs.pm

rrdtool -v
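
Note that `locate` depends on its file database being current, so it can miss files if `updatedb` has not run recently. A hedged alternative is to search specific directories directly with `find`. In this sketch `find_perl_module` is a made-up helper, and the system Perl directories in the example are assumptions that may differ per host.

```shell
#!/bin/sh
# find_perl_module: search the given directories for a Perl module file by name.
# Hypothetical helper. Errors from missing or unreadable directories are hidden.
find_perl_module() {
    name=$1
    shift
    find "$@" -type f -name "$name" -print 2>/dev/null
}

# Example (the /usr/... paths are assumed locations, adjust as needed):
#   find_perl_module RRDs.pm /opt/rrdtool-1.4.4/lib/perl /usr/lib64/perl5 /usr/share/perl5
```

Because `find` descends into subdirectories, this also covers the x86_64-linux-thread-multi directory shown in the listing above.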