Error in perfdata.log

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Error in perfdata.log

Post by rajasegar »

Nagios XI 2012R2.9
RHEL 6.5 x64
Manual Install
Firefox 23

Please advise on how to fix this warning.
[06-08-2014 00:12:21] NPCD: WARN: MAX load reached: load 25.390000/10.000000 at i=1
[06-08-2014 00:12:36] NPCD: WARN: MAX load reached: load 19.840000/10.000000 at i=1
[06-08-2014 00:12:51] NPCD: WARN: MAX load reached: load 15.520000/10.000000 at i=1
[06-08-2014 00:13:06] NPCD: WARN: MAX load reached: load 12.310000/10.000000 at i=1
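
To see how far above the limit the load was, the warnings above can be pulled apart with a short script. This is only a sketch: the regular expression is written against the four lines quoted in this post, and the name `parse_max_load` is made up for illustration. Other NPCD versions may format the message differently.

```python
import re

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not an official description of the NPCD log format).
MAX_LOAD_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] NPCD: WARN: MAX load reached: "
    r"load (?P<load>\d+(?:\.\d+)?)/(?P<threshold>\d+(?:\.\d+)?)"
)


def parse_max_load(lines):
    """Return (timestamp, load, threshold) for every MAX load warning."""
    records = []
    for line in lines:
        match = MAX_LOAD_RE.match(line.strip())
        if match:
            records.append(
                (match["timestamp"], float(match["load"]), float(match["threshold"]))
            )
    return records


sample = [
    "[06-08-2014 00:12:21] NPCD: WARN: MAX load reached: load 25.390000/10.000000 at i=1",
    "[06-08-2014 00:13:06] NPCD: WARN: MAX load reached: load 12.310000/10.000000 at i=1",
]

for timestamp, load, threshold in parse_max_load(sample):
    # Show how many times over the configured limit the load was.
    print(f"{timestamp}: load {load:.2f} is {load / threshold:.1f}x the limit")
```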
npcd.zip
5 x Nagios 5.6.9 Enterprise Edition
RHEL 6 & 7
rrdcached & ramdisk optimisation
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Error in perfdata.log

Post by tmcdonald »

That error simply means that your load has gotten too high and as a result npcd has stopped processing data.

How many cores do you have? A good rule of thumb is to take the number of cores you have (really the number of threads) and multiply that by 10, then enter that value in:

/usr/local/nagios/etc/pnp/npcd.cfg
for the "load_threshold" entry. Then restart npcd and it should stop displaying that warning. However, it would be preferable to cut down the load in the first place if possible.
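
The rule of thumb above is simple arithmetic, sketched here in Python. The function name `suggested_load_threshold` is hypothetical, and the factor of 10 is the one suggested in this post, not an official default.

```python
import os


def suggested_load_threshold(threads=None, factor=10.0):
    """Return threads * factor, following the rule of thumb in this post."""
    if threads is None:
        # os.cpu_count() reports logical CPUs (threads) and may return None.
        threads = os.cpu_count() or 1
    return float(threads) * factor


if __name__ == "__main__":
    # Prints a line in the same "key = value" form used for load_threshold.
    print(f"load_threshold = {suggested_load_threshold():.1f}")
```

On a machine with 8 threads this gives 80.0, which is the value to try for "load_threshold" before restarting npcd.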
Former Nagios employee
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Error in perfdata.log

Post by rajasegar »

I have an 8-core VM, one thread per core. CPU utilization is almost always below 80%. It occasionally hovers around 50-60% for a moment.
Anyway, I changed load_threshold to 80.0 as recommended.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Error in perfdata.log

Post by Box293 »

There are a few things that can cause the load to get high.

One of these is when all the service checks run on the same interval (5 minutes for example). Every five minutes the Nagios XI host gets pretty busy.

If you haven't already done so, I suggest looking at the different service checks you have and justifying their check intervals. For example, disk space might only need to be checked every 60 minutes. Also, instead of checking every 60 minutes, try 58 or 62 minutes. This just spreads the load out more.
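
A rough way to see why 58 or 62 helps: if two checks start at the same moment, they next run together after the least common multiple of their intervals. The sketch below makes that simplifying assumption (a shared start time, no scheduler jitter), and `minutes_between_collisions` is a made-up helper name.

```python
from math import gcd


def minutes_between_collisions(interval_a, interval_b):
    """Least common multiple of two check intervals, in minutes."""
    return interval_a * interval_b // gcd(interval_a, interval_b)


# Compare a 5-minute check against a slower check at 60, 58 and 62 minutes.
for slow_interval in (60, 58, 62):
    together_every = minutes_between_collisions(5, slow_interval)
    print(f"5 min vs {slow_interval} min: both run together every {together_every} min")
```

With a 60-minute interval the slow check always lands on top of the 5-minute checks. At 58 or 62 minutes that only happens every 290 or 310 minutes.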
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Error in perfdata.log

Post by rajasegar »

I wish I could do that. Almost all of my checks run at 5-minute intervals, and even that, they say, is too long. :x
For services checked using Java, especially MQ etc., this was a big problem.
I solved it by using service dependencies, which make the service checks run sequentially. CPU usage on the client dropped from 90% to around 15%.
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Error in perfdata.log

Post by rajasegar »

After making the changes, the load dropped considerably.
In fact, it is about half of what it was before.
CPU usage does not seem to have changed much.
10-06-2014 07-30-30 PM.png
How do I find out where these errors are coming from?

[06-10-2014 09:21:59] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1402363297.perfdata.service'
[06-10-2014 10:38:29] NPCD: ERROR: Executed command exits with return code '7'
[06-10-2014 10:38:29] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1402367882.perfdata.host'
[06-10-2014 10:38:29] NPCD: ERROR: Executed command exits with return code '7'
[06-10-2014 10:38:29] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1402367882.perfdata.service'
[06-10-2014 11:13:27] NPCD: ERROR: Executed command exits with return code '7'
[06-10-2014 11:13:27] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1402369982.perfdata.service'
[06-10-2014 11:13:27] NPCD: ERROR: Executed command exits with return code '7'
[06-10-2014 11:13:27] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1402369997.perfdata.service'
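
As a first step toward finding where these errors come from, the log can be tallied by return code and by spool file type. This is a sketch only: both patterns are inferred from the lines quoted above, and `summarize_errors` is a hypothetical helper, not part of Nagios XI or PNP.

```python
import re
from collections import Counter

# Patterns inferred from the NPCD error lines quoted in this post.
RETURN_CODE_RE = re.compile(
    r"NPCD: ERROR: Executed command exits with return code '(\d+)'"
)
SPOOL_FILE_RE = re.compile(
    r"NPCD: ERROR: Command line was '.*?/(\d+)\.perfdata\.(host|service)'"
)


def summarize_errors(lines):
    """Count return codes and spool file kinds (host or service)."""
    codes = Counter()
    kinds = Counter()
    for line in lines:
        code_match = RETURN_CODE_RE.search(line)
        if code_match:
            codes[code_match.group(1)] += 1
            continue
        file_match = SPOOL_FILE_RE.search(line)
        if file_match:
            kinds[file_match.group(2)] += 1
    return codes, kinds


sample = [
    "[06-10-2014 10:38:29] NPCD: ERROR: Executed command exits with return code '7'",
    "[06-10-2014 10:38:29] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1402367882.perfdata.host'",
    "[06-10-2014 10:38:29] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1402367882.perfdata.service'",
]

codes, kinds = summarize_errors(sample)
print("return codes:", dict(codes))
print("spool file kinds:", dict(kinds))
```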
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Error in perfdata.log

Post by abrist »

Enable perfdata debug logging as specified in the FAQ:
http://support.nagios.com/wiki/index.ph ... leshooting
And then wait 5 minutes and post a tail of perfdata.log:

tail -50 /usr/local/nagios/var/perfdata.log
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Error in perfdata.log

Post by rajasegar »

Enabled the logging. So far I have not seen any errors. Will post if the error comes up.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Error in perfdata.log

Post by abrist »

Great, let us know if it recurs.
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Error in perfdata.log

Post by rajasegar »

Found out the errors were due to a timeout.
I increased the timeout to 25 and have not seen the errors since.
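
The thread does not say which file or setting was changed. Assuming the timeout is a `key = value` line like the load_threshold entry mentioned earlier, a small sketch for checking the current value could look like this. The key name `TIMEOUT` and the helper `read_setting` are assumptions for illustration.

```python
def read_setting(config_text, key):
    """Return the value of a 'key = value' line as a string, or None if absent."""
    for raw_line in config_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        if name == key:
            return value
    return None


# Hypothetical excerpt, not copied from a real install.
example = """
# seconds before processing is abandoned (assumed meaning)
TIMEOUT = 25
"""

print("TIMEOUT =", read_setting(example, "TIMEOUT"))
```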