
IMUXSOCK Begins to drop messages

Posted: Wed Jan 22, 2014 1:55 pm
by isadmin
I am receiving these errors in our logs....
IMUXSOCK begins to drop messages from PID due to rate-limiting
We are also receiving warnings about checks.

Re: IMUXSOCK Begins to drop messages

Posted: Wed Jan 22, 2014 2:06 pm
by lmiltchev
Have you tried following the steps outlined in this post?

Re: IMUXSOCK Begins to drop messages

Posted: Wed Jan 22, 2014 2:15 pm
by isadmin
There was a lot in that post; I wasn't sure whether it pertained or was the same issue.
I don't have temp files building up or any messages about orphaned hosts.

Re: IMUXSOCK Begins to drop messages

Posted: Wed Jan 22, 2014 2:24 pm
by slansing
I believe that post still applies here, since this issue is due to rate limiting. Have you tried raising the limits?
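
As a rough sketch of how to see what is currently set: on systems where the message comes from rsyslog's imuxsock module, the system log socket limits are controlled by the legacy directives `$SystemLogRateLimitInterval` and `$SystemLogRateLimitBurst`. The default config path below is an assumption, and `show_rate_limits` is just an illustrative helper name.

```shell
#!/bin/sh
# Sketch: list the imuxsock rate-limit directives set in an rsyslog config file.
# The default path is an assumption; pass another file as the first argument.

show_rate_limits() {
    conf="$1"
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf"
        return 1
    fi
    # Match active (not commented-out) legacy directives only.
    grep -E '^[[:space:]]*\$SystemLogRateLimit(Interval|Burst)' "$conf" \
        || echo "no rate-limit directives set in $conf (rsyslog defaults apply)"
}

show_rate_limits "${1:-/etc/rsyslog.conf}" || true
```

If nothing is set, the usual adjustment is to add those two directives with a larger burst value (an interval of 0 should turn rate limiting off) and restart rsyslog. Raising the limit only hides the flood, though, so it is still worth finding out what is logging so heavily.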

Re: IMUXSOCK Begins to drop messages

Posted: Wed Jan 22, 2014 2:27 pm
by lmiltchev
Hm-m, I've seen similar errors when temp files are building up...
http://support.nagios.com/forum/viewtop ... 16&t=11411

What is the output of the following command?

Code:

ulimit -a

Are you having any issues with perfdata processing?

Code:

ls /usr/local/nagios/var/spool/xidpe | wc -l
ls /usr/local/nagios/var/spool/perfdata | wc -l
ls /usr/local/nagios/var/spool/checkresults | wc -l
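
The same check can be sketched as a small loop. This version uses `find` rather than `ls`, which tends to cope better when a directory holds a very large number of files; the `SPOOL` variable and the `count_files` helper are names made up for this example.

```shell
#!/bin/sh
# Sketch: count regular files in each Nagios XI spool directory.
# find is used instead of ls so very large directories are not sorted first.

count_files() {
    # Print the number of regular files directly inside directory $1 (0 if missing).
    if [ -d "$1" ]; then
        find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
    else
        echo 0
    fi
}

SPOOL="${SPOOL:-/usr/local/nagios/var/spool}"
for d in xidpe perfdata checkresults; do
    printf '%-14s %s\n' "$d" "$(count_files "$SPOOL/$d")"
done
```

Unlike `ls | wc -l`, this counts only regular files, so subdirectories are not included in the totals.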

Re: IMUXSOCK Begins to drop messages

Posted: Wed Jan 22, 2014 2:34 pm
by isadmin
Just added. Here are the results, but I am still receiving the error:

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 37599
max locked memory (kbytes, -l) 128
max memory size (kbytes, -m) unlimited
open files (-n) 4096
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 20480
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

Re: IMUXSOCK Begins to drop messages

Posted: Wed Jan 22, 2014 2:35 pm
by isadmin
and
ls /usr/local/nagios/var/spool/xidpe | wc -l
2
ls /usr/local/nagios/var/spool/perfdata | wc -l
197089
ls /usr/local/nagios/var/spool/checkresults | wc -l
1172

Re: IMUXSOCK Begins to drop messages

Posted: Wed Jan 22, 2014 2:53 pm
by lmiltchev
You have way too many files in "/usr/local/nagios/var/spool/perfdata":
ls /usr/local/nagios/var/spool/perfdata | wc -l
197089
This means that npcd is not (or was not) running and processing these files. What is the load on the system? Maybe it exceeded the load_threshold specified in the "/usr/local/nagios/etc/pnp/npcd.cfg" file. Check the logs to see if you can find more info:

Code:

tail -50 /usr/local/nagios/var/npcd.log
tail -50 /usr/local/nagios/var/perfdata.log

If you don't care about losing the old perfdata (whatever you have in the "/usr/local/nagios/var/spool/perfdata/" at the moment), you can clean up these files:

Code:

cd /usr/local/nagios/var/spool
rm -rf perfdata
mkdir perfdata
chown nagios:nagios perfdata
chmod 755 perfdata
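
To check the load question directly, here is a minimal sketch that compares the current 1-minute load average with the load_threshold value from npcd.cfg. It assumes a Linux host with /proc/loadavg and a `load_threshold = N` line in the config; `get_threshold`, `load_exceeds` and the `CFG` variable are names invented for this example.

```shell
#!/bin/sh
# Sketch: compare the 1-minute load average with npcd's load_threshold.

get_threshold() {
    # Print the value of "load_threshold = N" from config file $1 (empty if unset).
    awk -F= '/^[ \t]*load_threshold[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$1"
}

load_exceeds() {
    # Succeed when load $1 is greater than threshold $2 (numeric compare).
    awk -v l="$1" -v t="$2" 'BEGIN { exit !(l + 0 > t + 0) }'
}

CFG="${CFG:-/usr/local/nagios/etc/pnp/npcd.cfg}"
if [ -r "$CFG" ] && [ -r /proc/loadavg ]; then
    threshold=$(get_threshold "$CFG")
    load=$(cut -d' ' -f1 /proc/loadavg)
    if [ -n "$threshold" ] && load_exceeds "$load" "$threshold"; then
        echo "load $load is above load_threshold $threshold"
    else
        echo "load $load, load_threshold ${threshold:-unset}"
    fi
fi
```

If the load stays above the threshold, npcd will keep logging "MAX load reached" warnings and the spool directory will keep growing, so the cleanup above would only help temporarily.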

Re: IMUXSOCK Begins to drop messages

Posted: Wed Jan 22, 2014 2:56 pm
by isadmin
Yes, the log says the perfdata max load was reached, with a warning. Deleting the files...

Re: IMUXSOCK Begins to drop messages

Posted: Wed Jan 22, 2014 2:58 pm
by isadmin
NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1388684723.perfdata.host'
[01-22-2014 12:40:26] NPCD: ERROR: Executed command exits with return code '7'
[01-22-2014 12:40:26] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1388684731.perfdata.service'
[01-22-2014 12:40:26] NPCD: ERROR: Executed command exits with return code '7'
[01-22-2014 12:40:26] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1388684708.perfdata.host'
[01-22-2014 12:40:26] NPCD: WARN: MAX load reached: load 10.320000/10.000000 at i=181984
[01-22-2014 12:40:54] NPCD: WARN: MAX load reached: load 10.390000/10.000000 at i=182149
[01-22-2014 12:41:09] NPCD: WARN: MAX load reached: load 14.490000/10.000000 at i=182149
[01-22-2014 12:41:24] NPCD: WARN: MAX load reached: load 14.040000/10.000000 at i=182149
[01-22-2014 12:41:39] NPCD: WARN: MAX load reached: load 11.970000/10.000000 at i=182149
[01-22-2014 12:43:09] NPCD: WARN: MAX load reached: load 11.040000/10.000000 at i=183124
[01-22-2014 12:43:24] NPCD: WARN: MAX load reached: load 11.420000/10.000000 at i=183124
[01-22-2014 12:43:44] NPCD: WARN: MAX load reached: load 11.490000/10.000000 at i=183139
[01-22-2014 12:43:59] NPCD: WARN: MAX load reached: load 12.630000/10.000000 at i=183139
[01-22-2014 12:44:14] NPCD: WARN: MAX load reached: load 12.270000/10.000000 at i=183139
[01-22-2014 12:44:58] NPCD: ERROR: Executed command exits with return code '7'
[01-22-2014 12:44:58] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1388696033.perfdata.host'
[01-22-2014 12:44:58] NPCD: ERROR: Executed command exits with return code '7'
[01-22-2014 12:44:58] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b

2014-01-22 13:20:02 [29911] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-01-22 13:20:02 [29913] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-01-22 13:20:02 [29911] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-01-22 13:20:02 [29913] [0] *** TIMEOUT: Please check your npcd.cfg
2014-01-22 13:20:02 [29911] [0] *** TIMEOUT: Please check your npcd.cfg
2014-01-22 13:20:05 [29913] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1388881528.perfdata.service-PID-29913 deleted
2014-01-22 13:20:05 [29911] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1388881528.perfdata.host-PID-29911 deleted
2014-01-22 13:20:05 [29913] [0] *** Timeout while processing Host: "vm-db3-prod" Service: "_oracle_flash_recovery_area_Free_Space"
2014-01-22 13:20:05 [29911] [0] *** Timeout while processing Host: "advid52" Service: "_HOST_"
2014-01-22 13:20:05 [29913] [0] *** process_perfdata.pl terminated on signal ALRM
2014-01-22 13:20:05 [29911] [0] *** process_perfdata.pl terminated on signal ALRM
2014-01-22 13:20:14 [30141] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-01-22 13:20:14 [30141] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-01-22 13:20:14 [30141] [0] *** TIMEOUT: Please check your npcd.cfg
2014-01-22 13:20:14 [30137] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-01-22 13:20:14 [30137] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-01-22 13:20:14 [30137] [0] *** TIMEOUT: Please check your npcd.cfg
2014-01-22 13:20:14 [30140] [0] *** TIMEOUT: Timeout after 5 secs. ***