
Re: service check timeouts

Posted: Thu Nov 05, 2015 12:43 pm
by quental
Hi,

The server's IP address is static.

There is no nrpe process running in the background (daemon mode), and we only have one xinetd service running.

I attach the nrpe.cfg config file.

Re: service check timeouts

Posted: Thu Nov 05, 2015 3:37 pm
by rkennedy
In your nrpe.cfg, I noticed allowed_hosts=127.0.0.1. Can you add your Nagios server's IP after it, comma-delimited?

Example -

Code: Select all

allowed_hosts=127.0.0.1,12.34.56.78
Replace 12.34.56.78 with your Nagios server's IP.
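
If you want to confirm the edit took, a minimal sketch like the one below could check whether a given IP appears in the allowed_hosts line. The function name allowed_hosts_has is made up for this illustration, and it only does an exact string match, so hostnames or network ranges in the list would not be recognized.

```shell
# allowed_hosts_has FILE IP
# Hypothetical helper: returns 0 if IP is listed verbatim in the
# allowed_hosts= line of FILE, non-zero otherwise.
allowed_hosts_has() {
  cfg="$1"
  ip="$2"
  # Take the allowed_hosts line, drop the key, split on commas,
  # strip blanks, then look for an exact whole-line match.
  grep -E '^[[:space:]]*allowed_hosts[[:space:]]*=' "$cfg" \
    | sed 's/^[^=]*=//' \
    | tr ',' '\n' \
    | tr -d ' \t' \
    | grep -Fxq "$ip"
}
```

For example, `allowed_hosts_has /path/to/nrpe.cfg 12.34.56.78 && echo listed`, with the path adjusted to wherever your nrpe.cfg lives.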

Additionally, can you run the following command and post the output -

Code: Select all

cat /etc/xinetd.d/nrpe|grep only_from

Re: service check timeouts

Posted: Fri Nov 06, 2015 5:26 am
by quental
Hi,
We added the Nagios server IP in our nrpe.cfg
This is the result of the command:

cat /etc/xinetd.d/nrpe|grep only_from

only_from = 127.0.0.1 192.168.247.128

Re: service check timeouts

Posted: Fri Nov 06, 2015 10:48 am
by rkennedy
Did you restart the NRPE service after? Did it start working?

If it's running under xinetd -

Code: Select all

service xinetd restart

Re: service check timeouts

Posted: Mon Nov 16, 2015 5:50 am
by quental
Yes, we restarted the NRPE service after the change. It's running under xinetd.
The problem happened again after the change.
Some days the checks run fine, but on other days they fail.

Re: service check timeouts

Posted: Mon Nov 16, 2015 12:11 pm
by tgriep
Can you run the following on the system that is having problems and post back the results?

Code: Select all

ulimit -a
su nagios
ulimit -a
Thanks
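
As a side note, the limits shown by ulimit in a login shell are not necessarily the ones a daemon runs with, since limits.conf is applied at PAM session setup. On a Linux system with /proc/PID/limits available, a rough sketch like this could read the open-files limit of the running process itself. The function name nofile_limits is an illustration only, not part of any Nagios tooling.

```shell
# nofile_limits PID
# Hypothetical helper: prints the soft and hard "Max open files"
# limits of a running process, as reported by the Linux kernel.
nofile_limits() {
  # The line looks like: "Max open files  <soft>  <hard>  files"
  awk '/^Max open files/ { print $4, $5 }' "/proc/$1/limits"
}
```

For example, `nofile_limits "$(pgrep -o xinetd)"` should print two values for the oldest xinetd process. Reading another user's process may require root.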

Re: service check timeouts

Posted: Tue Nov 17, 2015 10:33 am
by quental
This is the result of the command:

address space limit (kbytes) (-M) unlimited
core file size (blocks) (-c) 0
cpu time (seconds) (-t) unlimited
data size (kbytes) (-d) unlimited
file size (blocks) (-f) unlimited
locks (-L) unlimited
locked address space (kbytes) (-l) 32
nice (-e) 0
nofile (-n) 4096
nproc (-u) 773390
pipe buffer size (bytes) (-p) 4096
resident set size (kbytes) (-m) unlimited
rtprio (-r) 0
socket buffer size (bytes) (-b) 4096
stack size (kbytes) (-s) 10240
threads (-T) not supported
process size (kbytes) (-v) unlimited

Re: service check timeouts

Posted: Tue Nov 17, 2015 3:06 pm
by tgriep
Can you post the following files so we can review them?

Code: Select all

/etc/security/limits.conf
And the contents of this folder

Code: Select all

/etc/security/limits.d
Next time you get this error, can you check the log files in /var/log to see if there are any other errors that could be helpful?
Thanks

Re: service check timeouts

Posted: Wed Nov 18, 2015 9:06 am
by quental

Code: Select all

ets@prometeo:/etc/security # cat limits.conf 
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - an user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open files
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to
#        - rtprio - max realtime priority
#
#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#@student        -       maxlogins       4
*       soft    nofile  4096
*       hard    nofile  4096
# End of file

Code: Select all

ets@prometeo:/etc/security/limits.d # ls -l
total 0
OK, next time I'll check the log files in /var/log.

Re: service check timeouts

Posted: Wed Nov 18, 2015 2:59 pm
by tgriep
So far everything looks good. It is hard to track down something that fails intermittently.
The only other thing I can think of is that the connections to the NRPE agent are not being closed, and that could cause it to fail.
Can you run the following and post back here?

Code: Select all

netstat -an
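
If the full output is too long to read through, something like the sketch below could summarize it by counting TCP connection states on the NRPE port. It assumes the usual Linux netstat column layout (local address in column 4, state in column 6) and the default NRPE port 5666. The function name summarize_port_states is made up for this example.

```shell
# summarize_port_states PORT
# Hypothetical helper: reads `netstat -an` output on stdin and prints
# one "count STATE" line per TCP state seen on the given local port.
summarize_port_states() {
  port="$1"
  awk -v port="$port" '
    # Keep TCP rows whose local address ends in :PORT, tally by state.
    $1 ~ /^tcp/ && $4 ~ (":" port "$") { count[$6]++ }
    END { for (s in count) print count[s], s }
  ' | sort -k2
}
```

Run it as `netstat -an | summarize_port_states 5666`. A large and growing number of connections stuck in one state (for example CLOSE_WAIT) at the time the checks fail would point toward connections not being released.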