service check timeouts

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Re: service check timeouts

Post by quental »

Hi,

The server's IP address is static.

There is no nrpe process running in the background as a daemon, and we have only one xinetd service running.

I attach the nrpe.cfg config file.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: service check timeouts

Post by rkennedy »

In your nrpe.cfg, I noticed allowed_hosts=127.0.0.1. Can you add your Nagios server's IP after it, comma-delimited?

Example -

Code: Select all

allowed_hosts=127.0.0.1,12.34.56.78
Replace 12.34.56.78 with your Nagios server's IP.

Additionally, can you run the following command and post the output -

Code: Select all

cat /etc/xinetd.d/nrpe|grep only_from
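
Both checks can be scripted if that is easier. This is only a minimal sketch: `lists_ip` is a hypothetical helper name, the nrpe.cfg path in the usage lines is an assumption (adjust it to your install), and it matches literal addresses only, not CIDR ranges or hostnames.

```shell
# Exit 0 if IP ($2) is listed under KEY ($1) in FILE ($3), else exit 1.
# Matches literal addresses only; CIDR ranges and hostnames are not resolved.
lists_ip() {
    awk -v key="$1" -v ip="$2" '
        $0 ~ ("^[ \t]*" key "[ \t]*=") {
            sub(/^[^=]*=/, "")                 # drop everything up to the first "="
            n = split($0, hosts, /[, \t]+/)    # commas (nrpe.cfg) or spaces (xinetd)
            for (i = 1; i <= n; i++)
                if (hosts[i] == ip) found = 1
        }
        END { exit !found }
    ' "$3"
}

# Usage (paths are assumptions, 12.34.56.78 stands for your Nagios server's IP):
# lists_ip allowed_hosts 12.34.56.78 /usr/local/nagios/etc/nrpe.cfg && echo "nrpe.cfg OK"
# lists_ip only_from 12.34.56.78 /etc/xinetd.d/nrpe && echo "xinetd OK"
```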
Former Nagios Employee
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Re: service check timeouts

Post by quental »

Hi,
We added the Nagios server IP to our nrpe.cfg.
This is the result of the command:

cat /etc/xinetd.d/nrpe|grep only_from

only_from = 127.0.0.1 192.168.247.128
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: service check timeouts

Post by rkennedy »

Did you restart the NRPE service after? Did it start working?

If it's running under xinetd -

Code: Select all

service xinetd restart
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Re: service check timeouts

Post by quental »

Yes, we restarted the NRPE service after the change. It's running under xinetd.
The problem happened again after the change.
Some days the checks run fine, but on other days they fail.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: service check timeouts

Post by tgriep »

Can you run the following on the system that is having problems and post back the results?

Code: Select all

ulimit -a
su nagios
ulimit -a
Thanks
Be sure to check out our Knowledgebase for helpful articles and solutions!
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Re: service check timeouts

Post by quental »

This is the result of the command:

address space limit (kbytes) (-M) unlimited
core file size (blocks) (-c) 0
cpu time (seconds) (-t) unlimited
data size (kbytes) (-d) unlimited
file size (blocks) (-f) unlimited
locks (-L) unlimited
locked address space (kbytes) (-l) 32
nice (-e) 0
nofile (-n) 4096
nproc (-u) 773390
pipe buffer size (bytes) (-p) 4096
resident set size (kbytes) (-m) unlimited
rtprio (-r) 0
socket buffer size (bytes) (-b) 4096
stack size (kbytes) (-s) 10240
threads (-T) not supported
process size (kbytes) (-v) unlimited
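
Note that `ulimit -a` reports the limits of the shell it is run in, and a daemon started at boot can run with different ones. A rough sketch for reading the open-file limit of the running xinetd process, assuming a Linux /proc filesystem and that `pgrep` is installed; `open_file_limits` is a hypothetical helper name.

```shell
# Print the soft and hard "Max open files" limits from a /proc/<pid>/limits file.
open_file_limits() {
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$1"
}

# Usage (assumes Linux /proc and pgrep; -o picks the oldest matching process):
# open_file_limits "/proc/$(pgrep -o xinetd)/limits"
```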
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: service check timeouts

Post by tgriep »

Can you post the following files so we can review them?

Code: Select all

/etc/security/limits.conf
And the contents of this folder

Code: Select all

/etc/security/limits.d
Next time you get this error, can you check the log files in /var/log to see if there are any other errors that could be helpful?
Thanks
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Re: service check timeouts

Post by quental »

Code: Select all

ets@prometeo:/etc/security # cat limits.conf 
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - an user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open files
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to
#        - rtprio - max realtime priority
#
#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#@student        -       maxlogins       4
*       soft    nofile  4096
*       hard    nofile  4096
# End of file

Code: Select all

ets@prometeo:/etc/security/limits.d # ls -l
total 0
OK, next time I'll check the log files in /var/log.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: service check timeouts

Post by tgriep »

So far everything looks good. It is hard to track down something that fails intermittently.
The only other thing I can think of is that connections to the NRPE agent are not getting dropped, and that could cause it to fail.
Can you run the following and post the output here?

Code: Select all

netstat -an
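
If the full output is long, the connections on the NRPE port can be grouped by state first. This is a minimal sketch, assuming NRPE listens on its default port 5666 and that netstat prints Linux-style columns (local address in column 4, state in column 6); `nrpe_conn_states` is a hypothetical helper name.

```shell
# Summarize TCP connection states for the NRPE port from `netstat -an` output.
# Reads netstat output on stdin; the port defaults to 5666 and can be passed as $1.
nrpe_conn_states() {
    port="${1:-5666}"
    awk -v port="$port" '
        # Keep TCP lines whose local address (field 4) ends in :port or .port
        $1 ~ /^tcp/ && $4 ~ ("[:.]" port "$") { count[$6]++ }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Usage:
# netstat -an | nrpe_conn_states
```

A large and growing count for a single state, such as CLOSE_WAIT, would be consistent with connections not being dropped.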