delay in service check
Posted: Fri Mar 04, 2016 10:56 am
by amit.ahuja
I have a service "File_Check" that checks every minute whether a file exists on multiple servers. But I've noticed a delay in the check process: for some of the servers, it doesn't check every minute.
2016-03-04_10-51-45.png
Re: delay in service check
Posted: Fri Mar 04, 2016 11:27 am
by hsmith
Can we see the configuration for this service check? I imagine the check_interval may be wrong.
Re: delay in service check
Posted: Fri Mar 04, 2016 1:06 pm
by amit.ahuja
Code:
define service {
host_name testbox
service_description File_Check
check_command check_nrpe!check_file_exists!-a /www/html/Keepalive.html!!!!!!
max_check_attempts 3
check_interval 1
retry_interval 2
check_period 24x7
notification_interval 15
contact_groups support
notification_period 24x7
notifications_enabled 0
notification_options w,c,r
_xiwizard nrpe
register 1
}
Re: delay in service check
Posted: Fri Mar 04, 2016 1:46 pm
by rkennedy
Can you please post the definition that relates to 'vews016'? This one appears to be for testbox, and checks a different file.
Re: delay in service check
Posted: Fri Mar 04, 2016 2:14 pm
by amit.ahuja
It's the same configuration as the other hosts. Some hosts are checked every minute, some are not.
Code:
define service {
host_name vews016
service_description File_Check
check_command check_nrpe!check_file_exists!-a /macys.war/macyshc.html!!!!!!
max_check_attempts 3
check_interval 1
retry_interval 2
check_period 24x7
notification_interval 15
contact_groups support
notification_period 24x7
notifications_enabled 0
notification_options w,c,r
_xiwizard nrpe
register 1
}
Re: delay in service check
Posted: Fri Mar 04, 2016 2:17 pm
by hsmith
Can you please post your /usr/local/nagios/etc/nagios.cfg file here for review?
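For a quick look before posting the whole file, the scheduling-related directives can be pulled out with grep. This is only a minimal sketch: the function name show_sched_settings is made up for illustration, and the directive list is a guess at which standard nagios.cfg options are worth checking for check latency, not a complete list.

```shell
#!/bin/sh
# Hypothetical helper: print scheduling-related directives from a nagios.cfg.
# Usage: show_sched_settings [path]  (defaults to the path mentioned above)
show_sched_settings() {
    cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    # Match only uncommented "name=value" lines for a handful of standard directives.
    grep -E '^(interval_length|max_concurrent_checks|service_inter_check_delay_method|check_result_reaper_frequency|max_check_result_reaper_time)=' "$cfg"
}
```

Note that check_interval in a service definition is counted in units of interval_length seconds, so with interval_length=60 a check_interval of 1 means one check per minute.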
Re: delay in service check
Posted: Fri Mar 04, 2016 2:43 pm
by amit.ahuja
sure.
Re: delay in service check
Posted: Fri Mar 04, 2016 3:19 pm
by rkennedy
This looks fine as well. I wonder if something is going on in your system.
Can you PM over a profile? (Admin -> System Profile -> Download Profile)
EDIT: profile received
Re: delay in service check
Posted: Tue Mar 08, 2016 3:49 pm
by rkennedy
Just to confirm, do you have 32G of RAM allocated to this machine?
I am seeing a few errors -
Code:
Mar 4 08:48:04 MA100DLVMON812 nagios: wproc: 'Core Worker 21908' seems to be choked. ret = -1; bufsize = 117: errno = 11 (Resource temporarily unavailable)
Code:
160303 7:30:07 [Warning] Disk is full writing './nagios/nagios_logentries.TMD' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
Additionally, at the top of your processes I saw this -
Code:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 24976 51.6 2.5 984824 844964 pts/2 S+ 16:49 0:07 vim nagios_logentries.MYD
root 17385 0.1 0.7 2897740 233684 ? Sl 2015 358:58 /opt/IBM/ITM/lx8266/lz/bin/klzagent
What is the output of df -H?
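If it helps, the output can also be narrowed down to the filesystems that are nearly full. A rough sketch, assuming POSIX-style `df -P` output; the function name full_filesystems and the 90% threshold are arbitrary choices for illustration, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical helper: list mount points at or above a usage threshold.
# Reads `df -P` style output on stdin so it can be fed canned data as well.
full_filesystems() {
    threshold="${1:-90}"
    # Skip the header line; column 5 is the capacity (e.g. 97%), column 6 the mount point.
    awk -v t="$threshold" 'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 >= t) print $6, $5 }'
}

# Example invocation: df -P | full_filesystems 90
```

Each line of output is a mount point followed by its usage, which makes a full /var or similar easy to spot.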
Re: delay in service check
Posted: Tue Mar 08, 2016 4:42 pm
by amit.ahuja
Yes, I do have 32G allocated to this VM. I saw that /var was full and cleaned it up. I also changed some performance settings and adjusted the reaper settings in nagios.cfg. It's working now.
Thanks