check_nt intermittent timeouts
Posted: Thu Oct 09, 2014 9:37 am
Hi all,
We have one host that is monitored with several service checks.
Among those are Windows services, which are checked using check_xi_service_nsclient (i.e. check_nt).
During an average day I get quite a few critical alerts reporting "CRITICAL - Socket timeout after 10 seconds", followed by an OK within one or two checks.
I've tinkered with the "-t" parameter, raising it up to 60 seconds, to no avail.
I've restarted the NSClient service a few times; ditto.
I need to point out that this particular Windows host is the only one at this site. The other hosts are VMS servers.
We're in the process of migrating to Nagios. The old monitoring tool (OpManager) is not reporting any of this behavior and inspection of the server shows nothing abnormal.
We use this host as a gateway between Nagios Eventlog and the VMS hosts. Those checks are OK all the time.
I should think that there is something the matter with Nagios itself or at least with the check_nt plugin.
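One way to test that suspicion independently of Nagios would be to time plain TCP connections from the Nagios server to the NSClient listener. The following is only a minimal sketch under stated assumptions: port 12489 is the usual NSClient++ default and may not match this host, the names `probe` and `summarize` are made up for illustration, and the script does not speak the check_nt protocol, so it only shows whether the connection itself stalls.

```python
import socket
import time


def probe(host, port=12489, timeout=10.0):
    """Attempt one TCP connect and return (ok, seconds_elapsed).

    The default port is an assumption (common NSClient++ default);
    use whatever port the agent actually listens on.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        ok = False
    return ok, time.monotonic() - start


def summarize(results, slow_after=1.0):
    """Count failed and slow probes in a list of (ok, elapsed) tuples."""
    failed = sum(1 for ok, _ in results if not ok)
    slow = sum(1 for ok, elapsed in results if ok and elapsed > slow_after)
    return {"total": len(results), "failed": failed, "slow": slow}
```

Calling `probe("staame-veeam01")` in a loop with a short sleep between attempts, then passing the collected tuples to `summarize`, would show whether connects fail or slow down at the same times Nagios reports the socket timeout. If they never do, the delay is more likely in the agent answering the request than in the network path.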
Here is an extract of the services config. I haven't listed all of them, just these two:
STVMS4-Warning (100% OK)
Uptime (random timeouts)
I see nothing out of the ordinary.
XI is 2012R2.9
Core version is 3.5.0
Code:
define service {
    host_name               staame-veeam01
    service_description     STVMS4-Warning
    use                     xiwizard_windowseventlog_service
    max_check_attempts      1
    check_interval          1
    retry_interval          1
    check_period            24x7
    notification_interval   1
    notification_period     standbyuren
    notification_options    w,
    notifications_enabled   0
    contacts                Team 3
    stalking_options        o,w,c,u,
    icon_image              windowseventlog.png
    _xiwizard               windowseventlog
    register                1
}
define service {
    host_name               staame-veeam01
    service_description     Uptime
    use                     xiwizard_windowsserver_nsclient_service
    check_command           check_xi_service_nsclient!!UPTIME!!!!!!
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            24x7
    notification_interval   60
    notification_period     standbyuren
    notifications_enabled   0
    contacts                Team 1
    _xiwizard               windowsserver
    register                1
}
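For reference, the check_command value in the Uptime service is split on "!" into the command name and the $ARG1$, $ARG2$, ... macros. The sketch below only illustrates that splitting; the helper name is hypothetical, it ignores escaped "\!" sequences, and it says nothing about how the check_xi_service_nsclient command definition maps those macros onto check_nt options (including "-t"), since that definition is not shown here.

```python
def split_check_command(check_command):
    """Split a check_command value into (command_name, {"ARGn": value}).

    Simplified illustration: a plain split on "!", without handling
    escaped "\\!" sequences.
    """
    name, *args = check_command.split("!")
    return name, {f"ARG{i}": value for i, value in enumerate(args, start=1)}


name, args = split_check_command("check_xi_service_nsclient!!UPTIME!!!!!!")
print(name)
for macro, value in args.items():
    # Empty strings are arguments left blank in the service definition.
    print(f"${macro}$ = {value!r}")
```

With the value above, $ARG2$ is UPTIME and every other argument, including $ARG1$, is empty.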