NRPE Socket timeout after 10 seconds
Re: NRPE Socket timeout after 10 seconds
I would verify all of the definitions you have in place, because it's pretty apparent that the -t added at the object level isn't being respected. Adding -t 60 will fix it, but it needs to be in the proper place.
Former Nagios Employee
Re: NRPE Socket timeout after 10 seconds
I only have them in the commands file
Where else should I look?
Re: NRPE Socket timeout after 10 seconds
You can add the -t timeout to the check_nrpe command definition in the commands.cfg file, so that all of the checks that use the check_nrpe command will have their timeout increased.
But, if you are still having problems, please post how the check_nrpe command is defined in the commands.cfg file as well as the service check and then we can go from there.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: NRPE Socket timeout after 10 seconds
I do have the -t on the commands
Code: Select all
define command{
    command_name    check_nrpe
    command_line    /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -t 60 -c $ARG1$ $ARG2$ $ARG3$ $ARG4$
}
define command{
    command_name    check_nrpe_test
    command_line    /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -t 60 -c $ARG1$ $ARG2$ $ARG3$ $ARG4$ > /tmp/yourlog.txt
}
define command{
    command_name    check_mem
    command_line    $USER1$/check_mem.sh
}
define command{
    command_name    check_users
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -c $ARG1$ -a $ARG2$ $ARG3$
}
define command{
    command_name    check_windows_users
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -c check_users -a 2 3 "$_HOSTALLOWEDUSERS$"
}
define command{
    command_name    check_ms_win_updates
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -c check_ms_win_updates -a '-wd 15 -cd 30 -M PSWindowsUpdate'
}
define command{
    command_name    check_uptime
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -c CheckUpTime -a MaxCrit=90d
}
define command{
    command_name    cpu_load
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -c CheckCPU -a warn=80 crit=90 time=1m time=5m time=15m
}
define command{
    command_name    mem_check
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -c CheckMEM -a warn=80 crit=90 time=1m time=5m time=15m
}
define command{
    command_name    service_check
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -c CheckServiceState -a $ARG1$
}
Sample of my services
Code: Select all
define service {
    host_name              TGCS008
    service_description    Check Disk Usage J:
    check_command          check_nrpe!CheckDriveSize! -a ShowAll=long MinWarn=20% MinCrit=10% Drive=J: perf-unit=G
    check_interval         1
    servicegroups          DriveSpace
    use                    generic-service
}
define service {
    host_name              TGCS009
    service_description    Check Disk Usage
    check_command          check_nrpe!CheckDriveSize! -a ShowAll=long MinWarn=20% MinCrit=10% Drive=C: perf-unit=G
    check_interval         1
    servicegroups          DriveSpace
    use                    generic-service
}
define service {
    host_name              TGCS001
    service_description    Check OS Version
    check_command          check_nrpe!CheckWMI! -a "Query=Select Version,Caption from win32_OperatingSystem" columnSyntax="%value%" columnSeparator=", " ignore-perf-data
    servicegroups          OSVersion
    check_interval         1
    use                    generic-service
}
define service {
    host_name              TGCS002
    service_description    Check OS Version
    check_command          check_nrpe!Check_OS_Version! -a "perf-config=*(ignored:true)"
    servicegroups          OSVersion
    check_interval         1
    use                    generic-service
}
Sometimes I see 60 seconds, but for the most part I see 10-second timeouts.
This happens on all my VMs, not on the physical hosts. The physical servers and computers I have running are fine; it is just the VMs.
Running VMware ESXi 6.0 hosts.
Thanks
Tom
Re: NRPE Socket timeout after 10 seconds
Guys, an update:
I get this a lot too:
Check Application Event Logs UNKNOWN 02-20-2017 21:13:52 0d 0h 2m 14s 1/3 CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
Again, only on my VMs.
Thoughts?
Re: NRPE Socket timeout after 10 seconds
The timeout settings for your commands look correct at 60 seconds, so I don't know where the 10-second setting is coming from, unless one of the commands was missed.
You may want to check the nagios.cfg file and see if the default service timeout is set higher than 10 seconds.
The directive you should be looking at is service_check_timeout.
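For reference, a minimal sketch of what that line looks like in nagios.cfg. The value 60 is only an assumption chosen to match the -t 60 used in the commands posted earlier in the thread; use whatever fits your setup. As far as I know, the "Socket timeout after 10 seconds" text comes from check_nrpe itself, which falls back to 10 seconds when no -t is passed, so this directive is a separate limit on top of the plugin timeout.

```
# nagios.cfg (main configuration file)
# Maximum number of seconds Nagios allows a service check to run.
# 60 is an assumed value, picked to match the -t 60 in the check_nrpe commands.
service_check_timeout=60
```

If service_check_timeout were lower than the -t value, Nagios would presumably end the check before check_nrpe gives up, so it makes sense to keep it at or above the plugin timeout.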
Take a look at the following document for NRPE Troubleshooting issues.
https://assets.nagios.com/downloads/nag ... utions.pdf
The "Received 0 bytes from daemon" message usually means that the agent service was not running on the remote server.
On Windows that means the NSClient++ agent was not running, and on Linux that the NRPE agent was not running.
Take a look at the document for more details / causes.
Re: NRPE Socket timeout after 10 seconds
Thanks for the doc, I will review it later.
One other thought: since this is only happening on my VMs, I checked further, and it seems to happen during the backup window.
I am using Veeam B&R to back up my VMs.
I was thinking of changing the monitoring time on these machines to exclude the backup window. Also, not all VMs have this issue.
I have tried this on two services, but they still alert me.
Any suggestions would be helpful.
Thanks
Tom
Re: NRPE Socket timeout after 10 seconds
Setting up an exclude window would be a good solution for this.
To do this, create a time period like the example below. It assumes your backup runs between 1am and 3am, so adjust it to your needs.
Code: Select all
define timeperiod {
    timeperiod_name    backup_time
    alias              backup_time
    sunday             00:00-01:00,03:00-24:00
    monday             00:00-01:00,03:00-24:00
    tuesday            00:00-01:00,03:00-24:00
    wednesday          00:00-01:00,03:00-24:00
    thursday           00:00-01:00,03:00-24:00
    friday             00:00-01:00,03:00-24:00
    saturday           00:00-01:00,03:00-24:00
}
Then in your service check, you would define the following:
Code: Select all
check_period backup_time
notification_period backup_time
That would exclude the check from running, and also the notifications, between the hours of 1am and 3am.
You could still leave the check_period at 24x7 so the service will still run, but it will not send notifications during that time.
Either way should work for you.
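To make the two options concrete, here is a sketch of one of the services posted earlier in the thread with the periods added. The host, service description and check command are copied from that post; backup_time is the timeperiod named in this reply, and 24x7 is assumed to be the stock timeperiod from the sample configuration, so check that it exists in your own timeperiods file before relying on it.

```
define service {
    use                    generic-service
    host_name              TGCS008
    service_description    Check Disk Usage J:
    check_command          check_nrpe!CheckDriveSize! -a ShowAll=long MinWarn=20% MinCrit=10% Drive=J: perf-unit=G
    check_interval         1
    servicegroups          DriveSpace

    # Option 1: no checks and no notifications between 01:00 and 03:00
    check_period           backup_time
    notification_period    backup_time

    # Option 2 (alternative, commented out): keep checking around the clock
    # and only suppress notifications during the backup window.
    # Assumes a timeperiod named 24x7 is defined.
    #check_period          24x7
    #notification_period   backup_time
}
```

Only one of the two options should be active at a time. Values set directly in the service should take precedence over whatever the generic-service template supplies, and Nagios has to reload its configuration before the change takes effect.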
Re: NRPE Socket timeout after 10 seconds
Thanks
I will apply that to all my VMs and give it a few days.
Will report back with results.
Tom
Re: NRPE Socket timeout after 10 seconds
OK, let us know how it works out.