Command checkveeambu didn't terminate within the timeout per


Post by kwhogster »

Using Nagios Core 4.3.4.
I have two Windows servers, 2012 R2 and 2016; both run Veeam B&R 10.

I have a PowerShell script to check the backup jobs, replication jobs, and copy jobs.
The same script is on both servers.

On the 2016 server I am getting the following errors:

Host: TGCS024
Service: Win 12 VM Backup
Status: CRITICAL
Last check: 03-25-2020 14:25:37
Duration: 0d 2h 49m 43s
Attempt: 10/10
Status information: CHECK_NRPE: Socket timeout after 120 seconds.

Host: TGCS024
Service: Linux Backup Copy Job
Status: UNKNOWN
Last check: 03-25-2020 14:25:31
Duration: 0d 2h 50m 53s
Attempt: 10/10
Status information: Command checkveeambu didn't terminate within the timeout period 60s


From the Nagios server, I ran the check from the command line:

root@tgcs017:/usr/lib/nagios/plugins# ./check_nrpe -u -H TGCS024 -t 120 -c checkveeambu -a 'Linux VM backup' 1
CHECK_NRPE: Socket timeout after 120 seconds.
root@tgcs017:/usr/lib/nagios/plugins# ./check_nrpe -u -H TGCS024 -t 240 -c checkveeambu -a 'Linux VM backup' 1
Command checkveeambu didn't terminate within the timeout period 60s


From Nagios cfg file

Code: Select all

define service{
        use                     generic-service
        host_name               hostname
        service_description     Linux VM Backup
        check_interval          1440
        notification_interval   1440
        check_command           check_nrpe!checkveeambu! -a 'Linux VM Backup' 1
        servicegroups           Veeam
        }

From my nsclient

Code: Select all

check veeam backups
checkveeambu = cmd /c echo scripts/powershell/check_veeam_backups.ps1 "$ARG1$" "$ARG2$"; exit $LastExitCode | powershell.exe -command -

On the 2012 R2 server all the checks work.

When I run the script directly on the server, it takes a while to complete. It took 1 min 52 seconds:
PS C:\program files\nsclient++\scripts\powershell> .\check_veeam_backups.ps1 'linux vm backup' 1
Linux VM Backup Stopped 100% Success


Any thoughts or ideas?

Thank you

Tom

Post by Box293 »

There are two separate things going on here but they both relate to timing.
kwhogster wrote:
root@tgcs017:/usr/lib/nagios/plugins# ./check_nrpe -u -H TGCS024 -t 120 -c checkveeambu -a 'Linux VM backup' 1
CHECK_NRPE: Socket timeout after 120 seconds.

kwhogster wrote:
When I run the script on the server directly it takes awhile to complete. It took 1min 52 seconds

A timeout of 120 seconds is not enough: the script alone needs 1 min 52 s (112 s), and on top of that comes the overhead of establishing the connection and starting PowerShell on the remote system. I suspect that if you set the timeout to 150, this command would succeed.

kwhogster wrote:
Status information: Command checkveeambu didn't terminate within the timeout period 60s

Nagios itself has a default global timeout of 60 seconds. If you want to wait for a check with a timeout of 150, then the global timeout must be greater than 150. Please refer to the following KB article, specifically the section Nagios XI Global Timeout.
https://support.nagios.com/kb/article/n ... s-617.html
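
The point about layered timeouts can be illustrated with a small Python sketch. It is a simplified model, not how Nagios or NRPE are implemented: it assumes every layer starts its clock at the same moment and that whichever limit is shortest aborts the check first. The function name and the layer labels ("check_nrpe -t", "remote agent") are made up for illustration; the 112-second runtime is the 1 min 52 s reported above.

```python
from typing import Dict, Optional


def first_timeout(runtime_s: float, limits_s: Dict[str, float]) -> Optional[str]:
    """Return the layer whose limit expires first, or None if the check fits.

    Simplified model: all layers start timing together, and a layer aborts
    the check as soon as its own limit is shorter than the runtime.
    """
    too_short = {name: limit for name, limit in limits_s.items() if limit < runtime_s}
    if not too_short:
        return None
    # The smallest limit among the insufficient ones is the first to fire.
    return min(too_short, key=lambda name: too_short[name])


if __name__ == "__main__":
    runtime = 112  # 1 min 52 s, measured when running the script directly
    # Layer names are illustrative labels, not real configuration keys.
    for check_nrpe_t in (30, 120, 150):
        limits = {"check_nrpe -t": check_nrpe_t, "remote agent": 60}
        print(check_nrpe_t, first_timeout(runtime, limits))
```

Under this model, raising only one limit just moves the failure to the next layer; every limit in the chain has to exceed the script's runtime plus overhead.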

Post by kwhogster »

Box293

Thanks for the reply.

I thought -t increases the timeout; in one of my examples I had -t 240.

Is there another setting? This is Core, not XI.

Thanks

Tom

In my Nagios.cfg

Code: Select all

# TIMEOUT VALUES
# These options control how much time Nagios will allow various
# types of commands to execute before killing them off.  Options
# are available for controlling maximum time allotted for
# service checks, host checks, event handlers, notifications, the
# ocsp command, and performance data commands.  All values are in
# seconds.

service_check_timeout=120
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5

Post by kwhogster »

Update
I found NRPE.CFG

made this change

Code: Select all

# COMMAND TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# allow plugins to finish executing before killing them off.

#command_timeout=60
command_timeout=150



# CONNECTION TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# wait for a connection to be established before exiting. This is sometimes
# seen where a network problem stops the SSL being established even though
# all network sessions are connected. This causes the nrpe daemons to
# accumulate, eating system resources. Do not set this too low.

connection_timeout=300


restarted the nrpe service

sudo /etc/init.d/nagios-nrpe-server restart
[ ok ] Restarting nagios-nrpe-server (via systemctl): nagios-nrpe-server.service.

I tried this one first
root@tgcs017:/usr/lib/nagios/plugins# ./check_nrpe -u -H TGCS024 -t 30 -c checkveeambu -a 'Linux VM backup' 1
CHECK_NRPE: Socket timeout after 30 seconds.

Then this one
/usr/lib/nagios/plugins# ./check_nrpe -u -H TGCS024 -t 120 -c checkveeambu -a 'Linux VM backup' 1
Command checkveeambu didn't terminate within the timeout period 60s

Any ideas?

Post by Box293 »

kwhogster wrote:
Then this one
/usr/lib/nagios/plugins# ./check_nrpe -u -H TGCS024 -t 120 -c checkveeambu -a 'Linux VM backup' 1
Command checkveeambu didn't terminate within the timeout period 60s

This is the key test we need to resolve.

If it's saying it timed out after 60 seconds, then the command_timeout setting on your NRPE client on the remote end is being ignored. Even though you said you restarted the service, something is not right. I would restart the entire server and test again, just to rule out the setting not having been applied.

Post by kwhogster »

Troy,

I restarted the Ubuntu server that Nagios runs on; same issue.
Is it possible that I have more than one nrpe.cfg?

Post by scottwilkerson »

There is also the service_check_timeout in nagios.cfg on the Nagios server, which you could be hitting.

Post by kwhogster »

Thanks

My Nagios.cfg

Code: Select all

# TIMEOUT VALUES
# These options control how much time Nagios will allow various
# types of commands to execute before killing them off.  Options
# are available for controlling maximum time allotted for
# service checks, host checks, event handlers, notifications, the
# ocsp command, and performance data commands.  All values are in
# seconds.

#service_check_timeout=120
service_check_timeout=240
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5

I restarted the Nagios service after saving the Nagios.cfg file.

I doubled the service_check_timeout, but I still get the same error:

root@tgcs017:/usr/lib/nagios/plugins# ./check_nrpe -u -H TGCS024 -t 120 -c checkveeambu -a 'Linux VM backup' 1
Command checkveeambu didn't terminate within the timeout period 60s


This is very strange.

Tom

Post by scottwilkerson »

On the remote system, can you show the output of the following:

Code: Select all

netstat -nlp|grep 5666
ps -ef|grep nrpe

Post by scottwilkerson »

Wait, I just re-read your OP; I didn't realize this was a connection to NSClient++.

Can you post your nscp.ini or nsclient.ini?
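
Because the remote agent is NSClient++ rather than the Linux NRPE daemon, the 60-second limit in the error message is most likely enforced by the agent's own configuration, not by nrpe.cfg on the Nagios server. The Python sketch below shows one way to read such a value out of an ini file. The section name `[/settings/external scripts]`, the `timeout` key, and the default of 60 are assumptions (the default is inferred from the "60s" in the error message) and should be verified against the real nsclient.ini; the sample text and function name are hypothetical.

```python
import configparser

# Hypothetical excerpt of an nsclient.ini; the section and key names are
# assumptions and should be checked against the real file.
SAMPLE_INI = """
[/settings/external scripts]
timeout = 60
"""


def external_script_timeout(ini_text: str, default: int = 60) -> int:
    """Return the external-script timeout in seconds, or `default` if unset."""
    # interpolation=None keeps '%' and '$' in script command lines literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    return parser.getint("/settings/external scripts", "timeout", fallback=default)


if __name__ == "__main__":
    limit = external_script_timeout(SAMPLE_INI)
    runtime = 112  # 1 min 52 s, the runtime reported earlier in the thread
    verdict = "too short" if limit < runtime else "sufficient"
    print(f"external script timeout = {limit}s ({verdict} for a {runtime}s script)")
```

If the value found this way is below the script's runtime, raising it on the Windows host and restarting the NSClient++ service would be the next thing to try.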