CRITICAL: Return code of 255 is out of bounds. (worker:)
-
yangzhiyao2653
- Posts: 27
- Joined: Fri May 18, 2018 1:15 am
CRITICAL: Return code of 255 is out of bounds. (worker:)
I am getting this error: CRITICAL: Return code of 255 is out of bounds. (worker: worker-name). Right underneath it is this one: UNKNOWN - check_by_ssh: Remote command '$USER1$/check_by_ssh -E 1 -t 120 -l vi-admin -H $ARG1$ -C "~/box293_check_vmware.pl --timeout 120 --server $ARG2$ --check Host_Memory_Usage --host \"$HOSTADDRESS$\" --perfdata_option Memory_Free:1,Memory_Total:1,Memory_Used:1,Memory_Used%:1 --reporting_si \"$ARG4$\" --warning \"$ARG5$\" --critical \"$ARG6$\" \"$ARG7$\" \"$ARG8$\""' returned status 255.
I have a mod_gearman worker running a check through a VMA (VMware management assistant) to check ESXi host health, memory, CPU, etc. The check works fine, but fairly often it throws the above errors for a few seconds; then they clear up and everything goes back to OK. I can run the check manually from the worker with 100% success, but Nagios throws these errors quite often. Any ideas on how to make this stop happening?
My case is just like this thread, but it didn't solve the actual problem: https://support.nagios.com/forum/viewto ... =6&t=43576
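For context on the message itself: Nagios plugins are expected to exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN), and any other value is reported as out of bounds. The ssh client exits with 255 when ssh itself hits an error, so a 255 from check_by_ssh often points at the SSH connection rather than the plugin, although a remote script could also exit with 255 on its own. A minimal sketch of that mapping follows; the function name describe_return_code is hypothetical and not part of Nagios or mod_gearman.

```python
# Standard Nagios plugin exit codes. Anything outside 0-3 is "out of bounds".
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def describe_return_code(code: int) -> str:
    """Return a short, human-readable interpretation of a plugin exit code.

    Hypothetical helper for illustration only.
    """
    if code in NAGIOS_STATES:
        return NAGIOS_STATES[code]
    if code == 255:
        # ssh uses 255 for its own errors (connection refused, auth failure,
        # timeout), but the remote command may also have exited with 255.
        return "out of bounds (255: possibly an ssh-level failure)"
    return f"out of bounds ({code})"


if __name__ == "__main__":
    for code in (0, 2, 255):
        print(code, "->", describe_return_code(code))
```

Because 255 is ambiguous, the stderr text from ssh on the worker is usually more informative than the code alone.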
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: CRITICAL: Return code of 255 is out of bounds. (worker:)
Is it possible that some of your workers don't have shared keys with the server? I see it is calling check_by_ssh.
This could cause it to fail on one worker and not another.
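One way to test this on each worker is to run the same command many times as the user the worker runs as and count the exit codes, since an intermittent failure may not show up in a single manual run. The following is only a sketch: tally_exit_codes is a hypothetical helper, and the demo command is a stand-in that would need to be replaced with the real check_by_ssh command line from the service definition.

```python
import subprocess
import sys
from collections import Counter


def tally_exit_codes(command, runs=20):
    """Run `command` `runs` times and count how often each exit code occurs.

    `command` is a list of arguments, as accepted by subprocess.run.
    """
    counts = Counter()
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        counts[result.returncode] += 1
    return counts


if __name__ == "__main__":
    # Stand-in command that always exits with 3; substitute the real
    # check_by_ssh invocation (this placeholder is an assumption).
    demo = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(tally_exit_codes(demo, runs=3))
```

If one worker shows occasional 255 results and the others do not, that worker's keys or SSH connectivity are the first thing to check.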
-
yangzhiyao2653
- Posts: 27
- Joined: Fri May 18, 2018 1:15 am
Re: CRITICAL: Return code of 255 is out of bounds. (worker:)
I'm sure there is no problem with the worker, because other queues are also handled by that worker. I can run the check manually from the worker with 100% success, but Nagios throws these errors quite often.
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: CRITICAL: Return code of 255 is out of bounds. (worker:)
Are there any errors in the mod_gearman logs on the worker that is having this issue?
-
yangzhiyao2653
- Posts: 27
- Joined: Fri May 18, 2018 1:15 am
Re: CRITICAL: Return code of 255 is out of bounds. (worker:)
I think I found the reason. The timeout on my mod_gearman worker is set to 120 s, but the timeout shown in the log is 60 s. I restarted the service, but the result is still the same. Can you clear this up for me?
In /etc/mod_gearman2/worker.conf, job_timeout=120, but my mod_gearman_worker reports timeout(60s).
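To rule out a mismatch between files, the two timeout settings mentioned in this thread can be compared side by side. The sketch below parses plain key=value text and assumes that the last assignment of a key is the one that takes effect; that assumption, the helper names read_setting and compare_timeouts, and the sample contents are illustrative only. In practice the text would be read from /etc/mod_gearman2/worker.conf on each worker and from nagios.cfg on the Nagios server.

```python
def read_setting(config_text, key):
    """Return the last value assigned to `key` in key=value text, or None.

    Blank lines and lines starting with '#' are skipped. Treating the last
    assignment as the effective one is an assumption of this sketch.
    """
    value = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()
    return value


def compare_timeouts(worker_conf_text, nagios_cfg_text):
    """Report job_timeout and service_check_timeout and whether they match."""
    job = read_setting(worker_conf_text, "job_timeout")
    svc = read_setting(nagios_cfg_text, "service_check_timeout")
    return {
        "job_timeout": job,
        "service_check_timeout": svc,
        "match": job is not None and job == svc,
    }


if __name__ == "__main__":
    # Sample contents for illustration; not taken from the poster's files.
    worker_conf = "# worker settings\njob_timeout=120\n"
    nagios_cfg = "service_check_timeout=60\n"
    print(compare_timeouts(worker_conf, nagios_cfg))
```

A worker that still reports 60 s after the file says 120 may be reading a different configuration file or receiving the timeout from elsewhere, so it is worth confirming which file the running process was started with.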
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: CRITICAL: Return code of 255 is out of bounds. (worker:)
You may need to adjust the service_check_timeout setting in your nagios.cfg and then restart Nagios.
This is the maximum time Nagios will wait for the service check to return.
-
yangzhiyao2653
- Posts: 27
- Joined: Fri May 18, 2018 1:15 am
Re: CRITICAL: Return code of 255 is out of bounds. (worker:)
I had already configured it to 120 seconds before, but the problem has still not improved.
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: CRITICAL: Return code of 255 is out of bounds. (worker:)
When you looked at the /etc/mod_gearman2/worker.conf for the job_timeout setting, did you look at the file on all of your workers?
Did you restart the mod_gearman workers after adjusting it up to 120?
-
yangzhiyao2653
- Posts: 27
- Joined: Fri May 18, 2018 1:15 am
Re: CRITICAL: Return code of 255 is out of bounds. (worker:)
I checked the worker that raised the warning, but I have adjusted all of the workers and restarted the mod-gearman2-worker and gearmand services. nagios.cfg was also adjusted to 120 seconds. I can't think of what could be causing the timeout.
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: CRITICAL: Return code of 255 is out of bounds. (worker:)
Do you see the errors in the worker logs?