I am constantly getting this error:
State: CRITICAL
Info:
ESX3 CRITICAL - HOST CPU Unknown error
Date/Time: 2015-06-25 15:14:29
I tried the steps recommended here:
https://support.nagios.com/forum/viewto ... or#p125599
but they did not help.
After some time, the system retries and the check goes back to normal. On any given day this happens about 40 times.
Costantly getting Uknown Errors monitoing ESX
Re: Costantly getting Uknown Errors monitoing ESX
Does the check work every time when you run it from the command line? Can you show us the actual command, run from the CLI and the output of it? Are you using Mod Gearman?
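Because the failure is intermittent, a single CLI run may not reproduce it. A minimal sketch of a loop that runs the same check many times and counts the failures is shown here. The function name `run_repeatedly` is hypothetical, and the check command itself is a placeholder, since the actual command is not shown in this thread.

```shell
#!/bin/sh
# run_repeatedly COUNT CMD [ARGS...]
# Runs CMD COUNT times and reports how many runs exited non-zero.
# Nagios plugins exit 0 for OK, so any non-zero exit (WARNING, CRITICAL,
# UNKNOWN) is counted here, including legitimate alerts.
run_repeatedly() {
    count=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        output=$("$@" 2>&1)
        rc=$?
        if [ "$rc" -ne 0 ]; then
            failures=$((failures + 1))
            # Log the failing output with a timestamp so it can be
            # compared against the alert times.
            printf '%s run %d exit %d: %s\n' \
                "$(date '+%F %T')" "$i" "$rc" "$output" >&2
        fi
    done
    printf '%d of %d runs failed\n' "$failures" "$count"
}

# Example (placeholder command, substitute the real check):
#   run_repeatedly 50 /path/to/your/check --your-options
```

Running it as the same user the worker runs as keeps the test closer to what Mod Gearman actually executes.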
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Costantly getting Uknown Errors monitoing ESX
The error is random. If I run the check from the CLI, it might work.
It seems like I only get the error at times.
And yes, I am using mod_gearman.
Re: Costantly getting Uknown Errors monitoing ESX
It is possible that the check works locally but fails when it is run from the remote worker if you haven't copied over the auth file. That would explain why it works only intermittently.
If this is a timeout issue, which is not very likely, you can try increasing the timeout to let's say 60 seconds (the default is 30).
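One way to tell a timeout apart from other failures is to run the check under a wall-clock limit and look at the exit code. This is only a sketch: it assumes the GNU coreutils `timeout` utility is available (it exits with 124 when the limit is hit), the function name `classify_run` is hypothetical, and the check command is a placeholder.

```shell
#!/bin/sh
# classify_run LIMIT_SECONDS CMD [ARGS...]
# Runs CMD under a time limit and reports whether it finished, failed,
# or was killed for exceeding the limit.
classify_run() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        # GNU timeout returns 124 when the command ran past the limit.
        echo "timed out after ${limit}s"
    elif [ "$rc" -eq 0 ]; then
        echo "ok"
    else
        echo "failed with exit code $rc"
    fi
}

# Example (placeholder command, substitute the real check):
#   classify_run 30 /path/to/your/check --your-options
```

If the failing runs never report a timeout at the 30 second default, raising the limit is unlikely to help.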
Re: Costantly getting Uknown Errors monitoing ESX
I tried increasing the timeout to 90.
The worker this check runs on is on the same server that XI runs on, so it cannot be an issue with the scripts.
I can try moving this check to a different worker.
I agree, this does not seem to be a timeout issue.
Re: Costantly getting Uknown Errors monitoing ESX
What is the mod gearman version that you are currently using? Can you show us the worker config? Hide sensitive info.
Re: Costantly getting Uknown Errors monitoing ESX
mod_gearman.x86_64 1.5.0b1-1.el6 @/mod_gearman-1.5.0b1-1.el6.x86_64
Re: Costantly getting Uknown Errors monitoing ESX
Can you check whether you are running the latest VMware wizard?
Go to Admin > Manage Config Wizards and see if the VMware wizard is at the latest version.
Also, are your Gearman settings configured so that this check runs only on the local server and not on a worker?
Re: Costantly getting Uknown Errors monitoing ESX
Yes, I do have one worker.
The version is 1.6.
jdalrymple
Re: Costantly getting Uknown Errors monitoing ESX
Hi Bosecorp
I reviewed your latest profile.zip, and it appears that your ESX servers should indeed be monitored by your primary XI instance and not a remote worker (assuming the configs haven't changed a lot).
I think the next step is to see if there are any verbose errors in your nagios.log, using the grep command below.
You might find a great deal of useless output from that command, but if this is happening 40 times a day you should see some output related to the issue. Take a look, and if you see anything interesting, point it out here. Otherwise we may have to turn debugging on at your local gearman worker and look there.
grep -i esx /usr/local/nagios/var/nagios.log
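If the grep output is too noisy to read by eye, a rough tally per host and service can narrow it down. The sketch below assumes the standard Nagios log layout for service alerts (`[epoch] SERVICE ALERT: host;service;state;type;attempt;output`). The function name `count_unknown_errors` is hypothetical, and the host and service names in the comments are made up for illustration.

```shell
#!/bin/sh
# count_unknown_errors LOGFILE
# Counts "Unknown error" service alerts per host/service pair,
# most frequent first.
count_unknown_errors() {
    logfile=$1
    # Split each alert line on ';' and strip everything up to and
    # including "SERVICE ALERT: " so the first field is just the host.
    grep -i 'SERVICE ALERT:.*Unknown error' "$logfile" |
        awk -F';' '{ sub(/^.*SERVICE ALERT: /, "", $1); n[$1 ";" $2]++ }
                   END { for (k in n) print n[k], k }' |
        sort -rn
}

# Example:
#   count_unknown_errors /usr/local/nagios/var/nagios.log
# prints lines of the form "<count> <host>;<service>", for instance
# "12 esx-host-01;CPU Usage" (made-up names).
```

A single service dominating the tally would point at that check's arguments or target host rather than at the worker.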