I have a script that runs fine in Core, but when I check Nagios XI the service is always red, stating that it's timing out. I can log in manually fine as well. Ever seen anything like this?
When I run it manually in Core it comes back fine though... makes no sense to me.
Thoughts? The script basically just logs in via SSH, does a df -v, and takes some metrics. It works like a charm for 90% of our servers, but we're trying to work through our problem children and this is a scenario we're seeing.
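For readers following along, here is a minimal sketch of the "take some metrics from df" step. It is not the poster's script: it assumes POSIX-style `df -P` output rather than `df -v`, and the sample text, thresholds, and function names are all hypothetical.

```python
# Hypothetical sketch: parse `df -P` style output and map usage to a Nagios state.
# SAMPLE_DF, the thresholds, and the function names are assumptions for illustration.

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes

SAMPLE_DF = """\
Filesystem     1024-blocks    Used Available Capacity Mounted on
/dev/sda1         10485760 9437184   1048576      90% /
/dev/sdb1         20971520 5242880  15728640      25% /data
"""


def parse_df(text):
    """Return {mount_point: percent_used} from `df -P` style output."""
    usage = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 5)  # the mount point may contain spaces
        if len(fields) < 6:
            continue  # ignore blank or malformed rows
        usage[fields[5]] = int(fields[4].rstrip("%"))
    return usage


def worst_state(usage, warn=80, crit=95):
    """Return the most severe Nagios state across all mount points."""
    state = OK
    for pct in usage.values():
        if pct >= crit:
            return CRITICAL
        if pct >= warn:
            state = WARNING
    return state


if __name__ == "__main__":
    usage = parse_df(SAMPLE_DF)
    print(usage, worst_state(usage))
```

The parsing itself is unlikely to be what times out; the sketch only shows what the check does once the SSH session returns.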
Nagios XI & Core not reflecting the same status on Services
-
JakeHatMacys
- Posts: 281
- Joined: Thu Sep 25, 2014 3:21 pm
Nagios XI & Core not reflecting the same status on Services
Re: Nagios XI & Core not reflecting the same status on Servi
The web check and the check that XI runs use different usernames. Which username did you establish the SSH key for?
Former Nagios Employee
-
JakeHatMacys
- Posts: 281
- Joined: Thu Sep 25, 2014 3:21 pm
Re: Nagios XI & Core not reflecting the same status on Servi
rkennedy wrote: The web check and the check that XI runs use different usernames. Which username did you establish the SSH key for?
We actually don't use that; the security team shot down using SSH keys. We're using a home-brewed shell script. Would the web UI kick that off differently?
And again, out of roughly 13,000 services running, only about 600 are failing, and I can't say that it's due to this every time. But we're trying to sort out the use cases, and this seems to be an oddity we're running into. Testing it via Core works like a charm, but XI keeps coming back timed out after 60 seconds.
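One way to narrow down a 60-second timeout is to time the same command line under a hard limit while logged in as the account that runs the scheduled checks. The following is a minimal sketch, assuming Python 3.7+ is available on the Nagios server; `run_with_timeout` is a hypothetical helper, not part of Nagios, and closing stdin only approximates a non-interactive run.

```python
import subprocess
import sys
import time

UNKNOWN = 3  # standard Nagios plugin exit code for UNKNOWN


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (exit_code, elapsed_seconds, combined_output)."""
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,  # approximate a run with no interactive input
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error text together with normal output
            text=True,
            timeout=timeout_s,
        )
        code, output = proc.returncode, proc.stdout
    except subprocess.TimeoutExpired:
        code, output = UNKNOWN, f"timed out after {timeout_s}s"
    return code, time.monotonic() - start, output


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the plugin command line as arguments.
    code, elapsed, output = run_with_timeout(sys.argv[1:], 60)
    print(f"exit={code} elapsed={elapsed:.1f}s output={output.strip()}")
```

If the command finishes quickly in one account and hits the limit in another, that points at something tied to the user or environment rather than the remote server.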
Re: Nagios XI & Core not reflecting the same status on Servi
To clarify, I believe you're testing in the CCM (not Core). The CCM uses a different username to run the script than a run over the CLI or as a Nagios check does.
What are the full permissions on the file on these servers that aren't working?
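To compare what each account sees, a small helper can print the current user next to the file's mode and owner. This is only a sketch: `describe_access` and `user_name` are hypothetical names, and the path to inspect is whatever plugin file is in question.

```python
import os
import pwd
import stat
import sys


def user_name(uid):
    """Return the account name for uid, or the number if it has no passwd entry."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def describe_access(path):
    """Return who is running and how that user sees `path`."""
    st = os.stat(path)
    return {
        "running_as": user_name(os.geteuid()),
        "mode": stat.filemode(st.st_mode),  # e.g. "-rwxr-xr-x"
        "owner": user_name(st.st_uid),
        # os.access checks with the real uid, which normally matches the
        # effective uid for a plain check run.
        "readable": os.access(path, os.R_OK),
        "executable": os.access(path, os.X_OK),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    for key, value in describe_access(sys.argv[1]).items():
        print(f"{key}: {value}")
```

Running it once from the shell and once as the account behind the failing check shows whether the two see the same file the same way.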
Former Nagios Employee
-
JakeHatMacys
- Posts: 281
- Joined: Thu Sep 25, 2014 3:21 pm
Re: Nagios XI & Core not reflecting the same status on Servi
The script is located on our Nagios server in our libexec directory. We actually log into the servers using sshpass 1.05 (via the script). I can give you the local file permissions in a bit (we're currently migrating the server to another VM cluster to help with I/O performance).
Re: Nagios XI & Core not reflecting the same status on Servi
Sounds good - I'll watch for them. Usually errors like this are related to permissions.
Former Nagios Employee