Nagios XI & Core not reflecting the same status on Services

JakeHatMacys
Posts: 281
Joined: Thu Sep 25, 2014 3:21 pm

Nagios XI & Core not reflecting the same status on Services

Post by JakeHatMacys »

I have a script that runs fine in Core, but when I check Nagios XI the service is always red, stating that it's timing out. I can log in manually fine as well. Ever seen anything like this?
Capture.JPG
When I run it manually in Core it comes back fine, though, which makes no sense to me:
Capture1.JPG
Thoughts? The script basically just logs in via SSH, runs df -v, and collects some metrics. It works like a charm for 90% of our servers, but we're working through our problem children and this is one scenario we're seeing.
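For readers following along, a check of this kind usually comes down to parsing df output and mapping the result to a Nagios plugin exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below is not the script from this thread. The function names, the thresholds, and the sample output are assumptions for illustration, and it assumes POSIX-style `df -P` columns rather than whatever `df -v` prints on the hosts in question.

```python
# Hypothetical sketch: NOT the poster's script. Names, thresholds and the
# sample output are illustrative assumptions.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def parse_df(output):
    """Return {mount_point: percent_used} from `df -P` style output."""
    usage = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue  # ignore lines that do not look like df rows
        # Note: mount points containing spaces are not handled here.
        usage[fields[5]] = int(fields[4].rstrip("%"))
    return usage


def evaluate(usage, warn=80, crit=90):
    """Map the fullest filesystem to a Nagios status code and message."""
    if not usage:
        return UNKNOWN, "UNKNOWN - no filesystems parsed"
    mount, pct = max(usage.items(), key=lambda item: item[1])
    if pct >= crit:
        return CRITICAL, f"CRITICAL - {mount} at {pct}%"
    if pct >= warn:
        return WARNING, f"WARNING - {mount} at {pct}%"
    return OK, f"OK - highest usage {mount} at {pct}%"


# Made-up sample in the POSIX output format, used only to exercise the parser.
SAMPLE = """Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda1 1000 950 50 95% /
/dev/sdb1 1000 100 900 10% /data
"""
```

A real plugin would print the message and exit with the returned code, for example `code, msg = evaluate(parse_df(text)); print(msg); sys.exit(code)`.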
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Nagios XI & Core not reflecting the same status on Servi

Post by rkennedy »

The web check and the check that XI runs use different usernames. Which username did you establish the SSH key for?
Former Nagios Employee
JakeHatMacys
Posts: 281
Joined: Thu Sep 25, 2014 3:21 pm

Re: Nagios XI & Core not reflecting the same status on Servi

Post by JakeHatMacys »

rkennedy wrote:The web check and the check that XI runs use different usernames. Which username did you establish the SSH key for?
We actually don't use that; the security team shot down using SSH keys. We're using a homebrewed shell script. Would the web UI kick that off differently?

And again, out of roughly 13,000 services running, only about 600 are failing, and I can't say that it's due to this every time. But we're trying to sort out the use cases, and this seems to be an oddity we're running into. Testing it via Core works like a charm, but XI keeps coming back timed out after 60 seconds.
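One way to tell a slow remote host apart from a check that is being killed at the 60-second limit is to enforce a shorter timeout inside the check itself and report it explicitly. The following is a minimal sketch under that assumption. `run_with_timeout` is a hypothetical helper, not part of Nagios or of the script discussed here, and the choice of UNKNOWN for a timeout is illustrative.

```python
import subprocess
import sys

OK, CRITICAL, UNKNOWN = 0, 2, 3  # standard Nagios plugin exit codes


def run_with_timeout(cmd, timeout_s):
    """Run cmd (a list of arguments) and return (nagios_code, text).

    Hypothetical helper: a timeout is reported as UNKNOWN so it can be
    told apart from a command that ran and failed.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return UNKNOWN, f"UNKNOWN - command exceeded {timeout_s}s"
    if result.returncode != 0:
        return CRITICAL, f"CRITICAL - exit {result.returncode}: {result.stderr.strip()}"
    return OK, result.stdout.strip()
```

Calling it with a limit below 60 seconds, for instance `run_with_timeout(["ssh", host, "df", "-P"], 45)` with a hypothetical `host`, would make the check return its own message before the scheduler's timeout fires.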
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Nagios XI & Core not reflecting the same status on Servi

Post by rkennedy »

To clarify, I believe you're testing in the CCM (not Core). The CCM will use a different username to run the script than the one used when running it from the CLI or as a Nagios check.

What are the full permissions on the file on these servers that aren't working?
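Because the question turns on which user runs the script and what that user is allowed to do, a small diagnostic can capture both facts from inside each execution context. This is a sketch only: `describe_access` is a hypothetical name, and it assumes a Unix-like host where `os.geteuid` is available.

```python
import os
import stat


def describe_access(path):
    """Report who is running and what they may do with path (Unix only).

    Hypothetical diagnostic helper, not part of Nagios.
    """
    st = os.stat(path)
    return {
        "run_as_uid": os.geteuid(),          # effective UID of this process
        "mode": stat.filemode(st.st_mode),   # e.g. "-rwxr-x---"
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        "executable_by_me": os.access(path, os.X_OK),
    }
```

Running this against the plugin file once from the shell and once from inside the check, then comparing `run_as_uid` and `executable_by_me`, would show whether the two contexts really differ.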
JakeHatMacys
Posts: 281
Joined: Thu Sep 25, 2014 3:21 pm

Re: Nagios XI & Core not reflecting the same status on Servi

Post by JakeHatMacys »

The script is located on our Nagios server in our libexec directory. We actually log into the servers using sshpass 1.05 (via the script). I can give you the local file permissions in a bit; we're currently migrating the server to another VM cluster to help with I/O performance.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Nagios XI & Core not reflecting the same status on Servi

Post by rkennedy »

Sounds good, I'll watch for them. Usually errors like this are related to permissions.