
Check Successful in Testing But Not Live

Posted: Mon Nov 21, 2016 2:36 pm
by JNelson
When I test the check_website_response.sh plugin in "Test Check Command" and from the command line of the Nagios server, everything is fine, but the live check times out every time.

Test Check Command results in Web GUI:
COMMAND: /usr/local/nagios/libexec/check_website_response.sh -w 1000 -c 1500 -u http://MYURL
OUTPUT: RESPONSE: OK - 15 ms|Response=15ms;1000;1500;0


Command line of server:
[nagios@MYNAGIOSSERVER ~]$ /usr/local/nagios/libexec/check_website_response.sh -w 1000 -c 1500 -u http://MYURL
RESPONSE: OK - 12 ms|Response=12ms;1000;1500;0


On live check (as per attachment):
Critical (Service check timed out after 60.01 seconds)
[Attachment: Service Check Times Out.png]
To make matters more confusing, a live check does succeed once in a long while, whereas the test check succeeds 100% of the time. The live checks had been running successfully for years before they suddenly started failing. No changes were made to the Nagios server.

We're running Nagios XI 2014R2.6. Anyone have any advice for troubleshooting this?
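
One way to probe intermittent timeouts like the ones described above is to run the same plugin command several times under a fixed time limit and record how each run ends. The following is only a rough sketch, not an official Nagios procedure: the helper name `repeat_check` is hypothetical, and it assumes the GNU coreutils `timeout` command is installed (it exits with status 124 when the limit is hit).

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): run a command several times,
# each under a time limit, and print one line per run.
# Usage: repeat_check LIMIT_SECONDS RUNS COMMAND [ARGS...]
# Assumes GNU coreutils `timeout`, which exits 124 when the limit is reached.
repeat_check() {
    limit="$1"
    runs="$2"
    shift 2
    i=1
    while [ "$i" -le "$runs" ]; do
        # Discard plugin output; only the exit status matters here.
        timeout "$limit" "$@" >/dev/null 2>&1
        status=$?
        if [ "$status" -eq 124 ]; then
            echo "run $i: timed out after ${limit}s"
        else
            echo "run $i: exit $status"
        fi
        i=$((i + 1))
    done
}
```

Run as the nagios user, a call such as `repeat_check 60 10 /usr/local/nagios/libexec/check_website_response.sh -w 1000 -c 1500 -u http://MYURL` would show whether any of ten manual runs hits the same 60-second limit as the live check.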

Re: Check Successful in Testing But Not Live

Posted: Mon Nov 21, 2016 3:50 pm
by dwhitfield
Could you PM me your profile? Admin > System Config > System Profile (click "Show Profile" in XI 5 onwards)

After you PM the profile, please update this thread so it will return to our dashboard. Thanks!

UPDATE: Profile received and shared with techs.

Re: Check Successful in Testing But Not Live

Posted: Mon Nov 21, 2016 4:05 pm
by JNelson
I have PMed the profile to you as requested.

Re: Check Successful in Testing But Not Live

Posted: Mon Nov 21, 2016 4:21 pm
by dwhitfield
You've got errors, notices, and warnings all over the place. Presently, it's not clear if these are previous issues you've worked through.

Is this an offline install? If not, is there any reason you haven't updated to 5? Considering the number of errors I am seeing, and the number of bug fixes in the 5.x series, I think it would be good to upgrade.

However, before you seriously consider the upgrade, what is the output of grep -R 'dbtype' /usr/local/nagiosxi/html/config.inc.php?

If pgsql is in there, then the upgrade won't be a pleasant experience. That said, you could still update to Nagios XI *2014R2.7*.
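
The dbtype check above can be wrapped in a small script, sketched here under some assumptions: the function name `xi_uses_pgsql` is hypothetical, and the script only looks for the string `pgsql` on lines that mention `dbtype`, without assuming anything else about the layout of config.inc.php.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Nagios XI): report whether any
# 'dbtype' line in the given config file mentions pgsql.
# The default path is the one quoted in the post above.
xi_uses_pgsql() {
    config_file="${1:-/usr/local/nagiosxi/html/config.inc.php}"
    if [ ! -r "$config_file" ]; then
        echo "cannot read $config_file"
        return 2
    fi
    # First grep selects the dbtype lines, second checks them for pgsql.
    if grep 'dbtype' "$config_file" | grep -q 'pgsql'; then
        echo "pgsql found"
    else
        echo "no pgsql"
    fi
}
```

Calling `xi_uses_pgsql` with no argument checks the default path; "pgsql found" corresponds to the case where the upgrade to 5 is expected to be unpleasant.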

If you do proceed with the upgrade and the issue persists, please send me a new profile, again updating the thread so it comes back on our dashboard.

Re: Check Successful in Testing But Not Live

Posted: Wed Nov 23, 2016 3:15 pm
by JNelson
Thanks for your response. Yes, we have PGSQL. We're currently investigating the process for upgrading Nagios, but it may take several weeks to plan and execute.

Re: Check Successful in Testing But Not Live

Posted: Mon Nov 28, 2016 12:25 pm
by dwhitfield
In case you have an offline install, 5.3.3 came with an offline installer (the first in the 5.3 series).

Could you clarify if you are experiencing any other issues? We could go through your errors piece by piece, but if you have other known issues, it might be easier to pair the information in the logs with those issues.

Dealing with pgsql on historical installs is on our radar. I'm not sure of the exact reasoning for the upgrade (there are lots of potential reasons!), but one option would be simply to wait until our upgrade scripts handle pgsql better. Unfortunately, I do not have an ETA for when that will be finished.

Thanks!