XI 2012R1.0 Breaks monitoring another Nagios XI server

Posted: Wed Oct 10, 2012 10:45 am
by GldRush98
We are monitoring our main XI server with a secondary XI server.
I upgraded the secondary XI server to 2012R1.0 today, and it seems to have broken the XI server monitoring.
I have also upgraded the main XI server to 2012R1.0 hoping it might start working again, but it did not.
I also tried removing the services and host from the secondary server, and re-adding them using the Nagios XI config wizard, but this also did not work.
Image1.png
Any ideas?

Re: XI 2012R1.0 Breaks monitoring another Nagios XI server

Posted: Wed Oct 10, 2012 11:16 am
by lmiltchev
Can you check the Apache logs on both machines and see if there is anything there that can point us in the right direction?

Re: XI 2012R1.0 Breaks monitoring another Nagios XI server

Posted: Wed Oct 10, 2012 11:29 am
by GldRush98
Seeing this in access_log on the main XI server... this would be happening when the check is coming from the secondary server:

xx.xx.xx.xx - - [10/Oct/2012:11:27:31 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:11:27:36 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:11:27:39 -0500] "\x16\x03\x01" 302 2538 "-" "-"
(I redacted our external IP there)

Not seeing anything else relevant pop up in access_log or error_log when running the check.

Note: My main XI server that is being checked uses https
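
For reference, the bytes `\x16\x03\x01` in the request field match the start of an SSL/TLS handshake record (content type 0x16, protocol version 3.1). Seeing them in access_log rather than ssl_access_log suggests the check may be opening an SSL connection to a port that Apache is answering as plain HTTP. The following is a minimal sketch, not part of Nagios XI, for spotting such lines in a log; the function name `looks_like_tls_on_plain_port` is hypothetical, and it assumes the common/combined log format shown above, where Apache writes non-printable bytes as `\xhh` escapes.

```python
import re

# The request field is the first double-quoted string on a
# common/combined-format access_log line.
REQUEST_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def looks_like_tls_on_plain_port(log_line: str) -> bool:
    """Return True if the logged request starts with an escaped
    SSL/TLS handshake record header (0x16 followed by major version 0x03)."""
    match = REQUEST_RE.search(log_line)
    if match is None:
        return False
    request = match.group(1)
    # Apache logs the raw bytes as literal text such as \x16\x03\x01,
    # so compare against the escaped form, not the bytes themselves.
    return request.startswith(r"\x16\x03")


line = r'xx.xx.xx.xx - - [10/Oct/2012:11:27:31 -0500] "\x16\x03\x01" 302 2538 "-" "-"'
print(looks_like_tls_on_plain_port(line))
```

If lines like this only appear while the check runs, comparing the port and SSL settings of the check command against the Apache virtual host configuration would be a reasonable next step.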

Re: XI 2012R1.0 Breaks monitoring another Nagios XI server

Posted: Wed Oct 10, 2012 11:59 am
by lmiltchev
Can you also check the "/var/log/httpd/ssl_access_log"?

Re: XI 2012R1.0 Breaks monitoring another Nagios XI server

Posted: Wed Oct 10, 2012 12:42 pm
by GldRush98
For whatever reason, it is empty.

Re: XI 2012R1.0 Breaks monitoring another Nagios XI server

Posted: Wed Oct 10, 2012 12:55 pm
by mguthrie
There should be some PHP error output related to these checks in /var/log/httpd/error_log on one of the machines.

Re: XI 2012R1.0 Breaks monitoring another Nagios XI server

Posted: Wed Oct 10, 2012 2:47 pm
by GldRush98
This is the only thing I am seeing when running the check...
This is showing up on the main XI server...
error_log:

[Wed Oct 10 14:46:59 2012] [error] [client xx.xx.xx.xx] PHP Notice:  Undefined index:  HTTP_HOST in /var/www/html/index.php on line 4
[Wed Oct 10 14:47:00 2012] [error] [client xx.xx.xx.xx] PHP Notice:  Undefined index:  HTTP_HOST in /var/www/html/index.php on line 4
[Wed Oct 10 14:47:00 2012] [error] [client xx.xx.xx.xx] PHP Notice:  Undefined index:  HTTP_HOST in /var/www/html/index.php on line 4
[Wed Oct 10 14:47:00 2012] [error] [client xx.xx.xx.xx] PHP Notice:  Undefined index:  HTTP_HOST in /var/www/html/index.php on line 4
[Wed Oct 10 14:47:01 2012] [error] [client xx.xx.xx.xx] PHP Notice:  Undefined index:  HTTP_HOST in /var/www/html/index.php on line 4
[Wed Oct 10 14:47:01 2012] [error] [client xx.xx.xx.xx] PHP Notice:  Undefined index:  HTTP_HOST in /var/www/html/index.php on line 4
[Wed Oct 10 14:47:02 2012] [error] [client xx.xx.xx.xx] PHP Notice:  Undefined index:  HTTP_HOST in /var/www/html/index.php on line 4
access_log:

xx.xx.xx.xx - - [10/Oct/2012:14:48:33 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:35 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:38 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:38 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:38 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:38 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:39 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:39 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:39 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:39 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:40 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:40 -0500] "\x16\x03\x01" 302 2538 "-" "-"
xx.xx.xx.xx - - [10/Oct/2012:14:48:40 -0500] "\x16\x03\x01" 302 2538 "-" "-"
My guess is it has to do with running https on the main Nagios server... but I don't get why everything worked fine on XI 2011 and then broke when I upgraded the secondary server to XI 2012. Hmmm...
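
Since the error_log excerpt above repeats one notice many times, a quick tally can help confirm that no other message is hiding among the repeats. PHP fills `HTTP_HOST` from the request's Host header, so this notice is at least consistent with requests that arrive without a usable Host header. The sketch below is only an illustration: the name `tally_messages` is hypothetical, and the regular expression assumes the error_log line layout shown in the excerpt (timestamp, level, client, message).

```python
import re
from collections import Counter

# Matches lines of the form:
# [timestamp] [level] [client address] message
ERROR_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \[(?P<level>\w+)\] "
    r"\[client (?P<client>[^\]]+)\] (?P<message>.*)$"
)


def tally_messages(lines):
    """Count how often each distinct message appears, ignoring
    timestamps and client addresses. Unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.match(line.strip())
        if match:
            counts[match.group("message")] += 1
    return counts


sample = [
    "[Wed Oct 10 14:46:59 2012] [error] [client xx.xx.xx.xx] PHP Notice:  Undefined index:  HTTP_HOST in /var/www/html/index.php on line 4",
    "[Wed Oct 10 14:47:00 2012] [error] [client xx.xx.xx.xx] PHP Notice:  Undefined index:  HTTP_HOST in /var/www/html/index.php on line 4",
]
for message, count in tally_messages(sample).most_common():
    print(count, message)
```

To run it against a real log, pass an open file object instead of the sample list, for example `tally_messages(open("/var/log/httpd/error_log"))`.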

Re: XI 2012R1.0 Breaks monitoring another Nagios XI server

Posted: Wed Oct 10, 2012 4:32 pm
by mguthrie
How about the error log on the remote machine? (The one that the checks are running against)

Re: XI 2012R1.0 Breaks monitoring another Nagios XI server

Posted: Thu Oct 11, 2012 8:13 am
by GldRush98
Sorry, I should be more clear about my terminology...
Main XI server = Our XI server with all of our hosts we monitor on it.
Secondary XI Server = This server only monitors the main XI server.

Those logs are from the main server (the one being checked).
Nothing shows up in the secondary server's logs when it is running the check.

It broke when I upgraded the secondary to 2012. The main XI server was still on 2011. I then upgraded the main XI to 2012 thinking it might resolve the issue, but no go.

Re: XI 2012R1.0 Breaks monitoring another Nagios XI server

Posted: Thu Oct 11, 2012 8:40 am
by mguthrie
We'll run some tests and see if we can recreate this...