Re: After 2014 upgrade, host notifications disabled
Posted: Tue May 27, 2014 10:00 am
by scottwilkerson
Let's run:
Code: Select all
service ndo2db stop
service nagios stop
killall -9 ndo2db
killall -9 nagios
service ndo2db start
service nagios start
then check the host/service count in the profile again.
Re: After 2014 upgrade, host notifications disabled
Posted: Tue May 27, 2014 10:23 am
by SavaSC
That was interesting. I ran the code as you asked, then logged back in to the web interface. Here is the message it gave on login:
Code: Select all
System Status Degraded!
One or more critical components of Nagios XI has been stopped, is disabled, or has malfunctioned. This can cause problems with monitoring, notifications, reporting, and more. You should investigate this problem immediately.
Check system status
Check monitoring engine status
I looked and everything was showing green. I logged off and back in, no error message.
Nagios is still showing:
Code: Select all
nagios (pid 28528) is running...
NPCD running (pid 3789).
ndo2db (pid 21947) is running...
CPU Load 15: 0.71
Total Hosts: 0
Total Services: 0
Function 'get_base_uri' returns: http://hou-nagiosxi/nagiosxi/
Function 'get_base_url' returns: http://hou-nagiosxi/nagiosxi/
Function 'get_backend_url(internal_call=false)' returns: http://hou-nagiosxi/nagiosxi/includes/components/profile/profile.php
Function 'get_backend_url(internal_call=true)' returns: http://localhost/nagiosxi/backend/
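One way to cross-check the "Total Hosts: 0" figure is to count the hosts the monitoring engine itself has loaded. The sketch below is only an illustration. It assumes the object cache is at /usr/local/nagios/var/objects.cache (the usual default; the real location is whatever object_cache_file in nagios.cfg says) and that each host entry begins with "define host {". The function name count_cached_hosts is made up for this example.

```shell
# Count host definitions in a Nagios object cache file.
# Assumption: each host entry starts with "define host {" at the start of a line.
count_cached_hosts() {
    # "host" must be followed by whitespace or "{", so hostgroup,
    # hostdependency and hostescalation entries are not counted.
    grep -cE '^define host[[:space:]]*\{' "$1"
}

# Example (the path is an assumption; see object_cache_file in nagios.cfg):
# count_cached_hosts /usr/local/nagios/var/objects.cache
```

If this prints a non-zero count while the profile still reports 0 hosts, the engine has its objects loaded and the gap is more likely between Nagios and the database the web interface reads from.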
Re: After 2014 upgrade, host notifications disabled
Posted: Tue May 27, 2014 10:27 am
by sreinhardt
It is possible that the cron job that checks for changes in system status ran while you were restarting those services, and that it took a moment to catch up and see that everything was back to normal. Is your system behaving normally now, or are you still getting errors?
Re: After 2014 upgrade, host notifications disabled
Posted: Tue May 27, 2014 10:45 am
by SavaSC
Unfortunately, it is not acting any differently. Service errors still appear only under All Service Problems, and the server status page always shows each server with "Notifications are Disabled for this Host" (although the page for the host itself displays notifications as active).
Re: After 2014 upgrade, host notifications disabled
Posted: Tue May 27, 2014 12:38 pm
by lmiltchev
Can you post the nagios.cfg file?
Re: After 2014 upgrade, host notifications disabled
Posted: Wed May 28, 2014 1:26 pm
by SavaSC
Sure, please tell me where it is. I've looked around, but seem to be missing it.
Re: After 2014 upgrade, host notifications disabled
Posted: Wed May 28, 2014 1:32 pm
by slansing
Well, Nagios should not be running if it is missing. It should be at:
Re: After 2014 upgrade, host notifications disabled
Posted: Wed May 28, 2014 2:27 pm
by SavaSC
I meant I was not seeing it, not that I thought it wasn't there. Bad phrasing on my part.
Here it is. Thanks!
Re: After 2014 upgrade, host notifications disabled
Posted: Wed May 28, 2014 3:02 pm
by lmiltchev
The main config looks fine. Let's check a few more things. Is mysqld running?
Do you still see errors in the mysqld.log?
Run the following commands and show us the output:
Code: Select all
tail -50 /var/log/messages
tail -50 /usr/local/nagios/var/nagios.log
Re: After 2014 upgrade, host notifications disabled
Posted: Wed May 28, 2014 3:20 pm
by SavaSC
Here is the mysqld status:
Code: Select all
[root@ltc099l components]# service mysqld status
mysqld (pid 1301) is running...
The mysqld.log
Code: Select all
[root@ltc099l components]# tail -25 /var/log/mysqld.log
140528 13:06:19 InnoDB: Shutdown completed; log sequence number 0 43655
140528 13:06:19 [Note] /usr/libexec/mysqld: Shutdown complete
140528 13:06:19 mysqld ended
140528 13:06:34 mysqld started
140528 13:06:34 [Warning] option 'max_join_size': unsigned value 18446744073709551615 adjusted to 4294967295
140528 13:06:34 [Warning] option 'max_join_size': unsigned value 18446744073709551615 adjusted to 4294967295
140528 13:06:35 InnoDB: Started; log sequence number 0 43655
140528 13:06:35 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.0.77' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
140528 13:13:45 [Note] /usr/libexec/mysqld: Normal shutdown
140528 13:13:47 InnoDB: Starting shutdown...
140528 13:13:47 InnoDB: Shutdown completed; log sequence number 0 43655
140528 13:13:47 [Note] /usr/libexec/mysqld: Shutdown complete
140528 13:13:47 mysqld ended
140528 13:13:48 mysqld started
140528 13:13:48 [Warning] option 'max_join_size': unsigned value 18446744073709551615 adjusted to 4294967295
140528 13:13:48 [Warning] option 'max_join_size': unsigned value 18446744073709551615 adjusted to 4294967295
140528 13:13:48 InnoDB: Started; log sequence number 0 43655
140528 13:13:48 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.0.77' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
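The two shutdown/start pairs above are easy to miss in a longer log. As a rough sketch, the helper below (mysqld_start_times is a made-up name) prints the timestamp of every "mysqld started" line, so the restarts can be lined up against the times ndo2db and nagios were restarted. It assumes the log format shown above, with the date and time as the first two fields.

```shell
# Print the date and time of each "mysqld started" entry in a mysqld log.
# Assumption: lines look like "140528 13:06:34 mysqld started".
mysqld_start_times() {
    awk '/mysqld +started/ { print $1, $2 }' "$1"
}

# Example (log path taken from the tail command above):
# mysqld_start_times /var/log/mysqld.log
```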
Messages...
Code: Select all
[root@ltc099l components]# tail -50 /var/log/messages
May 28 13:44:12 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Temperature;OK;SOFT;2;Temp: 73.1 F
May 28 13:44:12 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;OK;SOFT;2;xi_service_event_handler
May 28 13:59:28 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;Unable to read sensor
May 28 13:59:28 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;xi_service_event_handler
May 28 14:00:13 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Temperature;OK;SOFT;2;Temp: 73.5 F
May 28 14:00:13 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;OK;SOFT;2;xi_service_event_handler
May 28 14:03:57 ltc099l nagios: Auto-save of retention data completed successfully.
May 28 14:05:18 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Humidity;UNKNOWN;SOFT;1;Unable to read sensor
May 28 14:05:18 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Humidity;UNKNOWN;SOFT;1;xi_service_event_handler
May 28 14:06:18 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Humidity;UNKNOWN;SOFT;2;Unable to read sensor
May 28 14:06:18 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Humidity;UNKNOWN;SOFT;2;xi_service_event_handler
May 28 14:07:03 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Humidity;OK;SOFT;3;Humidity: 40.5%
May 28 14:07:03 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Humidity;OK;SOFT;3;xi_service_event_handler
May 28 14:09:51 ltc099l nagios: SERVICE ALERT: LTC030M;CPU Usage;WARNING;SOFT;1;No data was received from host!
May 28 14:09:51 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: LTC030M;CPU Usage;WARNING;SOFT;1;xi_service_event_handler
May 28 14:09:53 ltc099l nagios: SERVICE ALERT: LTC030M;Page File Usage;WARNING;SOFT;1;No data was received from host!
May 28 14:09:53 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: LTC030M;Page File Usage;WARNING;SOFT;1;xi_service_event_handler
May 28 14:10:01 ltc099l nagios: SERVICE ALERT: LTC030M;Logon Errors;WARNING;SOFT;1;No data was received from host!
May 28 14:10:01 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: LTC030M;Logon Errors;WARNING;SOFT;1;xi_service_event_handler
May 28 14:10:42 ltc099l nagios: SERVICE ALERT: LTC030M;CPU Usage;OK;SOFT;2;CPU Load 3% (5 min average)
May 28 14:10:42 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: LTC030M;CPU Usage;OK;SOFT;2;xi_service_event_handler
May 28 14:10:44 ltc099l nagios: SERVICE ALERT: LTC030M;Page File Usage;OK;SOFT;2;Paging File usage is 55.02 %
May 28 14:10:44 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: LTC030M;Page File Usage;OK;SOFT;2;xi_service_event_handler
May 28 14:10:53 ltc099l nagios: SERVICE ALERT: LTC030M;Logon Errors;OK;SOFT;2;Login Errors since last reboot is 0
May 28 14:10:53 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: LTC030M;Logon Errors;OK;SOFT;2;xi_service_event_handler
May 28 14:25:28 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;Unable to read sensor
May 28 14:25:28 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;xi_service_event_handler
May 28 14:26:13 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Temperature;OK;SOFT;2;Temp: 72.6 F
May 28 14:26:13 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;OK;SOFT;2;xi_service_event_handler
May 28 14:32:18 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Humidity;UNKNOWN;SOFT;1;Unable to read sensor
May 28 14:32:18 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Humidity;UNKNOWN;SOFT;1;xi_service_event_handler
May 28 14:33:03 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Humidity;OK;SOFT;2;Humidity: 39.9%
May 28 14:33:03 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Humidity;OK;SOFT;2;xi_service_event_handler
May 28 14:49:15 ltc099l nagios: HOST ALERT: BAL-RTR-INT;DOWN;SOFT;1;CRITICAL - 10.225.102.1: rta nan, lost 100%
May 28 14:49:15 ltc099l nagios: GLOBAL HOST EVENT HANDLER: BAL-RTR-INT;DOWN;SOFT;1;xi_host_event_handler
May 28 14:50:15 ltc099l nagios: HOST ALERT: BAL-RTR-INT;UP;SOFT;2;OK - 10.225.102.1: rta 60.437ms, lost 0%
May 28 14:50:15 ltc099l nagios: GLOBAL HOST EVENT HANDLER: BAL-RTR-INT;UP;SOFT;2;xi_host_event_handler
May 28 15:03:57 ltc099l nagios: Auto-save of retention data completed successfully.
May 28 15:06:27 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;Unable to read sensor
May 28 15:06:27 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;xi_service_event_handler
May 28 15:07:12 ltc099l nagios: SERVICE ALERT: ATL-Websensor;Temperature;OK;SOFT;2;Temp: 70.0 F
May 28 15:07:12 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;OK;SOFT;2;xi_service_event_handler
May 28 15:15:51 ltc099l nagios: SERVICE ALERT: LTC030M;CPU Usage;WARNING;SOFT;1;No data was received from host!
May 28 15:15:51 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: LTC030M;CPU Usage;WARNING;SOFT;1;xi_service_event_handler
May 28 15:15:52 ltc099l nagios: SERVICE ALERT: LTC030M;Page File Usage;WARNING;SOFT;1;No data was received from host!
May 28 15:15:52 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: LTC030M;Page File Usage;WARNING;SOFT;1;xi_service_event_handler
May 28 15:16:42 ltc099l nagios: SERVICE ALERT: LTC030M;CPU Usage;OK;SOFT;2;CPU Load 2% (5 min average)
May 28 15:16:42 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: LTC030M;CPU Usage;OK;SOFT;2;xi_service_event_handler
May 28 15:16:44 ltc099l nagios: SERVICE ALERT: LTC030M;Page File Usage;OK;SOFT;2;Paging File usage is 54.88 %
May 28 15:16:44 ltc099l nagios: GLOBAL SERVICE EVENT HANDLER: LTC030M;Page File Usage;OK;SOFT;2;xi_service_event_handler
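Every alert in this excerpt is in a SOFT state. Nagios only sends notifications on HARD state changes, so none of these entries would be expected to produce a notification on its own. A quick way to check a longer log for HARD alerts is to tally the state-type field. The sketch below assumes the alert layout shown above (host;service;state;type;... for services, host;state;type;... for hosts), and count_alert_state_types is a name invented for this example.

```shell
# Tally SOFT vs HARD alerts from syslog or nagios.log lines.
# Reads the files given as arguments, or standard input if there are none.
count_alert_state_types() {
    awk -F';' '
        /SERVICE ALERT: / { n[$4]++ }   # host;service;state;TYPE;attempt;output
        /HOST ALERT: /    { n[$3]++ }   # host;state;TYPE;attempt;output
        END { for (t in n) print t, n[t] }
    ' "$@" | sort
}

# Example:
# count_alert_state_types /usr/local/nagios/var/nagios.log
```

The GLOBAL ... EVENT HANDLER lines are skipped on purpose, since they repeat the alert they follow.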
And finally, the nagios.log:
Code: Select all
[root@ltc099l components]# tail -50 /usr/local/nagios/var/nagios.log
[1401302652] SERVICE ALERT: ATL-Websensor;Temperature;OK;SOFT;2;Temp: 73.1 F
[1401302652] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;OK;SOFT;2;xi_service_event_handler
[1401303568] SERVICE ALERT: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;Unable to read sensor
[1401303568] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;xi_service_event_handler
[1401303613] SERVICE ALERT: ATL-Websensor;Temperature;OK;SOFT;2;Temp: 73.5 F
[1401303613] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;OK;SOFT;2;xi_service_event_handler
[1401303837] Auto-save of retention data completed successfully.
[1401303918] SERVICE ALERT: ATL-Websensor;Humidity;UNKNOWN;SOFT;1;Unable to read sensor
[1401303918] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Humidity;UNKNOWN;SOFT;1;xi_service_event_handler
[1401303978] SERVICE ALERT: ATL-Websensor;Humidity;UNKNOWN;SOFT;2;Unable to read sensor
[1401303978] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Humidity;UNKNOWN;SOFT;2;xi_service_event_handler
[1401304023] SERVICE ALERT: ATL-Websensor;Humidity;OK;SOFT;3;Humidity: 40.5%
[1401304023] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Humidity;OK;SOFT;3;xi_service_event_handler
[1401304191] SERVICE ALERT: LTC030M;CPU Usage;WARNING;SOFT;1;No data was received from host!
[1401304191] GLOBAL SERVICE EVENT HANDLER: LTC030M;CPU Usage;WARNING;SOFT;1;xi_service_event_handler
[1401304193] SERVICE ALERT: LTC030M;Page File Usage;WARNING;SOFT;1;No data was received from host!
[1401304193] GLOBAL SERVICE EVENT HANDLER: LTC030M;Page File Usage;WARNING;SOFT;1;xi_service_event_handler
[1401304201] SERVICE ALERT: LTC030M;Logon Errors;WARNING;SOFT;1;No data was received from host!
[1401304201] GLOBAL SERVICE EVENT HANDLER: LTC030M;Logon Errors;WARNING;SOFT;1;xi_service_event_handler
[1401304242] SERVICE ALERT: LTC030M;CPU Usage;OK;SOFT;2;CPU Load 3% (5 min average)
[1401304242] GLOBAL SERVICE EVENT HANDLER: LTC030M;CPU Usage;OK;SOFT;2;xi_service_event_handler
[1401304244] SERVICE ALERT: LTC030M;Page File Usage;OK;SOFT;2;Paging File usage is 55.02 %
[1401304244] GLOBAL SERVICE EVENT HANDLER: LTC030M;Page File Usage;OK;SOFT;2;xi_service_event_handler
[1401304253] SERVICE ALERT: LTC030M;Logon Errors;OK;SOFT;2;Login Errors since last reboot is 0
[1401304253] GLOBAL SERVICE EVENT HANDLER: LTC030M;Logon Errors;OK;SOFT;2;xi_service_event_handler
[1401305128] SERVICE ALERT: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;Unable to read sensor
[1401305128] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;xi_service_event_handler
[1401305173] SERVICE ALERT: ATL-Websensor;Temperature;OK;SOFT;2;Temp: 72.6 F
[1401305173] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;OK;SOFT;2;xi_service_event_handler
[1401305538] SERVICE ALERT: ATL-Websensor;Humidity;UNKNOWN;SOFT;1;Unable to read sensor
[1401305538] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Humidity;UNKNOWN;SOFT;1;xi_service_event_handler
[1401305583] SERVICE ALERT: ATL-Websensor;Humidity;OK;SOFT;2;Humidity: 39.9%
[1401305583] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Humidity;OK;SOFT;2;xi_service_event_handler
[1401306555] HOST ALERT: BAL-RTR-INT;DOWN;SOFT;1;CRITICAL - 10.225.102.1: rta nan, lost 100%
[1401306555] GLOBAL HOST EVENT HANDLER: BAL-RTR-INT;DOWN;SOFT;1;xi_host_event_handler
[1401306615] HOST ALERT: BAL-RTR-INT;UP;SOFT;2;OK - 10.225.102.1: rta 60.437ms, lost 0%
[1401306615] GLOBAL HOST EVENT HANDLER: BAL-RTR-INT;UP;SOFT;2;xi_host_event_handler
[1401307437] Auto-save of retention data completed successfully.
[1401307587] SERVICE ALERT: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;Unable to read sensor
[1401307587] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;UNKNOWN;SOFT;1;xi_service_event_handler
[1401307632] SERVICE ALERT: ATL-Websensor;Temperature;OK;SOFT;2;Temp: 70.0 F
[1401307632] GLOBAL SERVICE EVENT HANDLER: ATL-Websensor;Temperature;OK;SOFT;2;xi_service_event_handler
[1401308151] SERVICE ALERT: LTC030M;CPU Usage;WARNING;SOFT;1;No data was received from host!
[1401308151] GLOBAL SERVICE EVENT HANDLER: LTC030M;CPU Usage;WARNING;SOFT;1;xi_service_event_handler
[1401308152] SERVICE ALERT: LTC030M;Page File Usage;WARNING;SOFT;1;No data was received from host!
[1401308152] GLOBAL SERVICE EVENT HANDLER: LTC030M;Page File Usage;WARNING;SOFT;1;xi_service_event_handler
[1401308202] SERVICE ALERT: LTC030M;CPU Usage;OK;SOFT;2;CPU Load 2% (5 min average)
[1401308202] GLOBAL SERVICE EVENT HANDLER: LTC030M;CPU Usage;OK;SOFT;2;xi_service_event_handler
[1401308204] SERVICE ALERT: LTC030M;Page File Usage;OK;SOFT;2;Paging File usage is 54.88 %
[1401308204] GLOBAL SERVICE EVENT HANDLER: LTC030M;Page File Usage;OK;SOFT;2;xi_service_event_handler
Thank you for taking a look.
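The bracketed numbers in nagios.log are Unix epoch seconds, which makes the file awkward to compare against /var/log/messages. A minimal sketch for converting them, assuming GNU date (for the -d @SECONDS form) and a made-up function name, nagios_log_dates:

```shell
# Rewrite the leading [epoch] stamp on each nagios.log line as a readable date.
# Lines that do not start with a numeric [stamp] are passed through unchanged.
nagios_log_dates() {
    while IFS= read -r line; do
        epoch=${line#\[}        # drop the leading "["
        epoch=${epoch%%\]*}     # keep everything before the first "]"
        case "$line" in
            \[*\]\ *)
                case "$epoch" in
                    ''|*[!0-9]*)
                        printf '%s\n' "$line" ;;   # stamp is not a plain number
                    *)
                        printf '[%s] %s\n' \
                            "$(date -d "@$epoch" '+%Y-%m-%d %H:%M:%S')" \
                            "${line#*\] }" ;;
                esac ;;
            *)
                printf '%s\n' "$line" ;;
        esac
    done
}

# Example:
# tail -50 /usr/local/nagios/var/nagios.log | nagios_log_dates
```

Output is in the server's local time zone unless TZ is set. With TZ=UTC the first entry above comes out as [2014-05-28 18:44:12], which lines up with the 13:44:12 entry in /var/log/messages if the server is five hours behind UTC.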