Database Backend Status Red
Posted: Thu Nov 19, 2020 10:49 pm
Hi,
XI version 5.6.14
Centos 7.7 (64-bit)
VMware image
No special config
Discovered today that our instance of XI is showing status Red for the system component "Database Backend". I was due to upgrade to the latest version today anyway, so I took a VM snapshot and proceeded with the upgrade. The upgrade appeared to succeed, but afterwards I was unable to log in to, or really even view, the GUI. There was an SQL error as well as a notice advising that I needed to reactivate the license. I tried a number of things I'd found in forum posts but nothing worked, so I reverted to the snapshot. Up to this point I had tried the upgrade twice, once via the GUI and once via the command line (based on this document: https://assets.nagios.com/downloads/nag ... ctions.pdf). Same result both times. I also did a component-specific upgrade of ndo after the second full upgrade attempt.
Current status is that I'm back on 5.6.14 with everything else looking OK (I can log in and view all XI elements again), but the Database Backend component is still showing status Red. I figure, based on my earlier experience, that I need to solve this before I attempt the upgrade again.
Once again, I've tried a number of things I found in forum posts that seemed to be related, but no change.
This is a DR instance and the only thing it monitors is the Production instance (XI 5.7.3), so this is not super-urgent. The Production instance is showing a check result for the DR instance service "Nagios XI Daemons" of "ndo2db (Database Backend) stopped". I've searched a number of forum posts related to ndo2db but have not yet found a solution that works.
I can't remember whether 5.6.14 is still supposed to have a running ndo2db service, but when I try to start it I receive the error "Failed to start ndo2db.service: Unit not found". Also, the Nagios log contains a number of these messages:
[Fri Nov 20 12:17:33 2020] ndomod: Could not open data sink! I'll keep trying, but some output may get lost...
[Fri Nov 20 12:17:33 2020] ndomod registered for process data
[Fri Nov 20 12:17:33 2020] ndomod registered for log data'
[Fri Nov 20 12:17:33 2020] ndomod registered for system command data'
[Fri Nov 20 12:17:33 2020] ndomod registered for event handler data'
[Fri Nov 20 12:17:33 2020] ndomod registered for notification data'
[Fri Nov 20 12:17:33 2020] ndomod registered for comment data'
[Fri Nov 20 12:17:33 2020] ndomod registered for downtime data'
[Fri Nov 20 12:17:33 2020] ndomod registered for flapping data'
[Fri Nov 20 12:17:33 2020] ndomod registered for program status data'
[Fri Nov 20 12:17:33 2020] ndomod registered for host status data'
[Fri Nov 20 12:17:33 2020] ndomod registered for service status data'
[Fri Nov 20 12:17:33 2020] ndomod registered for adaptive program data'
[Fri Nov 20 12:17:33 2020] ndomod registered for adaptive host data'
[Fri Nov 20 12:17:33 2020] ndomod registered for adaptive service data'
[Fri Nov 20 12:17:33 2020] ndomod registered for external command data'
[Fri Nov 20 12:17:33 2020] ndomod registered for aggregated status data'
[Fri Nov 20 12:17:33 2020] ndomod registered for retention data'
[Fri Nov 20 12:17:33 2020] ndomod registered for contact data'
[Fri Nov 20 12:17:33 2020] ndomod registered for contact notification data'
[Fri Nov 20 12:17:33 2020] ndomod registered for acknowledgement data'
[Fri Nov 20 12:17:33 2020] ndomod registered for state change data'
[Fri Nov 20 12:17:33 2020] ndomod registered for contact status data'
[Fri Nov 20 12:17:33 2020] ndomod registered for adaptive contact data'
[Fri Nov 20 12:17:33 2020] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
[Fri Nov 20 12:17:37 2020] Warning: Return code of 4 for check of service 'Service Status - ndo2db' on host 'localhost' was out of bounds.
[Fri Nov 20 12:22:37 2020] Warning: Return code of 4 for check of service 'Service Status - ndo2db' on host 'localhost' was out of bounds.
[Fri Nov 20 12:27:37 2020] Warning: Return code of 4 for check of service 'Service Status - ndo2db' on host 'localhost' was out of bounds.
[Fri Nov 20 12:32:37 2020] Warning: Return code of 4 for check of service 'Service Status - ndo2db' on host 'localhost' was out of bounds.
[Fri Nov 20 12:32:48 2020] ndomod: Still unable to connect to data sink. 853 items lost, 5000 queued items to flush.
[Fri Nov 20 12:37:37 2020] Warning: Return code of 4 for check of service 'Service Status - ndo2db' on host 'localhost' was out of bounds.
[Fri Nov 20 12:42:37 2020] Warning: Return code of 4 for check of service 'Service Status - ndo2db' on host 'localhost' was out of bounds.
[Fri Nov 20 12:47:37 2020] Warning: Return code of 4 for check of service 'Service Status - ndo2db' on host 'localhost' was out of bounds.
[Fri Nov 20 12:47:49 2020] ndomod: Still unable to connect to data sink. 2092 items lost, 5000 queued items to flush.
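In case it helps anyone follow how quickly the "items lost" count is growing, here is a minimal Python sketch that pulls the counts out of the "Still unable to connect to data sink" lines. It only parses the two lines pasted above; the function name data_sink_losses and the regular expression are my own, and reading from the real nagios.log instead of the sample list is left as an assumption about where that file lives on your system.

```python
import re

# Matches ndomod lines of the form shown in the log excerpt above, e.g.
# [Fri Nov 20 12:32:48 2020] ndomod: Still unable to connect to data sink. 853 items lost, 5000 queued items to flush.
SINK_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] ndomod: Still unable to connect to data sink\. "
    r"(?P<lost>\d+) items lost, (?P<queued>\d+) queued items to flush\."
)


def data_sink_losses(lines):
    """Return a list of (timestamp, items_lost, items_queued) tuples.

    Hypothetical helper: lines that do not match the pattern are skipped.
    """
    results = []
    for line in lines:
        match = SINK_RE.match(line.strip())
        if match:
            results.append(
                (match.group("ts"), int(match.group("lost")), int(match.group("queued")))
            )
    return results


# Sample input copied from the log excerpt in this post.
sample = [
    "[Fri Nov 20 12:32:37 2020] Warning: Return code of 4 for check of service "
    "'Service Status - ndo2db' on host 'localhost' was out of bounds.",
    "[Fri Nov 20 12:32:48 2020] ndomod: Still unable to connect to data sink. "
    "853 items lost, 5000 queued items to flush.",
    "[Fri Nov 20 12:47:49 2020] ndomod: Still unable to connect to data sink. "
    "2092 items lost, 5000 queued items to flush.",
]

for ts, lost, queued in data_sink_losses(sample):
    print(f"{ts}: {lost} lost, {queued} queued")
```

On the two lines above, the lost count goes from 853 to 2092 in roughly 15 minutes while the queue stays at 5000, which looks consistent with ndomod never getting a connection to ndo2db.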
Any help appreciated and thanks in advance.
Ben.
[UPDATE] I assume it's related, but I've just noticed that the XI GUI doesn't appear to be updating Hosts/Services. I had to go into the Core GUI to acknowledge the ndo2db service problem; the XI GUI still shows it as OK, and the Last Check timestamp is very old for all Hosts/Services.