Nagios keep failing and requesting DB repair every 20-30 min

Posted: Thu Jan 21, 2021 6:40 pm
by dlukinski
Hello

Our Nagios installation keeps failing and requesting DB repair every 20-30 min.
Repairs and reboots don't help.

What to do next?

Re: Nagios keep failing and requesting DB repair every 20-30

Posted: Fri Jan 22, 2021 12:29 pm
by benjaminsmith
Hi @dlukinski,

A table may be corrupted beyond repair, or temporary files may be preventing the repair script from completing successfully.

1. Run the repair script as root and post the full output to the thread so we can review the errors.

2. PM the system profile. If the database is offloaded, please retrieve the log from the remote server.

3. Please post the full output of the following command to check the table sizes.

- NOTE: You may need to adjust -h 127.0.0.1, -uroot, and -pnagiosxi in the command below if your DB is offloaded to another server and/or you've changed the root MySQL password.

Code:

echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
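To make the largest tables easier to spot, the same query can be sorted by size. This is only a sketch: the `table_size_query` helper name and the `LIMIT 10` are illustrative choices, not part of Nagios XI, and the host and credentials in the usage comment are the defaults from the command above.

```shell
# Print a query that lists the largest tables first.
# Schema names are the ones used in the thread; LIMIT 10 is an arbitrary choice.
table_size_query() {
    cat <<'SQL'
SELECT table_schema AS 'Database',
       table_name AS 'Table',
       ROUND((data_length + index_length) / 1024 / 1024, 2) AS 'Size in MB'
FROM information_schema.TABLES
WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi')
ORDER BY (data_length + index_length) DESC
LIMIT 10;
SQL
}

# Usage (adjust host and credentials as noted above):
# table_size_query | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
```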
Best Regards,
Benjamin

Re: Nagios keep failing and requesting DB repair every 20-30

Posted: Wed Jan 27, 2021 9:20 am
by dlukinski
Please see the repair and table-size output below (the repair always completes, but the repair request comes back).

Code:

login as: kcadmin
Keyboard-interactive authentication prompts from server:
| Password:
End of keyboard-interactive prompts from server
Last login: Wed Jan 27 14:13:01 2021 from pf1rfv2r.res.kcg.global
[kcadmin@eukc-nagxiprod01 ~]$ sudo /usr/local/nagiosxi/scripts/repair_databases.sh
[sudo] password for kcadmin:
DATABASE: nagios
TABLE:
/var/lib/mysql/nagios /home/kcadmin
- recovering (with sort) MyISAM-table 'nagios_acknowledgements'
Data records: 151
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_commands'
Data records: 137
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_commenthistory'
Data records: 23331
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_comments'
Data records: 3
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_configfiles'
Data records: 1
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_configfilevariables'
Data records: 131
- Fixing index 1

---------

- recovering (with sort) MyISAM-table 'nagios_conninfo'
Data records: 413
- Fixing index 1

---------

- recovering (with sort) MyISAM-table 'nagios_contact_addresses'
Data records: 70
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_contactgroup_members'
Data records: 156
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_contactgroups'
Data records: 17
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_contact_notificationcommands'
Data records: 136
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_contactnotificationmethods'
Data records: 8393
- Fixing index 1
- Fixing index 2
- Fixing index 3

---------

- recovering (with sort) MyISAM-table 'nagios_contactnotifications'
Data records: 8393
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4

---------

- recovering (with sort) MyISAM-table 'nagios_contacts'
Data records: 68
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_contactstatus'
Data records: 68
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_customvariables'
Data records: 202
- Fixing index 1
- Fixing index 2
- Fixing index 3

---------

- recovering (with sort) MyISAM-table 'nagios_customvariablestatus'
Data records: 202
- Fixing index 1
- Fixing index 2
- Fixing index 3

---------

- recovering (with keycache) MyISAM-table 'nagios_dbversion'
Data records: 1

---------

- recovering (with sort) MyISAM-table 'nagios_downtimehistory'
Data records: 9326
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_eventhandlers'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_externalcommands'
Data records: 7
- Fixing index 1

---------

- recovering (with sort) MyISAM-table 'nagios_flappinghistory'
Data records: 8110
- Fixing index 1

---------

- recovering (with sort) MyISAM-table 'nagios_hostchecks'
Data records: 32
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_host_contactgroups'
Data records: 23
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_host_contacts'
Data records: 7
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_hostdependencies'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_hostescalation_contactgroups'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_hostescalation_contacts'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_hostescalations'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_hostgroup_members'
Data records: 18
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_hostgroups'
Data records: 10
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_host_parenthosts'
Data records: 1
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_hosts'
Data records: 23
- Fixing index 1
- Fixing index 2
- Fixing index 3

---------

- recovering (with sort) MyISAM-table 'nagios_hoststatus'
Data records: 23
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4
- Fixing index 5
- Fixing index 6
- Fixing index 7
- Fixing index 8
- Fixing index 9
- Fixing index 10
- Fixing index 11
- Fixing index 12
- Fixing index 13
- Fixing index 14
- Fixing index 15
- Fixing index 16
- Fixing index 17
- Fixing index 18
- Fixing index 19

---------

- recovering (with sort) MyISAM-table 'nagios_instances'
Data records: 1
- Fixing index 1

---------

- recovering (with sort) MyISAM-table 'nagios_logentries'
Data records: 197851
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4

---------

- recovering (with sort) MyISAM-table 'nagios_notifications'
Data records: 7436
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4

---------

- recovering (with sort) MyISAM-table 'nagios_objects'
Data records: 1009
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4
- Fixing index 5

---------

- recovering (with sort) MyISAM-table 'nagios_processevents'
Data records: 708
- Fixing index 1

---------

- recovering (with sort) MyISAM-table 'nagios_programstatus'
Data records: 1
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_runtimevariables'
Data records: 17
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_scheduleddowntime'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_servicechecks'
Data records: 145
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4

---------

- recovering (with sort) MyISAM-table 'nagios_service_contactgroups'
Data records: 185
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_service_contacts'
Data records: 100
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_servicedependencies'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_serviceescalation_contactgroups'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_serviceescalation_contacts'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_serviceescalations'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_servicegroup_members'
Data records: 308
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_servicegroups'
Data records: 59
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_service_parentservices'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_services'
Data records: 191
- Fixing index 1
- Fixing index 2
- Fixing index 3

---------

- recovering (with sort) MyISAM-table 'nagios_servicestatus'
Data records: 191
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4
- Fixing index 5
- Fixing index 6
- Fixing index 7
- Fixing index 8
- Fixing index 9
- Fixing index 10
- Fixing index 11
- Fixing index 12
- Fixing index 13
- Fixing index 14
- Fixing index 15
- Fixing index 16
- Fixing index 17
- Fixing index 18
- Fixing index 19

---------

- recovering (with sort) MyISAM-table 'nagios_statehistory'
Data records: 954605
- Fixing index 1
- Fixing index 2
- Fixing index 3

---------

- recovering (with sort) MyISAM-table 'nagios_systemcommands'
Data records: 60
- Fixing index 1
- Fixing index 2
- Fixing index 3

---------

- recovering (with sort) MyISAM-table 'nagios_timedeventqueue'
Data records: 0
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4
- Fixing index 5
- Fixing index 6

---------

- recovering (with sort) MyISAM-table 'nagios_timedevents'
Data records: 0
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4
- Fixing index 5
- Fixing index 6

---------

- recovering (with sort) MyISAM-table 'nagios_timeperiods'
Data records: 75
- Fixing index 1
- Fixing index 2

---------

- recovering (with sort) MyISAM-table 'nagios_timeperiod_timeranges'
Data records: 502
- Fixing index 1
- Fixing index 2
/home/kcadmin

===============
REPAIR COMPLETE
===============
DATABASE: nagiosql
TABLE:
/var/lib/mysql/nagiosql /home/kcadmin
DATABASE: nagiosxi
TABLE:
/var/lib/mysql/nagiosxi /home/kcadmin

=======================
nagios database repair succeeded

[kcadmin@eukc-nagxiprod01 ~]$ sudo echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
+--------------------------------------------+------------+
| Table                                      | Size in MB |
+--------------------------------------------+------------+
| nagios_acknowledgements                    |       0.02 |
| nagios_commands                            |       0.02 |
| nagios_commenthistory                      |       6.51 |
| nagios_comments                            |       0.00 |
| nagios_configfiles                         |       0.01 |
| nagios_configfilevariables                 |       0.01 |
| nagios_conninfo                            |       0.05 |
| nagios_contact_addresses                   |       0.00 |
| nagios_contact_notificationcommands        |       0.01 |
| nagios_contactgroup_members                |       0.01 |
| nagios_contactgroups                       |       0.00 |
| nagios_contactnotificationmethods          |       0.72 |
| nagios_contactnotifications                |       0.77 |
| nagios_contacts                            |       0.01 |
| nagios_contactstatus                       |       0.01 |
| nagios_customvariables                     |       0.02 |
| nagios_customvariablestatus                |       0.02 |
| nagios_dbversion                           |       0.00 |
| nagios_downtimehistory                     |       1.38 |
| nagios_eventhandlers                       |       0.00 |
| nagios_externalcommands                    |       0.00 |
| nagios_flappinghistory                     |       0.58 |
| nagios_host_contactgroups                  |       0.00 |
| nagios_host_contacts                       |       0.00 |
| nagios_host_parenthosts                    |       0.00 |
| nagios_hostchecks                          |       0.01 |
| nagios_hostdependencies                    |       0.00 |
| nagios_hostescalation_contactgroups        |       0.00 |
| nagios_hostescalation_contacts             |       0.00 |
| nagios_hostescalations                     |       0.00 |
| nagios_hostgroup_members                   |       0.00 |
| nagios_hostgroups                          |       0.00 |
| nagios_hosts                               |       0.01 |
| nagios_hoststatus                          |       0.03 |
| nagios_instances                           |       0.00 |
| nagios_logentries                          |      48.20 |
| nagios_notifications                       |       1.42 |
| nagios_objects                             |       0.16 |
| nagios_processevents                       |       0.04 |
| nagios_programstatus                       |       0.00 |
| nagios_runtimevariables                    |       0.00 |
| nagios_scheduleddowntime                   |       0.00 |
| nagios_service_contactgroups               |       0.01 |
| nagios_service_contacts                    |       0.01 |
| nagios_service_parentservices              |       0.00 |
| nagios_servicechecks                       |       0.04 |
| nagios_servicedependencies                 |       0.00 |
| nagios_serviceescalation_contactgroups     |       0.00 |
| nagios_serviceescalation_contacts          |       0.00 |
| nagios_serviceescalations                  |       0.00 |
| nagios_servicegroup_members                |       0.02 |
| nagios_servicegroups                       |       0.01 |
| nagios_services                            |       0.05 |
| nagios_servicestatus                       |       0.13 |
| nagios_statehistory                        |     108.78 |
| nagios_systemcommands                      |       0.01 |
| nagios_timedeventqueue                     |       0.00 |
| nagios_timedevents                         |       0.00 |
| nagios_timeperiod_timeranges               |       0.03 |
| nagios_timeperiods                         |       0.01 |
| tbl_command                                |       0.06 |
| tbl_contact                                |       0.03 |
| tbl_contactgroup                           |       0.03 |
| tbl_contacttemplate                        |       0.03 |
| tbl_domain                                 |       0.03 |
| tbl_host                                   |       0.03 |
| tbl_hostdependency                         |       0.03 |
| tbl_hostescalation                         |       0.03 |
| tbl_hostextinfo                            |       0.03 |
| tbl_hostgroup                              |       0.03 |
| tbl_hosttemplate                           |       0.03 |
| tbl_info                                   |       0.17 |
| tbl_lnkContactToCommandHost                |       0.02 |
| tbl_lnkContactToCommandService             |       0.02 |
| tbl_lnkContactToContactgroup               |       0.02 |
| tbl_lnkContactToContacttemplate            |       0.02 |
| tbl_lnkContactToVariabledefinition         |       0.02 |
| tbl_lnkContactgroupToContact               |       0.02 |
| tbl_lnkContactgroupToContactgroup          |       0.02 |
| tbl_lnkContacttemplateToCommandHost        |       0.02 |
| tbl_lnkContacttemplateToCommandService     |       0.02 |
| tbl_lnkContacttemplateToContactgroup       |       0.02 |
| tbl_lnkContacttemplateToContacttemplate    |       0.02 |
| tbl_lnkContacttemplateToVariabledefinition |       0.02 |
| tbl_lnkHostToContact                       |       0.02 |
| tbl_lnkHostToContactgroup                  |       0.02 |
| tbl_lnkHostToHost                          |       0.02 |
| tbl_lnkHostToHostgroup                     |       0.02 |
| tbl_lnkHostToHosttemplate                  |       0.02 |
| tbl_lnkHostToVariabledefinition            |       0.02 |
| tbl_lnkHostdependencyToHost_DH             |       0.02 |
| tbl_lnkHostdependencyToHost_H              |       0.02 |
| tbl_lnkHostdependencyToHostgroup_DH        |       0.02 |
| tbl_lnkHostdependencyToHostgroup_H         |       0.02 |
| tbl_lnkHostescalationToContact             |       0.02 |
| tbl_lnkHostescalationToContactgroup        |       0.02 |
| tbl_lnkHostescalationToHost                |       0.02 |
| tbl_lnkHostescalationToHostgroup           |       0.02 |
| tbl_lnkHostgroupToHost                     |       0.02 |
| tbl_lnkHostgroupToHostgroup                |       0.02 |
| tbl_lnkHosttemplateToContact               |       0.02 |
| tbl_lnkHosttemplateToContactgroup          |       0.02 |
| tbl_lnkHosttemplateToHost                  |       0.02 |
| tbl_lnkHosttemplateToHostgroup             |       0.02 |
| tbl_lnkHosttemplateToHosttemplate          |       0.02 |
| tbl_lnkHosttemplateToVariabledefinition    |       0.02 |
| tbl_lnkServiceToContact                    |       0.02 |
| tbl_lnkServiceToContactgroup               |       0.02 |
| tbl_lnkServiceToHost                       |       0.02 |
| tbl_lnkServiceToHostgroup                  |       0.02 |
| tbl_lnkServiceToServicegroup               |       0.02 |
| tbl_lnkServiceToServicetemplate            |       0.02 |
| tbl_lnkServiceToVariabledefinition         |       0.02 |
| tbl_lnkServicedependencyToHost_DH          |       0.02 |
| tbl_lnkServicedependencyToHost_H           |       0.02 |
| tbl_lnkServicedependencyToHostgroup_DH     |       0.02 |
| tbl_lnkServicedependencyToHostgroup_H      |       0.02 |
| tbl_lnkServicedependencyToService_DS       |       0.02 |
| tbl_lnkServicedependencyToService_S        |       0.02 |
| tbl_lnkServicedependencyToServicegroup_DS  |       0.02 |
| tbl_lnkServicedependencyToServicegroup_S   |       0.02 |
| tbl_lnkServiceescalationToContact          |       0.02 |
| tbl_lnkServiceescalationToContactgroup     |       0.02 |
| tbl_lnkServiceescalationToHost             |       0.02 |
| tbl_lnkServiceescalationToHostgroup        |       0.02 |
| tbl_lnkServiceescalationToService          |       0.02 |
| tbl_lnkServiceescalationToServicegroup     |       0.02 |
| tbl_lnkServicegroupToService               |       0.02 |
| tbl_lnkServicegroupToServicegroup          |       0.02 |
| tbl_lnkServicetemplateToContact            |       0.02 |
| tbl_lnkServicetemplateToContactgroup       |       0.02 |
| tbl_lnkServicetemplateToHost               |       0.02 |
| tbl_lnkServicetemplateToHostgroup          |       0.02 |
| tbl_lnkServicetemplateToServicegroup       |       0.02 |
| tbl_lnkServicetemplateToServicetemplate    |       0.02 |
| tbl_lnkServicetemplateToVariabledefinition |       0.02 |
| tbl_lnkTimeperiodToTimeperiod              |       0.02 |
| tbl_logbook                                |       0.02 |
| tbl_mainmenu                               |       0.02 |
| tbl_permission                             |       0.02 |
| tbl_permission_inactive                    |       0.02 |
| tbl_service                                |       0.08 |
| tbl_servicedependency                      |       0.03 |
| tbl_serviceescalation                      |       0.03 |
| tbl_serviceextinfo                         |       0.03 |
| tbl_servicegroup                           |       0.03 |
| tbl_servicetemplate                        |       0.03 |
| tbl_session                                |       0.02 |
| tbl_session_locks                          |       0.02 |
| tbl_settings                               |       0.03 |
| tbl_submenu                                |       0.02 |
| tbl_timedefinition                         |       0.06 |
| tbl_timeperiod                             |       0.03 |
| tbl_user                                   |       0.03 |
| tbl_variabledefinition                     |       0.02 |
| xi_auditlog                                |    3319.09 |
| xi_auth_tokens                             |       0.03 |
| xi_cmp_ccm_backups                         |       0.02 |
| xi_cmp_favorites                           |       0.03 |
| xi_cmp_nagiosbpi_backups                   |       0.02 |
| xi_cmp_trapdata                            |       0.03 |
| xi_cmp_trapdata_log                        |       0.03 |
| xi_commands                                |       0.02 |
| xi_deploy_agents                           |       0.02 |
| xi_deploy_jobs                             |       0.02 |
| xi_eventqueue                              |       0.03 |
| xi_events                                  |       0.08 |
| xi_incidents                               |       0.02 |
| xi_meta                                    |       1.34 |
| xi_mibs                                    |       0.05 |
| xi_options                                 |       0.03 |
| xi_sessions                                |       0.03 |
| xi_sysstat                                 |       0.03 |
| xi_usermeta                                |       0.84 |
| xi_users                                   |       0.06 |
+--------------------------------------------+------------+
[kcadmin@eukc-nagxiprod01 ~]$
Moderator's Note: The profile has been shared with the support team but has been removed from the public forum.

Re: Nagios keep failing and requesting DB repair every 20-30

Posted: Thu Jan 28, 2021 2:40 pm
by benjaminsmith
Hi,

Thanks for the profile. I have a few suggestions to help resolve this.

You are currently on 5.7.5, and we have released 5.8.8.1 which has some important performance updates. Please try to bring this system up to the latest version and let me know if you notice an improvement.

The actual database log looks normal, but there are some connection issues in the Apache logs.

Code:

  <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
There are also some errors from the new version of NDO in nagios.log:

Code:

 NDO-3: mysql_ping: Unknown error. Is the database running?
[1611756826] NDO-3: ndo_get_object_id_name2(ndo.c:1283): Could not reconnect to MySQL database
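To see how often these errors occur and when, the log lines can be counted and the epoch timestamp converted. A minimal sketch: `count_matches` is a hypothetical helper, the log paths in the usage comments are the usual defaults on a RHEL/CentOS Nagios XI install and may differ on your system, and `date -d @...` assumes GNU date.

```shell
# Count the lines in a log file that contain a fixed string.
count_matches() {
    grep -cF -- "$1" "$2"
}

# Usage (paths are assumptions, adjust as needed):
# count_matches 'MySQL server has gone away' /var/log/httpd/error_log
# count_matches 'Could not reconnect to MySQL database' /usr/local/nagios/var/nagios.log

# Convert the bracketed nagios.log timestamp to a readable UTC time (GNU date):
# date -u -d @1611756826    # 14:13:46 UTC on 27 Jan 2021
```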
I also noticed the xi_auditlog table is quite large (about 3.3 GB in the table-size output), and I would recommend decreasing the amount of historical data that is kept. You can do this by going to Admin > System Config > Performance Settings > XI Database > Max Audit Log Age.
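Oversized tables can also be picked out of the size listing automatically. A minimal sketch, assuming the `mysql --table` output format shown earlier in the thread; `large_tables` and its 100 MB default threshold are illustrative, not Nagios XI settings.

```shell
# Read `mysql --table` output on stdin and print "name size" for every
# table whose size in MB exceeds the threshold (default 100).
large_tables() {
    threshold="${1:-100}"
    awk -F'|' -v min="$threshold" '
        # Data rows look like "| name | size |", which gives 4 fields.
        NF >= 4 && $3 + 0 > min {
            gsub(/ /, "", $2)            # strip column padding from the name
            printf "%s %s\n", $2, $3 + 0
        }'
}

# Usage (reuses the size query from earlier in the thread):
# echo "SELECT ... ;" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table | large_tables 100
```

Fed the listing posted above, this would report only nagios_statehistory and xi_auditlog.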

You can also truncate the table. However, this will clear out any historical data.

Code:

echo 'TRUNCATE TABLE xi_auditlog;' | mysql -u root -pnagiosxi nagiosxi
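Because truncating discards the audit history, it may be worth dumping the table first if that history might be needed later. A sketch under assumptions: the `/tmp` destination and both helper names are hypothetical, the credentials are the defaults from the command above, and the dump needs free disk space on the order of the table size, which is large here.

```shell
# Hypothetical helper: build a dated backup file name for a table dump.
backup_name() {
    printf '%s_%s.sql.gz\n' "$1" "$(date +%Y%m%d)"
}

# Dump xi_auditlog to a compressed file before truncating it.
# Defined only; call it manually once the destination has enough free space.
backup_auditlog() {
    mysqldump -u root -pnagiosxi nagiosxi xi_auditlog \
        | gzip > "/tmp/$(backup_name xi_auditlog)"
}
```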
Please try those changes. If the issue persists, let's downgrade to the previous version of NDO (instructions below).

Code:

systemctl stop nagios
cd /tmp
rm -rf /tmp/nagiosxi
wget https://assets.nagios.com/downloads/nagiosxi/5/xi-5.6.14.tar.gz
tar zxf xi-5.6.14.tar.gz
cd /tmp/nagiosxi/subcomponents/ndoutils
./install
systemctl enable ndo2db
Then edit your /usr/local/nagios/etc/nagios.cfg and make sure this line is uncommented:

Code:

broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
Make sure this line is commented out:

Code:

#broker_module=/usr/local/nagios/bin/ndo.so /usr/local/nagios/etc/ndo.cfg
Then start the nagios service:

Code:

systemctl start nagios
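
Before starting the service, it can help to confirm which broker module is active and that the configuration parses. A minimal sketch; `active_broker_modules` is a hypothetical helper, and the paths are the ones used in the steps above.

```shell
# Print broker_module lines that are not commented out.
active_broker_modules() {
    grep -E '^[[:space:]]*broker_module=' "$1"
}

# Usage:
# active_broker_modules /usr/local/nagios/etc/nagios.cfg
#
# Nagios can also verify the whole configuration without starting:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
```

If the nagios.cfg edits described above were applied, the ndomod.o line should be listed and the ndo.so line should not.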

Re: Nagios keep failing and requesting DB repair every 20-30

Posted: Fri Jan 29, 2021 3:29 pm
by dlukinski
Hello,

Nagios monitoring refuses to start after the NDO changes.
This is critical for us.

What should we do?

Re: Nagios keep failing and requesting DB repair every 20-30

Posted: Fri Jan 29, 2021 4:03 pm
by dlukinski
Ended up upgrading ndo to 5.8 instead. The monitoring engine is running so far

Re: Nagios keep failing and requesting DB repair every 20-30

Posted: Fri Jan 29, 2021 5:16 pm
by benjaminsmith
Hi,
dlukinski wrote: Ended up upgrading ndo to 5.8 instead. The monitoring engine is running so far
That's good to hear. We'll keep this open; just update us if you have any issues.

Have a good weekend!

--Benjamin

Re: Nagios keep failing and requesting DB repair every 20-30

Posted: Mon Feb 08, 2021 6:05 pm
by dlukinski
Thank you. I think we could close it now.

So again, we had NDO troubles on 5.7.5. Downgrading to the NDO from 5.6.14 broke the monitoring engine.
Upgrading to the NDO from 5.8 (while keeping the XI version at 5.7.5) worked. Wish I knew why.

Re: Nagios keep failing and requesting DB repair every 20-30

Posted: Tue Feb 09, 2021 3:59 pm
by benjaminsmith
Hi,
dlukinski wrote: Thank you. I think we could close it now.
Thanks for the update on this.

I'm not 100% sure why either, but if you have any further issues with this, feel free to open a new ticket or post.