Nagios XI Crash multiple time per week
Posted: Fri Apr 09, 2021 6:55 am
by bennyboy
Hi, we have to run /usr/local/nagiosxi/scripts/repair_databases.sh multiple times per week. Can you help us find out why and fix it, please?
Where can I send the profile information privately?
Thx!
Re: Nagios XI Crash multiple time per week
Posted: Fri Apr 09, 2021 8:34 am
by bennyboy
I will update our instance to the latest version and see.
Re: Nagios XI Crash multiple time per week
Posted: Fri Apr 09, 2021 8:43 am
by bennyboy
Re: Nagios XI Crash multiple time per week
Posted: Fri Apr 09, 2021 2:46 pm
by vtrac
Hi bennyboy,
Yes, you can follow the steps in the KB article to convert all your database tables to the InnoDB storage engine:
https://support.nagios.com/kb/article/d ... i-896.html
There are benefits to using InnoDB:
https://dev.mysql.com/doc/refman/5.7/en ... efits.html
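To see which tables would still need converting, one option is to ask information_schema for every table whose engine is not InnoDB. The following is only a sketch: it assumes the schema names nagios, nagiosql and nagiosxi used in the size query in this thread, the engine_query helper name is made up for illustration, and it prints the query rather than running it.

```shell
#!/bin/sh
# Print a query listing every table in the Nagios XI schemas that is not InnoDB.
# engine_query is a hypothetical helper name, not part of Nagios XI.
engine_query() {
    cat <<'SQL'
SELECT table_schema, table_name, engine
FROM information_schema.TABLES
WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi')
  AND engine <> 'InnoDB'
ORDER BY table_schema, table_name;
SQL
}

# Show the query. To run it, pipe it into the mysql client instead, e.g.
#   engine_query | mysql -h 127.0.0.1 -uroot -p --table
engine_query
```

An empty result after the conversion would suggest that no MyISAM tables are left in those three schemas.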
You are free to upgrade your Nagios XI to the latest released version; the instructions are below:
https://assets.nagios.com/downloads/nag ... ctions.pdf
Please run the command below and post the output in this thread:
Code: Select all
echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
Please upload or PM me the "profile.zip".
I'm not going to ask you to run the "/usr/local/nagiosxi/scripts/repair_databases.sh" script, since you said you have been running it a couple of times a week. However, please upload the output of that script if you happen to have it.
Best Regards,
Vinh
Re: Nagios XI Crash multiple time per week
Posted: Sat Apr 10, 2021 9:01 am
by bennyboy
vtrac wrote: Please run the command below and post the output in this thread:
Code: Select all
echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
This is the output of the SELECT you asked for:
Code: Select all
+--------------------------------------------+------------+
| Table | Size in MB |
+--------------------------------------------+------------+
| nagios_acknowledgements | 2.91 |
| nagios_commands | 0.06 |
| nagios_commenthistory | 4692.00 |
| nagios_comments | 1.61 |
| nagios_configfiles | 0.03 |
| nagios_configfilevariables | 0.02 |
| nagios_conninfo | 1.52 |
| nagios_contact_addresses | 0.03 |
| nagios_contact_notificationcommands | 0.11 |
| nagios_contactgroup_members | 0.03 |
| nagios_contactgroups | 0.03 |
| nagios_contactnotificationmethods | 6.55 |
| nagios_contactnotifications | 9.06 |
| nagios_contacts | 0.03 |
| nagios_contactstatus | 0.03 |
| nagios_customvariables | 4.55 |
| nagios_customvariablestatus | 5.55 |
| nagios_dbversion | 0.02 |
| nagios_downtimehistory | 225.31 |
| nagios_eventhandlers | 0.06 |
| nagios_externalcommands | 3.52 |
| nagios_flappinghistory | 13.52 |
| nagios_host_contactgroups | 0.64 |
| nagios_host_contacts | 0.30 |
| nagios_host_parenthosts | 0.19 |
| nagios_hostchecks | 0.03 |
| nagios_hostdependencies | 0.03 |
| nagios_hostescalation_contactgroups | 0.03 |
| nagios_hostescalation_contacts | 0.03 |
| nagios_hostescalations | 0.03 |
| nagios_hostgroup_members | 0.50 |
| nagios_hostgroups | 0.09 |
| nagios_hosts | 1.77 |
| nagios_hoststatus | 4.64 |
| nagios_instances | 0.02 |
| nagios_logentries | 1768.48 |
| nagios_notifications | 9.02 |
| nagios_objects | 11.58 |
| nagios_processevents | 1.52 |
| nagios_programstatus | 0.03 |
| nagios_runtimevariables | 0.03 |
| nagios_scheduleddowntime | 0.56 |
| nagios_service_contactgroups | 3.03 |
| nagios_service_contacts | 1.86 |
| nagios_service_parentservices | 0.03 |
| nagios_servicechecks | 0.06 |
| nagios_servicedependencies | 0.03 |
| nagios_serviceescalation_contactgroups | 0.03 |
| nagios_serviceescalation_contacts | 0.03 |
| nagios_serviceescalations | 0.03 |
| nagios_servicegroup_members | 0.25 |
| nagios_servicegroups | 0.03 |
| nagios_services | 6.41 |
| nagios_servicestatus | 17.03 |
| nagios_statehistory | 1687.42 |
| nagios_systemcommands | 0.16 |
| nagios_timedeventqueue | 0.09 |
| nagios_timedevents | 0.09 |
| nagios_timeperiod_timeranges | 0.03 |
| nagios_timeperiods | 0.03 |
| tbl_command | 0.08 |
| tbl_contact | 0.03 |
| tbl_contactgroup | 0.03 |
| tbl_contacttemplate | 0.03 |
| tbl_domain | 0.03 |
| tbl_host | 1.73 |
| tbl_hostdependency | 0.03 |
| tbl_hostescalation | 0.03 |
| tbl_hostextinfo | 0.03 |
| tbl_hostgroup | 0.11 |
| tbl_hosttemplate | 0.03 |
| tbl_info | 0.17 |
| tbl_lnkContactToCommandHost | 0.02 |
| tbl_lnkContactToCommandService | 0.02 |
| tbl_lnkContactToContactgroup | 0.02 |
| tbl_lnkContactToContacttemplate | 0.02 |
| tbl_lnkContactToVariabledefinition | 0.02 |
| tbl_lnkContactgroupToContact | 0.02 |
| tbl_lnkContactgroupToContactgroup | 0.02 |
| tbl_lnkContacttemplateToCommandHost | 0.02 |
| tbl_lnkContacttemplateToCommandService | 0.02 |
| tbl_lnkContacttemplateToContactgroup | 0.02 |
| tbl_lnkContacttemplateToContacttemplate | 0.02 |
| tbl_lnkContacttemplateToVariabledefinition | 0.02 |
| tbl_lnkHostToContact | 0.02 |
| tbl_lnkHostToContactgroup | 0.02 |
| tbl_lnkHostToHost | 0.14 |
| tbl_lnkHostToHostgroup | 0.19 |
| tbl_lnkHostToHosttemplate | 0.44 |
| tbl_lnkHostToVariabledefinition | 0.02 |
| tbl_lnkHostdependencyToHost_DH | 0.02 |
| tbl_lnkHostdependencyToHost_H | 0.02 |
| tbl_lnkHostdependencyToHostgroup_DH | 0.02 |
| tbl_lnkHostdependencyToHostgroup_H | 0.02 |
| tbl_lnkHostescalationToContact | 0.02 |
| tbl_lnkHostescalationToContactgroup | 0.02 |
| tbl_lnkHostescalationToHost | 0.02 |
| tbl_lnkHostescalationToHostgroup | 0.02 |
| tbl_lnkHostgroupToHost | 0.17 |
| tbl_lnkHostgroupToHostgroup | 0.02 |
| tbl_lnkHosttemplateToContact | 0.02 |
| tbl_lnkHosttemplateToContactgroup | 0.02 |
| tbl_lnkHosttemplateToHost | 0.02 |
| tbl_lnkHosttemplateToHostgroup | 0.02 |
| tbl_lnkHosttemplateToHosttemplate | 0.02 |
| tbl_lnkHosttemplateToVariabledefinition | 0.02 |
| tbl_lnkServiceToContact | 0.02 |
| tbl_lnkServiceToContactgroup | 0.05 |
| tbl_lnkServiceToHost | 0.52 |
| tbl_lnkServiceToHostgroup | 0.02 |
| tbl_lnkServiceToServicegroup | 0.02 |
| tbl_lnkServiceToServicetemplate | 1.52 |
| tbl_lnkServiceToVariabledefinition | 0.02 |
| tbl_lnkServicedependencyToHost_DH | 0.02 |
| tbl_lnkServicedependencyToHost_H | 0.02 |
| tbl_lnkServicedependencyToHostgroup_DH | 0.02 |
| tbl_lnkServicedependencyToHostgroup_H | 0.02 |
| tbl_lnkServicedependencyToService_DS | 0.02 |
| tbl_lnkServicedependencyToService_S | 0.02 |
| tbl_lnkServicedependencyToServicegroup_DS | 0.02 |
| tbl_lnkServicedependencyToServicegroup_S | 0.02 |
| tbl_lnkServiceescalationToContact | 0.02 |
| tbl_lnkServiceescalationToContactgroup | 0.02 |
| tbl_lnkServiceescalationToHost | 0.02 |
| tbl_lnkServiceescalationToHostgroup | 0.02 |
| tbl_lnkServiceescalationToService | 0.02 |
| tbl_lnkServiceescalationToServicegroup | 0.02 |
| tbl_lnkServicegroupToService | 0.02 |
| tbl_lnkServicegroupToServicegroup | 0.02 |
| tbl_lnkServicetemplateToContact | 0.02 |
| tbl_lnkServicetemplateToContactgroup | 0.02 |
| tbl_lnkServicetemplateToHost | 0.02 |
| tbl_lnkServicetemplateToHostgroup | 0.02 |
| tbl_lnkServicetemplateToServicegroup | 0.02 |
| tbl_lnkServicetemplateToServicetemplate | 0.02 |
| tbl_lnkServicetemplateToVariabledefinition | 0.02 |
| tbl_lnkTimeperiodToTimeperiod | 0.02 |
| tbl_logbook | 0.02 |
| tbl_mainmenu | 0.02 |
| tbl_permission | 0.02 |
| tbl_permission_inactive | 0.02 |
| tbl_service | 3.52 |
| tbl_servicedependency | 0.03 |
| tbl_serviceescalation | 0.03 |
| tbl_serviceextinfo | 0.03 |
| tbl_servicegroup | 0.03 |
| tbl_servicetemplate | 0.13 |
| tbl_session | 0.02 |
| tbl_session_locks | 0.02 |
| tbl_settings | 0.03 |
| tbl_submenu | 0.02 |
| tbl_timedefinition | 0.02 |
| tbl_timeperiod | 0.03 |
| tbl_user | 0.03 |
| tbl_variabledefinition | 0.08 |
| xi_auditlog | 2.50 |
| xi_auth_tokens | 0.11 |
| xi_cmp_trapdata | 0.03 |
| xi_cmp_trapdata_log | 0.03 |
| xi_commands | 0.27 |
| xi_eventqueue | 0.03 |
| xi_events | 1188.53 |
| xi_incidents | 0.02 |
| xi_meta | 22259.98 |
| xi_mibs | 0.05 |
| xi_options | 0.08 |
| xi_sessions | 0.22 |
| xi_sysstat | 0.03 |
| xi_usermeta | 4.92 |
| xi_users | 0.06 |
+--------------------------------------------+------------+
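With this many rows, the few large tables are easy to miss. As a rough sketch (the largest_tables helper is hypothetical, and it assumes the two-column layout produced by mysql --table as shown in this listing), the output can be piped through awk and sort to surface the biggest tables:

```shell
#!/bin/sh
# Read a mysql --table listing on stdin and print the N largest tables.
# Rows are expected to look like: | xi_meta | 22259.98 |
# largest_tables is a hypothetical helper name used only for this sketch.
largest_tables() {
    n="${1:-5}"
    awk -F'|' '
        NF >= 4 && $3 + 0 > 0 {      # skips borders, the header row and 0.00 rows
            gsub(/ /, "", $2)        # trim padding around the table name
            printf "%.2f %s\n", $3, $2
        }' | sort -rn | head -n "$n"
}

# Sample rows copied from the listing in this post.
largest_tables 3 <<'EOF'
+--------------------------------------------+------------+
| Table | Size in MB |
+--------------------------------------------+------------+
| nagios_commenthistory | 4692.00 |
| nagios_logentries | 1768.48 |
| nagios_statehistory | 1687.42 |
| xi_events | 1188.53 |
| xi_meta | 22259.98 |
| xi_users | 0.06 |
+--------------------------------------------+------------+
EOF
```

For the sample rows this lists xi_meta first (22259.98 MB, roughly 22 GB), followed by nagios_commenthistory and nagios_logentries.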
vtrac wrote:
Please upload or PM me the "profile.zip".
I also uploaded the profile file via PM.
Re: Nagios XI Crash multiple time per week
Posted: Sat Apr 10, 2021 9:07 am
by bennyboy
I converted all Nagios tables to InnoDB, and this morning the DB was not corrupted. But Nagios XI got stuck, and I traced it to the script that backs up the DB:
Code: Select all
# Backup MySQL & PostgreSQL Databases
0 7 * * * root /root/scripts/automysqlbackup
It has these options:
Code: Select all
OPT="--quote-names --opt" # OPT string for use with mysqldump ( see man mysqldump )
I found in the mysqldump manual that the --opt option will lock the tables.
So I read the man page and found an alternative that avoids the lock.
I decided to run a test with it instead of --opt.
These are the options I added to the script:
Code: Select all
--single-transaction --skip-lock-tables --add-drop-table --add-locks --create-options --disable-keys --extended-insert --quick --set-charset --quote-names
I will update after the test...
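For context, the mysqldump manual describes --opt as shorthand for --add-drop-table --add-locks --create-options --disable-keys --extended-insert --lock-tables --quick --set-charset, so the list above is --opt written out by hand with --lock-tables swapped for --single-transaction and --skip-lock-tables. Note that --single-transaction only gives a consistent dump for transactional engines such as InnoDB. A minimal sketch of how the OPT string could be assembled and checked before touching the real backup script (dump_command and DUMP_DIR are hypothetical names, and the command is only printed, not run):

```shell
#!/bin/sh
# Options from the post above: --opt expanded by hand, with --lock-tables
# replaced by --single-transaction and --skip-lock-tables.
OPT="--single-transaction --skip-lock-tables --add-drop-table --add-locks \
--create-options --disable-keys --extended-insert --quick --set-charset --quote-names"

# Print (do not run) the mysqldump command for one database.
# DUMP_DIR is a hypothetical destination, not a path from this thread.
dump_command() {
    db="$1"
    printf 'mysqldump %s --databases %s > %s/%s.sql\n' \
        "$OPT" "$db" "${DUMP_DIR:-/tmp}" "$db"
}

dump_command nagiosxi
```

Printing the command first makes it easy to confirm that no --opt or --lock-tables has slipped back in before the cron job runs it.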
Re: Nagios XI Crash multiple time per week
Posted: Sat Apr 10, 2021 9:53 am
by bennyboy
So during the backup we are now able to access Nagios XI.
Code: Select all
root 28891 16.5 0.0 126916 3476 pts/0 D+ 10:39 1:55 mysqldump --user=root --password=x xxxxxxxxxxxxxxxxxxxxxxx --host=localhost --single-transaction --skip-lock-tables --add-drop-table --add-locks --create-options --disable-keys --extended-insert --quick --set-charset --quote-names --databases nagiosxi
Last time I ran the script with --opt, Nagios XI was stuck.
I also changed the backup time from 7 am to 2 am, when there is less traffic.
Do you have any other advice before we do the upgrade next week?
Thx!
Re: Nagios XI Crash multiple time per week
Posted: Mon Apr 12, 2021 1:32 pm
by vtrac
Hi,
I did not see the "profile.zip" in my inbox, but it looks like I don't need that file now since you have resolved the DB backup issue. I will take note of the OPT (options) you used for future support cases.
Wonderful job!! ...
If this is a VM, I would recommend that you shut down the instance and take a GOOD snapshot of the VM before doing the upgrade.
If taking a snapshot is not an option, then I would recommend that you do a full backup first:
https://assets.nagios.com/downloads/nag ... ios-XI.pdf
Good luck with the upgrade!!
https://assets.nagios.com/downloads/nag ... ctions.pdf
Best Regards,
Vinh
Re: Nagios XI Crash multiple time per week
Posted: Tue Apr 13, 2021 10:08 am
by bennyboy
vtrac wrote:Hi,
I did not see the "profile.zip" in my inbox, but it looks like I don't need that file now since you have resolved the DB backup issue. I will take note of the OPT (options) you used for future support cases.
Wonderful job!! ...
If this is a VM, I would recommend that you shut down the instance and take a GOOD snapshot of the VM before doing the upgrade.
If taking a snapshot is not an option, then I would recommend that you do a full backup first:
https://assets.nagios.com/downloads/nag ... ios-XI.pdf
Good luck with the upgrade!!
https://assets.nagios.com/downloads/nag ... ctions.pdf
Best Regards,
Vinh
I resent it. Can you confirm that you received it, please?
Re: Nagios XI Crash multiple time per week
Posted: Tue Apr 13, 2021 10:45 am
by vtrac
Hi,
Yes, I did receive it.
There is a "warning" in your database's log, which is related to your changes:
Code: Select all
210409 11:57:20 [Warning] options --log-slow-admin-statements, --log-queries-not-using-indexes and --log-slow-slave-statements have no effect if --log_slow_queries is not set
I noticed you have some "passive" checks that have no service defined.
You can go to "Admin > Monitoring Config > Unconfigured Objects" and configure them:
Code: Select all
Apr 10 09:49:12 slpmon0034 nagios: Error: Got check result for service 'DBA_Alerte_passive_oracle' on host 'sxqgbd0798'. Unable to find service
Apr 10 09:49:48 slpmon0034 nagios: Error: Got check result for service 'passive-check-script' on host 'ctelpsa246'. Unable to find service
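If there are many of these messages, it may help to reduce them to a unique list of host and service pairs before configuring them. This is a sketch only: it matches the exact message wording shown in the two lines above, and unconfigured_services is a hypothetical helper name.

```shell
#!/bin/sh
# Read Nagios log lines on stdin and print unique "host service" pairs for
# passive results that arrived for a service that is not defined.
# The pattern follows the wording of the log lines quoted in this post.
unconfigured_services() {
    sed -n "s/.*Got check result for service '\([^']*\)' on host '\([^']*\)'\. *Unable to find service.*/\2 \1/p" | sort -u
}

# Sample lines copied from the log excerpt above.
unconfigured_services <<'EOF'
Apr 10 09:49:12 slpmon0034 nagios: Error: Got check result for service 'DBA_Alerte_passive_oracle' on host 'sxqgbd0798'. Unable to find service
Apr 10 09:49:48 slpmon0034 nagios: Error: Got check result for service 'passive-check-script' on host 'ctelpsa246'. Unable to find service
EOF
```

On a live system the same function could read from whichever log file these lines came from instead of the inline sample.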
The "nagios.txt" log file is empty.
Best Regards,
Vinh