Re: NDO-3 problem
Posted: Thu Oct 14, 2021 1:08 pm
by cbeattie-unitrends
That seems to have done the trick. I haven't seen any complaints from NDO-3 in nagios.log since recreating the nagios_notifications table.
Code:
[root@********~]# mysql -h ******** -uroot -p'********' -e "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | grep -E 'nagios_notifications|xi_(event|meta)'
nagios_notifications 0.00
xi_eventqueue 0.03
xi_events 22.55
xi_meta 362.83
[root@********~]# mysql -h ******** -uroot -p'********' -e "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | grep -E 'nagios_notifications|xi_(event|meta)'
nagios_notifications 0.42
xi_eventqueue 0.03
xi_events 17.55
xi_meta 290.77
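For anyone repeating this comparison, the two listings can be diffed mechanically rather than by eye. The following is a minimal sketch, not part of the original troubleshooting: `table_size_delta` is a hypothetical helper name, and it assumes each run's output was saved to a file with one `table_name size` pair per line, as in the listings above.

```shell
# Hypothetical helper: print the per-table size change (in MB) between two
# saved listings. $1 is the earlier listing, $2 is the later one.
table_size_delta() {
    awk 'NR == FNR { before[$1] = $2; next }   # first file: remember each size
         ($1 in before) { printf "%s %+.2f\n", $1, $2 - before[$1] }' "$1" "$2"
}

# Usage (file names are placeholders):
#   table_size_delta before.txt after.txt
```

Run against the two listings above, this reports xi_events at -5.00 and xi_meta at -72.06, which matches the shrinkage visible between the runs.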
Additionally, the temp tables appear to have been cleaned up between runs. I think that's got my problem sorted out.
Thank you.
Re: NDO-3 problem
Posted: Thu Oct 14, 2021 4:41 pm
by ssax
That's great to hear! Keep an eye on it and then let us know when we're okay to lock this up and mark it as resolved.
Thank you!
Re: NDO-3 problem
Posted: Mon Oct 25, 2021 8:30 am
by cbeattie-unitrends
Hello,
It looks like another one of the database tables has disappeared. Over the weekend, the Nagios instance with the external database (the same one we just fixed) stopped. I looked at the log file and it appears to have lost connection to the external database:
Code:
[1635128468] GLOBAL SERVICE EVENT HANDLER: *****;procs;WARNING;HARD;1;xi_service_event_handler
[1635128469] wproc: GLOBAL SERVICE EVENTHANDLER job 70290 from worker Core Worker 56661 is a non-check helper but exited with return code 1
[1635128469] wproc: early_timeout=0; exited_ok=1; wait_status=256; error_code=0;
[1635128469] wproc: stdout line 01: UNABLE TO CONNECT TO DB - EXITING!
[1635128469] wproc: GLOBAL SERVICE EVENTHANDLER job 70290 from worker Core Worker 56662 is a non-check helper but exited with return code 1
[1635128469] wproc: early_timeout=0; exited_ok=1; wait_status=256; error_code=0;
[1635128469] wproc: stdout line 01: UNABLE TO CONNECT TO DB - EXITING!
[1635128469] NDO-3: Unable to prepare statement for query (3): Lost connection to MySQL server during query
[1635128469] NDO-3: Unable to prepare statement for query (4): MySQL server has gone away
[1635128469] NDO-3: Unable to prepare statement for query (5): MySQL server has gone away
...
[1635128469] NDO-3: Unable to prepare statement for query (37): MySQL server has gone away
[1635128469] NDO-3: Error preparing statements
[1635128469] Caught SIGSEGV, shutting down...
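The bracketed values in nagios.log are Unix epoch seconds, which makes it awkward to line the crash up with the database server's own logs. Below is a small sketch for converting them. It assumes GNU `date` (for `-d @SECONDS`), and `nagios_log_utc` is a hypothetical helper name, not a Nagios tool.

```shell
# Hypothetical helper: rewrite leading "[epoch]" timestamps as UTC date-times.
# Lines that do not start with "[digits] " are passed through unchanged.
nagios_log_utc() {
    while IFS= read -r line; do
        # Extract the epoch value only if the line starts with "[digits] ".
        ts=$(printf '%s\n' "$line" | sed -n 's/^\[\([0-9][0-9]*\)\] .*/\1/p')
        if [ -n "$ts" ]; then
            # "${line#*] }" drops the original "[epoch] " prefix.
            printf '[%s] %s\n' "$(date -u -d "@$ts" '+%Y-%m-%d %H:%M:%S UTC')" "${line#*] }"
        else
            printf '%s\n' "$line"
        fi
    done
}

# Usage: pipe a saved log excerpt through the function, e.g.
#   nagios_log_utc < excerpt.log
```

Piping the excerpt above through it places the first NDO-3 error at 2021-10-25 02:21:09 UTC.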
Attempting to restart Nagios while tailing the log file showed this:
Code:
[1635167038] NDO-3: Started statechange thread
[1635167038] NDO-3: Started notification thread
[1635167038] NDO-3: Unable to prepare statement for query (5): Table 'nagios.nagios_logentries' doesn't exist
[1635167038] NDO-3: Unable to prepare statement for query (5): Table 'nagios.nagios_logentries' doesn't exist
[1635167038] NDO-3: Unable to prepare statement for query (5): Table 'nagios.nagios_logentries' doesn't exist
[1635167038] NDO-3: Unable to prepare statement for query (5): Table 'nagios.nagios_logentries' doesn't exist
[1635167038] NDO-3: Unable to prepare statement for query (5): Table 'nagios.nagios_logentries' doesn't exist
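When the log is long, the distinct tables that NDO-3 reports as missing can be pulled out with a short filter. This is only a sketch keyed to the exact message format shown above. `ndo_missing_tables` is a hypothetical name, and the log path in the usage comment is an assumption about a default install.

```shell
# Hypothetical helper: read nagios.log lines on stdin and print each distinct
# table that NDO-3 reported as nonexistent, one schema.table per line.
ndo_missing_tables() {
    sed -n "s/.*NDO-3: .*Table '\([^']*\)' doesn't exist.*/\1/p" | sort -u
}

# Usage (path is an assumption; adjust to the local install):
#   ndo_missing_tables < /usr/local/nagios/var/nagios.log
```

For the lines above it prints a single entry, nagios.nagios_logentries, however many times the error repeats.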
I confirmed that the nagios_logentries table was indeed missing:
Code:
# mysql -h ***** -u***** -p'*****' -e "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | grep -E 'nagios_(logentries|notifications)|xi_(event|meta)'
nagios_notifications 369.86
xi_eventqueue 0.03
xi_events 224.77
xi_meta 3794.98
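Because `grep` silently omits a table that isn't there, a missing table is easy to overlook in output like this. A hedged sketch of a more explicit check follows. `report_missing_tables` is a hypothetical helper that reads the `table_name size` listing on standard input and takes the expected table names as arguments.

```shell
# Hypothetical helper: print "MISSING: <name>" for every expected table that
# does not appear in the first column of the listing read from stdin.
report_missing_tables() {
    listing=$(cat)   # capture stdin once so it can be scanned per table
    for t in "$@"; do
        if ! printf '%s\n' "$listing" |
             awk -v t="$t" '$1 == t { found = 1 } END { exit !found }'; then
            printf 'MISSING: %s\n' "$t"
        fi
    done
}

# Usage: pipe the mysql query output into the function, e.g.
#   mysql ... -e "SELECT ..." | report_missing_tables nagios_logentries nagios_notifications
```

Fed the four lines above with nagios_logentries among the expected names, it prints `MISSING: nagios_logentries` and nothing for the tables that are present.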
On top of that, I'm concerned that the xi_events and xi_meta tables still aren't being trimmed properly.
Thanks.
Re: NDO-3 problem
Posted: Mon Oct 25, 2021 2:58 pm
by ssax
Please create a ticket for this and include a link back to this forum thread so we can get a remote session set up:
https://support.nagios.com/tickets/
Attach a fresh copy of your profile.zip in that new ticket as well.
EDIT: I would check your DB server for filesystem issues; the tables shouldn't go missing randomly unless there are memory/disk issues on the DB server.
Thank you!