I don't think I have any crashed tables. I looked at /var/log/mariadb/mariadb.log, and although grepping for "crash" didn't return anything, the rest of the contents weren't what I expected. The pattern seems to be that the warning gets repeated a number of times, followed by the monitor output:
2021-10-12 13:47:54 7365508 [Warning] InnoDB: Over 67 percent of the buffer pool is occupied by lock heaps or the adaptive hash index! Check that your transactions do not set too many row locks. innodb_buffer_pool_size=128M. Starting the InnoDB Monitor to print diagnostics.
=====================================
2021-10-12 13:48:04 0x7fe2f6546700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 47 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 936964 srv_active, 0 srv_shutdown, 179 srv_idle
srv_master_thread log flush and writes: 937141
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 2011986257
OS WAIT ARRAY INFO: signal count 3328546024
RW-shared spins 24329468616, rounds 167038144395, OS waits 655528858
RW-excl spins 917673661, rounds 9405051947, OS waits 105404570
RW-sx spins 29169304, rounds 513076929, OS waits 8575783
Spin rounds per wait: 6.87 RW-shared, 10.25 RW-excl, 17.59 RW-sx
FAIL TO OBTAIN LOCK MUTEX, SKIP LOCK INFO PRINTING
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 0; buffer pool: 0
55271980514 OS file reads, 1088887784 OS file writes, 341751569 OS fsyncs
1 pending reads, 0 pending writes
58595.94 reads/s, 16383 avg bytes/read, 648.88 writes/s, 345.04 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 8, free list len 2426, seg size 2435, 19612541 merges
merged operations:
insert 83963947, delete mark 299194323, delete 7328246
discarded operations:
insert 51392, delete mark 1879, delete 76
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 16 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 8 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 3 buffer(s)
322536.58 hash searches/s, 76575.65 non-hash searches/s
---
LOG
---
Log sequence number 23336846891287
Log flushed up to 23336846891287
Pages flushed up to 23336846546811
Last checkpoint at 23336844674792
0 pending log flushes, 0 pending chkp writes
230604756 log i/o's done, 288.43 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 170590208
Dictionary memory allocated 457776
Buffer pool size 8192
Free buffers 0
Database pages 2239
Old database pages 835
Modified db pages 134
Percent of dirty pages(LRU & free pages): 5.982
Max dirty pages percent: 75.000
Pending reads 2
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 9451354, not young 1264073857478
11.43 youngs/s, 2066569.39 non-youngs/s
Pages read 55271620385, created 468481754, written 806957859
58596.52 reads/s, 9.70 creates/s, 336.16 writes/s
Buffer pool hit rate 976 / 1000, young-making rate 0 / 1000 not 843 / 1000
Pages read ahead 920.92/s, evicted without access 857.71/s, Random read ahead 0.00/s
LRU len: 2239, unzip_LRU len: 0
I/O sum[4582394]:cur[7981], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
14 read views open inside InnoDB
Process ID=1434, Main thread ID=140612746884864, state: sleeping
Number of rows inserted 93580553, updated 99124466, deleted 78924030, read 1287409090672
132.12 inserts/s, 117.34 updates/s, 10.36 deletes/s, 2028622.73 reads/s
Number of system rows inserted 0, updated 0, deleted 0, read 0
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
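As a rough sanity check on the warning, the numbers in the BUFFER POOL AND MEMORY section can be put together by hand. This is only a sketch: it assumes the default 16 KiB InnoDB page size (not confirmed for this server, although 8192 pages at 16 KiB does equal the reported innodb_buffer_pool_size=128M), and the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Percentage of buffer pool pages that are neither free nor database pages.
# Arguments: total pages, free pages, database pages (from the monitor output).
non_data_pct() {
    awk -v total="$1" -v free="$2" -v db="$3" \
        'BEGIN { printf "%.1f\n", (total - free - db) * 100 / total }'
}

# Pool size in MiB, ASSUMING 16 KiB pages (check innodb_page_size to be sure).
pool_mib() {
    echo $(( $1 * 16 / 1024 ))
}

non_data_pct 8192 0 2239   # prints 72.7
pool_mib 8192              # prints 128
```

With "Buffer pool size 8192", "Free buffers 0" and "Database pages 2239", roughly 73 percent of the pool is holding something other than data pages, which is consistent with the "Over 67 percent" message.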
Here's the output from the first command:
+--------------------------------------------+------------+
| Table | Size in MB |
+--------------------------------------------+------------+
| nagios_acknowledgements | 0.44 |
| nagios_commands | 0.06 |
| nagios_commenthistory | 1091.58 |
| nagios_comments | 3.67 |
| nagios_configfiles | 0.03 |
| nagios_configfilevariables | 0.02 |
| nagios_conninfo | 0.02 |
| nagios_contact_addresses | 0.03 |
| nagios_contact_notificationcommands | 0.03 |
| nagios_contactgroup_members | 0.03 |
| nagios_contactgroups | 0.03 |
| nagios_contactnotificationmethods | 1.91 |
| nagios_contactnotifications | 1.05 |
| nagios_contacts | 0.03 |
| nagios_contactstatus | 0.03 |
| nagios_customvariables | 0.63 |
| nagios_customvariablestatus | 0.63 |
| nagios_dbversion | 0.02 |
| nagios_downtimehistory | 20.03 |
| nagios_eventhandlers | 4.30 |
| nagios_externalcommands | 0.05 |
| nagios_flappinghistory | 450.88 |
| nagios_host_contactgroups | 0.17 |
| nagios_host_contacts | 0.03 |
| nagios_host_parenthosts | 0.03 |
| nagios_hostchecks | 1.67 |
| nagios_hostdependencies | 0.03 |
| nagios_hostescalation_contactgroups | 0.30 |
| nagios_hostescalation_contacts | 0.16 |
| nagios_hostescalations | 0.42 |
| nagios_hostgroup_members | 0.17 |
| nagios_hostgroups | 0.03 |
| nagios_hosts | 1.63 |
| nagios_hoststatus | 2.48 |
| nagios_instances | 0.02 |
| nagios_objects | 30.61 |
| nagios_processevents | 0.38 |
| nagios_programstatus | 0.03 |
| nagios_runtimevariables | 0.03 |
| nagios_scheduleddowntime | 0.03 |
| nagios_service_contactgroups | 3.03 |
| nagios_service_contacts | 4.03 |
| nagios_service_parentservices | 0.03 |
| nagios_servicechecks | 13.06 |
| nagios_servicedependencies | 0.03 |
| nagios_serviceescalation_contactgroups | 3.03 |
| nagios_serviceescalation_contacts | 0.03 |
| nagios_serviceescalations | 4.03 |
| nagios_servicegroup_members | 0.27 |
| nagios_servicegroups | 0.03 |
| nagios_services | 24.58 |
| nagios_servicestatus | 66.59 |
| nagios_statehistory | 395.83 |
| nagios_systemcommands | 0.05 |
| nagios_timedeventqueue | 0.09 |
| nagios_timedevents | 0.09 |
| nagios_timeperiod_timeranges | 0.03 |
| nagios_timeperiods | 0.03 |
| tbl_command | 0.06 |
| tbl_contact | 0.03 |
| tbl_contactgroup | 0.03 |
| tbl_contacttemplate | 0.03 |
| tbl_domain | 0.03 |
| tbl_host | 0.50 |
| tbl_hostdependency | 0.03 |
| tbl_hostescalation | 0.03 |
| tbl_hostextinfo | 0.03 |
| tbl_hostgroup | 0.03 |
| tbl_hosttemplate | 0.03 |
| tbl_info | 0.17 |
| tbl_lnkContactToCommandHost | 0.02 |
| tbl_lnkContactToCommandService | 0.02 |
| tbl_lnkContactToContactgroup | 0.02 |
| tbl_lnkContactToContacttemplate | 0.02 |
| tbl_lnkContactToVariabledefinition | 0.02 |
| tbl_lnkContactgroupToContact | 0.02 |
| tbl_lnkContactgroupToContactgroup | 0.02 |
| tbl_lnkContacttemplateToCommandHost | 0.02 |
| tbl_lnkContacttemplateToCommandService | 0.02 |
| tbl_lnkContacttemplateToContactgroup | 0.02 |
| tbl_lnkContacttemplateToContacttemplate | 0.02 |
| tbl_lnkContacttemplateToVariabledefinition | 0.02 |
| tbl_lnkHostToContact | 0.02 |
| tbl_lnkHostToContactgroup | 0.02 |
| tbl_lnkHostToHost | 0.02 |
| tbl_lnkHostToHostgroup | 0.09 |
| tbl_lnkHostToHosttemplate | 0.11 |
| tbl_lnkHostToVariabledefinition | 0.02 |
| tbl_lnkHostdependencyToHost_DH | 0.02 |
| tbl_lnkHostdependencyToHost_H | 0.02 |
| tbl_lnkHostdependencyToHostgroup_DH | 0.02 |
| tbl_lnkHostdependencyToHostgroup_H | 0.02 |
| tbl_lnkHostescalationToContact | 0.02 |
| tbl_lnkHostescalationToContactgroup | 0.02 |
| tbl_lnkHostescalationToHost | 0.02 |
| tbl_lnkHostescalationToHostgroup | 0.02 |
| tbl_lnkHostgroupToHost | 0.02 |
| tbl_lnkHostgroupToHostgroup | 0.02 |
| tbl_lnkHosttemplateToContact | 0.02 |
| tbl_lnkHosttemplateToContactgroup | 0.02 |
| tbl_lnkHosttemplateToHost | 0.02 |
| tbl_lnkHosttemplateToHostgroup | 0.02 |
| tbl_lnkHosttemplateToHosttemplate | 0.02 |
| tbl_lnkHosttemplateToVariabledefinition | 0.02 |
| tbl_lnkServiceToContact | 0.02 |
| tbl_lnkServiceToContactgroup | 0.02 |
| tbl_lnkServiceToHost | 0.05 |
| tbl_lnkServiceToHostgroup | 0.02 |
| tbl_lnkServiceToServicegroup | 0.02 |
| tbl_lnkServiceToServicetemplate | 0.06 |
| tbl_lnkServiceToVariabledefinition | 0.02 |
| tbl_lnkServicedependencyToHost_DH | 0.02 |
| tbl_lnkServicedependencyToHost_H | 0.02 |
| tbl_lnkServicedependencyToHostgroup_DH | 0.02 |
| tbl_lnkServicedependencyToHostgroup_H | 0.02 |
| tbl_lnkServicedependencyToService_DS | 0.02 |
| tbl_lnkServicedependencyToService_S | 0.02 |
| tbl_lnkServicedependencyToServicegroup_DS | 0.02 |
| tbl_lnkServicedependencyToServicegroup_S | 0.02 |
| tbl_lnkServiceescalationToContact | 0.02 |
| tbl_lnkServiceescalationToContactgroup | 0.02 |
| tbl_lnkServiceescalationToHost | 0.02 |
| tbl_lnkServiceescalationToHostgroup | 0.02 |
| tbl_lnkServiceescalationToService | 0.02 |
| tbl_lnkServiceescalationToServicegroup | 0.02 |
| tbl_lnkServicegroupToService | 0.02 |
| tbl_lnkServicegroupToServicegroup | 0.02 |
| tbl_lnkServicetemplateToContact | 0.02 |
| tbl_lnkServicetemplateToContactgroup | 0.02 |
| tbl_lnkServicetemplateToHost | 0.02 |
| tbl_lnkServicetemplateToHostgroup | 0.02 |
| tbl_lnkServicetemplateToServicegroup | 0.02 |
| tbl_lnkServicetemplateToServicetemplate | 0.02 |
| tbl_lnkServicetemplateToVariabledefinition | 0.02 |
| tbl_lnkTimeperiodToTimeperiod | 0.02 |
| tbl_logbook | 0.02 |
| tbl_mainmenu | 0.02 |
| tbl_permission | 0.02 |
| tbl_permission_inactive | 0.02 |
| tbl_service | 0.13 |
| tbl_servicedependency | 0.03 |
| tbl_serviceescalation | 0.03 |
| tbl_serviceextinfo | 0.03 |
| tbl_servicegroup | 0.03 |
| tbl_servicetemplate | 0.03 |
| tbl_session | 0.02 |
| tbl_session_locks | 0.02 |
| tbl_settings | 0.03 |
| tbl_submenu | 0.02 |
| tbl_timedefinition | 0.02 |
| tbl_timeperiod | 0.03 |
| tbl_user | 0.03 |
| tbl_variabledefinition | 0.02 |
| xi_auditlog | 162.72 |
| xi_auth_tokens | 0.03 |
| xi_cmp_ccm_backups | 0.02 |
| xi_cmp_favorites | 0.03 |
| xi_cmp_nagiosbpi_backups | 1.50 |
| xi_cmp_scheduledreports_log | 0.02 |
| xi_cmp_trapdata | 0.50 |
| xi_cmp_trapdata_log | 0.03 |
| xi_commands | 0.02 |
| xi_deploy_agents | 0.02 |
| xi_deploy_jobs | 0.02 |
| xi_eventqueue | 0.03 |
| xi_events | 1209.45 |
| xi_incidents | 0.02 |
| xi_meta | 20400.00 |
| xi_mibs | 0.05 |
| xi_options | 0.06 |
| xi_sessions | 0.03 |
| xi_sysstat | 0.03 |
| xi_usermeta | 0.17 |
| xi_users | 0.03 |
+--------------------------------------------+------------+
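With this many rows it is easy to miss the big ones, so here is a small sketch for ranking the table output by size. It assumes the output above has been saved or piped in exactly as shown; top_tables is a hypothetical helper name, not part of any Nagios or MariaDB tooling.

```shell
#!/usr/bin/env bash
# Read a mysql-style table ("| name | size |") on stdin and print the N
# largest rows as "size<TAB>name", largest first. N defaults to 5.
# Border lines and the header row are skipped because their third field
# is not a plain number.
top_tables() {
    awk -F'|' 'NF >= 3 && $3 ~ /^ *[0-9.]+ *$/ {
        gsub(/ /, "", $2); gsub(/ /, "", $3)
        printf "%s\t%s\n", $3, $2
    }' | sort -rn | head -n "${1:-5}"
}

# Example with four rows copied from the output above.
top_tables 3 <<'EOF'
| nagios_commenthistory | 1091.58 |
| xi_events | 1209.45 |
| xi_meta | 20400.00 |
| xi_mibs | 0.05 |
EOF
```

On the full listing this puts xi_meta (20400.00 MB) at the top, followed by xi_events and nagios_commenthistory.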
nagios_notifications does indeed appear to be missing, and the files are gone too:
[root@******** ~]# ls -lh /var/lib/mysql/nagios | grep notifications
-rw-rw----. 1 mysql mysql 2.6K Oct 12 13:56 nagios_contactnotifications.frm
-rw-rw----. 1 mysql mysql 7.0M Oct 12 13:56 nagios_contactnotifications.ibd
I have no idea how that could have happened.
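Note that grepping for "notifications" also matches nagios_contactnotifications, which is the only thing the listing above shows. A stricter check would match the table name exactly. The following is a minimal sketch; table_files is a hypothetical helper, and it assumes one set of name.* files per table in the database directory.

```shell
#!/usr/bin/env bash
# Print the data files that belong to exactly one table (name.frm, name.ibd, ...).
# Returns non-zero when no such file exists in the directory.
table_files() {
    local dir="$1" table="$2" found=1 f
    for f in "$dir/$table".*; do
        [ -e "$f" ] || continue     # unmatched glob stays literal; skip it
        printf '%s\n' "$f"
        found=0
    done
    return "$found"
}
```

For example, `table_files /var/lib/mysql/nagios nagios_notifications || echo "no files found"` would print only files for that exact table, or the message if there are none.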
Those do appear to be the correct numbers for hosts and services, at least.
/var/nagiosramdisk does get full when Nagios isn't processing passive check results fast enough. There are about 4000 checks that are stale right now:
[nagios@********~]$ du -sh /var/nagiosramdisk/*
16K /var/nagiosramdisk/host-perfdata
71M /var/nagiosramdisk/objects.cache
184K /var/nagiosramdisk/service-perfdata
464M /var/nagiosramdisk/spool
106M /var/nagiosramdisk/status.dat
0 /var/nagiosramdisk/tmp
I have a couple of hourly cron jobs to help keep it from getting completely full when things get slow. I figured it was better to have stale checks than to run out of space on the filesystem:
/bin/find /var/nagiosramdisk/spool/perfdata -type f -mmin +60 -delete
/bin/find /var/nagiosramdisk/spool/checkresults -type f -mmin +60 -delete
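The two cron commands above could also be wrapped in a small function that logs what it removes, which makes it easier to see afterwards how many results were dropped. This is just a sketch under the same assumptions as the cron jobs (GNU find, a 60-minute cutoff); prune_old is a made-up name.

```shell
#!/usr/bin/env bash
# Delete regular files under DIR that were last modified more than MINUTES
# minutes ago. Each removed path is printed so cron mail or a log file shows
# what was dropped. Requires GNU find for -mmin and -delete.
prune_old() {
    local dir="$1" minutes="${2:-60}"
    [ -d "$dir" ] || return 0    # nothing to do if the directory is absent
    find "$dir" -type f -mmin +"$minutes" -print -delete
}
```

It would be called once per spool directory, for example `prune_old /var/nagiosramdisk/spool/checkresults 60`, and piping the output through `wc -l` gives a count of discarded files.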