Avoid insert host/service macros into DB

jmsanesteban.sgre
Posts: 51
Joined: Thu Apr 23, 2020 6:46 am

Post by jmsanesteban.sgre »

Hi community :)

First, some data:
OS: RHEL 7.9
Linux: 3.10.0-1160.31.1.el7.x86_64
NagiosXI: manual installation
DB: Offloaded using official procedure
Jumbo frames: Enabled between frontend and backend

We have an internal ticket open for a DB insertion problem with NDOUtils: the IPCS message queue keeps growing and, because of how the Nagios XI interface works, a delay appears in the system, as you can see in the attached screenshot "Delay.png":
Delay.png
We use host and service macros only to store information such as SNMP community, creation date, SNMP port and so on. We've noticed that this static data generates a huge number of DB inserts, and those thousands of inserts affect the whole system because they seem to be maxing out the connection between the Nagios frontend and the offloaded backend.

You can see the problem in the current ipcs message counts in the attached screenshot "ipcs_messages.png":
ipcs_messages.png
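For anyone who cannot see the screenshot, the queue backlog can also be tracked in text form. The following is only a sketch: `summarize_ipcs_queues` is a hypothetical helper name, and it assumes the usual Linux `ipcs -q` column layout (key, msqid, owner, perms, used-bytes, messages).

```shell
#!/bin/bash
# Hypothetical helper: total up the message queues listed by `ipcs -q`.
# Assumes the Linux column order: key msqid owner perms used-bytes messages.
summarize_ipcs_queues() {
    # Only data rows start with a hex key such as 0x6c010002;
    # the banner and header lines are skipped.
    awk '$1 ~ /^0x/ { queues++; bytes += $5; msgs += $6 }
         END { printf "queues=%d used_bytes=%d messages=%d\n", queues, bytes, msgs }'
}

# Usage (on the XI server):
#   ipcs -q | summarize_ipcs_queues
```

Running this every minute or so should show whether the backlog is growing or draining.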
We've tried modifying ndomod.cfg to avoid inserting all the unnecessary data, without success. We need to know how to reduce the impact of inserting custom macro values: there is no need to insert all of them every 5 minutes or so; once per day should be enough, and that way the system would return to its normal behavior.

BR,
Juanma
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Avoid insert host/service macros into DB

Post by ssax »

You may need to move the DB back local if the kernel message queue isn't able to process fast enough and keeps filling up.

Please PM me a copy of your profile, you can download it from Admin > System Profile by clicking the Download Profile button.

Send the output of these commands as root from your XI server:

Code: Select all

sysctl -p
ulimit -a
su -s /bin/bash -c 'ulimit -a' nagios
And from the offloaded DB:

Code: Select all

ulimit -a
su -s /bin/bash -c 'ulimit -a' mysql
grep -R max_connections /etc/my*
Additionally, please send the output of these commands:
- NOTE: You may need to adjust the -h 127.0.0.1, the -uroot, and -pnagiosxi in the first command if your DB is offloaded to another server and/or you've changed the root mysql password

Code: Select all

echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
This next command may fail, that's okay, not all systems run postgresql, send the output anyways:

Code: Select all

echo "SELECT relname as Table, pg_size_pretty(pg_total_relation_size(relid)) As Size, pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as ExternalSize FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;" | psql nagiosxi nagiosxi
jmsanesteban.sgre
Posts: 51
Joined: Thu Apr 23, 2020 6:46 am

Re: Avoid insert host/service macros into DB

Post by jmsanesteban.sgre »

Thanks for your comments.

I've sent the commands to the DB team so I will give you the result ASAP

sysctl -p

Code: Select all

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv6.conf.default.accept_ra = 0
net.ipv6.conf.default.accept_redirects = 0
net.ipv6.conf.all.accept_ra = 0
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.lo.accept_ra = 0
net.ipv6.conf.lo.accept_redirects = 0
net.ipv4.ip_forward = 0
net.ipv4.route.flush = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.route.flush = 1
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.route.flush = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.route.flush = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.log_martians = 1
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_syncookies = 1
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
net.ipv6.route.flush = 1
fs.suid_dumpable = 0
kernel.randomize_va_space = 2
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.log_martians = 1
kernel.msgmnb = 796432000
kernel.msgmax = 796432000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.msgmni = 32768
[root@server]# ulimit -a

Code: Select all

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 514928
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 10000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 514928
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
[root@server]# su -s /bin/bash -c 'ulimit -a' nagios

Code: Select all

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 514928
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 10000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
I cannot run the query as root, but I can run it with other users:

Nagios

Code: Select all

+----------------------------------------+------------+
| Table                                  | Size in MB |
+----------------------------------------+------------+
| SAIM_comments_view                     |       NULL | (views used to export data to other tools)
| SAIM_old_API_Values_view               |       NULL |
| SAIM_old_view                          |       NULL |
| SAIM_view                              |       NULL |
| nagios_acknowledgements                |       0.34 |
| nagios_commands                        |       0.06 |
| nagios_commenthistory                  |      26.06 |
| nagios_comments                        |       0.31 |
| nagios_configfiles                     |       0.03 |
| nagios_configfilevariables             |       0.02 |
| nagios_conninfo                        |       0.19 |
| nagios_contact_addresses               |       0.03 |
| nagios_contact_notificationcommands    |       0.06 |
| nagios_contactgroup_members            |       0.03 |
| nagios_contactgroups                   |       0.03 |
| nagios_contactnotificationmethods      |       0.14 |
| nagios_contactnotifications            |       0.16 |
| nagios_contacts                        |       0.03 |
| nagios_contactstatus                   |       0.03 |
| nagios_customvariables                 |       2.44 |
| nagios_customvariablestatus            |       0.05 |
| nagios_dbversion                       |       0.02 |
| nagios_downtimehistory                 |       0.03 |
| nagios_eventhandlers                   |       0.06 |
| nagios_externalcommands                |       0.08 |
| nagios_flappinghistory                 |      11.52 |
| nagios_host_contactgroups              |       0.16 |
| nagios_host_contacts                   |       0.03 |
| nagios_host_parenthosts                |       0.03 |
| nagios_hostchecks                      |       0.03 |
| nagios_hostdependencies                |       0.03 |
| nagios_hostescalation_contactgroups    |       0.03 |
| nagios_hostescalation_contacts         |       0.03 |
| nagios_hostescalations                 |       0.03 |
| nagios_hostgroup_members               |       0.39 |
| nagios_hostgroups                      |       0.08 |
| nagios_hosts                           |       1.63 |
| nagios_hoststatus                      |       2.44 |
| nagios_instances                       |       0.02 |
| nagios_logentries                      |     924.11 |
| nagios_notifications                   |       0.31 |
| nagios_objects                         |       1.14 |
| nagios_processevents                   |       0.34 |
| nagios_programstatus                   |       0.03 |
| nagios_runtimevariables                |       0.03 |
| nagios_scheduleddowntime               |       0.03 |
| nagios_service_contactgroups           |       0.22 |
| nagios_service_contacts                |       0.03 |
| nagios_service_parentservices          |       0.03 |
| nagios_servicechecks                   |       0.06 |
| nagios_servicedependencies             |       0.03 |
| nagios_serviceescalation_contactgroups |       0.03 |
| nagios_serviceescalation_contacts      |       0.03 |
| nagios_serviceescalations              |       0.03 |
| nagios_servicegroup_members            |       0.03 |
| nagios_servicegroups                   |       0.03 |
| nagios_services                        |       1.66 |
| nagios_servicestatus                   |       2.80 |
| nagios_statehistory                    |     867.22 |
| nagios_systemcommands                  |       0.05 |
| nagios_timedeventqueue                 |       0.09 |
| nagios_timedevents                     |       0.09 |
| nagios_timeperiod_timeranges           |       0.03 |
| nagios_timeperiods                     |       0.03 |
+----------------------------------------+------------+
64 rows in set (0.00 sec)
NagiosXI

Code: Select all

+---------------------+------------+
| Table               | Size in MB |
+---------------------+------------+
| xi_auditlog         |       9.36 |
| xi_auth_tokens      |       0.03 |
| xi_cmp_trapdata     |       0.03 |
| xi_cmp_trapdata_log |       0.03 |
| xi_commands         |       0.02 |
| xi_eventqueue       |       0.06 |
| xi_events           |       0.36 |
| xi_meta             |       4.02 |
| xi_mibs             |       0.05 |
| xi_options          |       0.06 |
| xi_sessions         |       0.03 |
| xi_sysstat          |       0.03 |
| xi_usermeta         |       0.44 |
| xi_users            |       0.06 |
+---------------------+------------+
Nagiosql

Code: Select all

+--------------------------------------------+------------+
| Table                                      | Size in MB |
+--------------------------------------------+------------+
| tbl_command                                |       0.06 |
| tbl_contact                                |       0.03 |
| tbl_contactgroup                           |       0.03 |
| tbl_contacttemplate                        |       0.03 |
| tbl_domain                                 |       0.03 |
| tbl_host                                   |       0.47 |
| tbl_hostdependency                         |       0.03 |
| tbl_hostescalation                         |       0.03 |
| tbl_hostextinfo                            |       0.03 |
| tbl_hostgroup                              |       0.08 |
| tbl_hosttemplate                           |       0.03 |
| tbl_info                                   |       0.17 |
| tbl_lnkContactToCommandHost                |       0.02 |
| tbl_lnkContactToCommandService             |       0.02 |
| tbl_lnkContactToContactgroup               |       0.02 |
| tbl_lnkContactToContacttemplate            |       0.02 |
| tbl_lnkContactToVariabledefinition         |       0.02 |
| tbl_lnkContactgroupToContact               |       0.02 |
| tbl_lnkContactgroupToContactgroup          |       0.02 |
| tbl_lnkContacttemplateToCommandHost        |       0.02 |
| tbl_lnkContacttemplateToCommandService     |       0.02 |
| tbl_lnkContacttemplateToContactgroup       |       0.02 |
| tbl_lnkContacttemplateToContacttemplate    |       0.02 |
| tbl_lnkContacttemplateToVariabledefinition |       0.02 |
| tbl_lnkHostToContact                       |       0.02 |
| tbl_lnkHostToContactgroup                  |       0.09 |
| tbl_lnkHostToHost                          |       0.02 |
| tbl_lnkHostToHostgroup                     |       0.20 |
| tbl_lnkHostToHosttemplate                  |       0.31 |
| tbl_lnkHostToVariabledefinition            |       0.41 |
| tbl_lnkHostdependencyToHost_DH             |       0.02 |
| tbl_lnkHostdependencyToHost_H              |       0.02 |
| tbl_lnkHostdependencyToHostgroup_DH        |       0.02 |
| tbl_lnkHostdependencyToHostgroup_H         |       0.02 |
| tbl_lnkHostescalationToContact             |       0.02 |
| tbl_lnkHostescalationToContactgroup        |       0.02 |
| tbl_lnkHostescalationToHost                |       0.02 |
| tbl_lnkHostescalationToHostgroup           |       0.02 |
| tbl_lnkHostgroupToHost                     |       0.05 |
| tbl_lnkHostgroupToHostgroup                |       0.02 |
| tbl_lnkHosttemplateToContact               |       0.02 |
| tbl_lnkHosttemplateToContactgroup          |       0.02 |
| tbl_lnkHosttemplateToHost                  |       0.02 |
| tbl_lnkHosttemplateToHostgroup             |       0.02 |
| tbl_lnkHosttemplateToHosttemplate          |       0.02 |
| tbl_lnkHosttemplateToVariabledefinition    |       0.02 |
| tbl_lnkServiceToContact                    |       0.02 |
| tbl_lnkServiceToContactgroup               |       0.02 |
| tbl_lnkServiceToHost                       |       0.14 |
| tbl_lnkServiceToHostgroup                  |       0.02 |
| tbl_lnkServiceToServicegroup               |       0.02 |
| tbl_lnkServiceToServicetemplate            |       0.31 |
| tbl_lnkServiceToVariabledefinition         |       0.33 |
| tbl_lnkServicedependencyToHost_DH          |       0.02 |
| tbl_lnkServicedependencyToHost_H           |       0.02 |
| tbl_lnkServicedependencyToHostgroup_DH     |       0.02 |
| tbl_lnkServicedependencyToHostgroup_H      |       0.02 |
| tbl_lnkServicedependencyToService_DS       |       0.02 |
| tbl_lnkServicedependencyToService_S        |       0.02 |
| tbl_lnkServicedependencyToServicegroup_DS  |       0.02 |
| tbl_lnkServicedependencyToServicegroup_S   |       0.02 |
| tbl_lnkServiceescalationToContact          |       0.02 |
| tbl_lnkServiceescalationToContactgroup     |       0.02 |
| tbl_lnkServiceescalationToHost             |       0.02 |
| tbl_lnkServiceescalationToHostgroup        |       0.02 |
| tbl_lnkServiceescalationToService          |       0.02 |
| tbl_lnkServiceescalationToServicegroup     |       0.02 |
| tbl_lnkServicegroupToService               |       0.02 |
| tbl_lnkServicegroupToServicegroup          |       0.02 |
| tbl_lnkServicetemplateToContact            |       0.02 |
| tbl_lnkServicetemplateToContactgroup       |       0.02 |
| tbl_lnkServicetemplateToHost               |       0.02 |
| tbl_lnkServicetemplateToHostgroup          |       0.02 |
| tbl_lnkServicetemplateToServicegroup       |       0.02 |
| tbl_lnkServicetemplateToServicetemplate    |       0.02 |
| tbl_lnkServicetemplateToVariabledefinition |       0.02 |
| tbl_lnkTimeperiodToTimeperiod              |       0.02 |
| tbl_logbook                                |       0.02 |
| tbl_mainmenu                               |       0.02 |
| tbl_permission                             |       0.02 |
| tbl_permission_inactive                    |       0.02 |
| tbl_service                                |       1.52 |
| tbl_servicedependency                      |       0.03 |
| tbl_serviceescalation                      |       0.03 |
| tbl_serviceextinfo                         |       0.03 |
| tbl_servicegroup                           |       0.03 |
| tbl_servicetemplate                        |       0.03 |
| tbl_session                                |       0.02 |
| tbl_session_locks                          |       0.02 |
| tbl_settings                               |       0.03 |
| tbl_submenu                                |       0.02 |
| tbl_timedefinition                         |       0.05 |
| tbl_timeperiod                             |       0.03 |
| tbl_user                                   |       0.03 |
| tbl_variabledefinition                     |       1.52 |
+--------------------------------------------+------------+
We are not using postgresql.

In any case, is there any way to avoid inserting macro info in nagios_customvariablestatus table?

Thanks in advance,
Juanma.
jmsanesteban.sgre
Posts: 51
Joined: Thu Apr 23, 2020 6:46 am

Re: Avoid insert host/service macros into DB

Post by jmsanesteban.sgre »

[mysql@DBServer ~]$ ulimit -a

Code: Select all

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 257123
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 257123
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

[mysql@DBServer ~]$ su -s /bin/bash -c 'ulimit -a' mysql

Code: Select all

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 257123
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 257123
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

[mysql@DBServer ~]$ grep -R max_connections /etc/my.cnf

Code: Select all

max_connections = 300
mysql> SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');

Code: Select all

+----------------------------------------+------------+
| Table                                  | Size in MB |
+----------------------------------------+------------+
| SAIM_comments_view                     |       NULL |
| SAIM_old_API_Values_view               |       NULL |
| SAIM_old_view                          |       NULL |
| SAIM_view                              |       NULL |
| nagios_acknowledgements                |       0.34 |
| nagios_commands                        |       0.06 |
| nagios_commenthistory                  |      26.06 |
| nagios_comments                        |       0.31 |
| nagios_configfiles                     |       0.03 |
| nagios_configfilevariables             |       0.02 |
| nagios_conninfo                        |       0.19 |
| nagios_contact_addresses               |       0.03 |
| nagios_contact_notificationcommands    |       0.06 |
| nagios_contactgroup_members            |       0.03 |
| nagios_contactgroups                   |       0.03 |
| nagios_contactnotificationmethods      |       0.14 |
| nagios_contactnotifications            |       0.16 |
| nagios_contacts                        |       0.03 |
| nagios_contactstatus                   |       0.03 |
| nagios_customvariables                 |       2.44 |
| nagios_customvariablestatus            |       0.05 |
| nagios_dbversion                       |       0.02 |
| nagios_downtimehistory                 |       0.03 |
| nagios_eventhandlers                   |       0.06 |
| nagios_externalcommands                |       0.08 |
| nagios_flappinghistory                 |      11.52 |
| nagios_host_contactgroups              |       0.16 |
| nagios_host_contacts                   |       0.03 |
| nagios_host_parenthosts                |       0.03 |
| nagios_hostchecks                      |       0.03 |
| nagios_hostdependencies                |       0.03 |
| nagios_hostescalation_contactgroups    |       0.03 |
| nagios_hostescalation_contacts         |       0.03 |
| nagios_hostescalations                 |       0.03 |
| nagios_hostgroup_members               |       0.39 |
| nagios_hostgroups                      |       0.08 |
| nagios_hosts                           |       1.63 |
| nagios_hoststatus                      |       2.42 |
| nagios_instances                       |       0.02 |
| nagios_logentries                      |     924.11 |
| nagios_notifications                   |       0.31 |
| nagios_objects                         |       1.14 |
| nagios_processevents                   |       0.34 |
| nagios_programstatus                   |       0.03 |
| nagios_runtimevariables                |       0.03 |
| nagios_scheduleddowntime               |       0.03 |
| nagios_service_contactgroups           |       0.22 |
| nagios_service_contacts                |       0.03 |
| nagios_service_parentservices          |       0.03 |
| nagios_servicechecks                   |       0.06 |
| nagios_servicedependencies             |       0.03 |
| nagios_serviceescalation_contactgroups |       0.03 |
| nagios_serviceescalation_contacts      |       0.03 |
| nagios_serviceescalations              |       0.03 |
| nagios_servicegroup_members            |       0.03 |
| nagios_servicegroups                   |       0.03 |
| nagios_services                        |       1.66 |
| nagios_servicestatus                   |       2.78 |
| nagios_statehistory                    |     868.25 |
| nagios_systemcommands                  |       0.05 |
| nagios_timedeventqueue                 |       0.09 |
| nagios_timedevents                     |       0.09 |
| nagios_timeperiod_timeranges           |       0.03 |
| nagios_timeperiods                     |       0.03 |
| xi_auditlog                            |       9.36 |
| xi_auth_tokens                         |       0.03 |
| xi_cmp_trapdata                        |       0.03 |
| xi_cmp_trapdata_log                    |       0.03 |
| xi_commands                            |       0.02 |
| xi_eventqueue                          |       0.03 |
| xi_events                              |       0.36 |
| xi_meta                                |       4.02 |
| xi_mibs                                |       0.05 |
| xi_options                             |       0.06 |
| xi_sessions                            |       0.03 |
| xi_sysstat                             |       0.03 |
| xi_usermeta                            |       0.44 |
| xi_users                               |       0.06 |
+----------------------------------------+------------+
78 rows in set (0.01 sec)
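To make the size listing easier to read, the rows can be sorted by size. A rough sketch, assuming the `--table` style output shown above is piped in; `largest_tables` is a hypothetical helper name, not something shipped with Nagios XI.

```shell
#!/bin/bash
# Hypothetical helper: print the N largest tables from `mysql --table` output.
# Expects rows shaped like "| table_name | 12.34 |" on stdin.
largest_tables() {
    local n="${1:-5}"
    awk -F'|' 'NF >= 3 {
        name = $2; size = $3
        gsub(/[ \t]/, "", name)
        gsub(/[ \t]/, "", size)
        # Skip the header row and views, whose size is reported as NULL.
        if (size ~ /^[0-9]+(\.[0-9]+)?$/) print size, name
    }' | LC_ALL=C sort -rn | head -n "$n"
}

# Usage: pipe the size query output through the helper, e.g.
#   echo "SELECT ..." | mysql -h 127.0.0.1 -uroot -pnagiosxi --table | largest_tables 3
```

In the listing above, nagios_logentries and nagios_statehistory are by far the largest tables.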
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Avoid insert host/service macros into DB

Post by ssax »

Please PM me a copy of your profile.zip, I do not see it attached in the PM.

What is the output of this command on the XI server:

Code: Select all

sar
Edit this file:

Code: Select all

/etc/sysctl.conf
Change this:

Code: Select all

kernel.msgmni = 32768
To this:

Code: Select all

kernel.msgmni = 512000
Then run these commands:

Code: Select all

systemctl stop nagios.service
systemctl restart ndo2db.service
systemctl start nagios.service
See if that helps.
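Note that a change in /etc/sysctl.conf is only read at boot or when it is loaded explicitly, for example with `sysctl -p`. A small sketch to confirm which message queue limits are present in a given listing; `filter_msg_limits` is a hypothetical helper name and reads sysctl-style "key = value" text on stdin.

```shell
#!/bin/bash
# Hypothetical helper: keep only the kernel.msg* limits from
# sysctl-style "key = value" lines read on stdin.
filter_msg_limits() {
    awk -F'=' '{
        key = $1; val = $2
        gsub(/[ \t]/, "", key)
        gsub(/[ \t]/, "", val)
        if (key ~ /^kernel\.msg(mni|max|mnb)$/) print key "=" val
    }'
}

# Usage:
#   sysctl -a 2>/dev/null | filter_msg_limits     # values the kernel is using now
#   filter_msg_limits < /etc/sysctl.conf          # values configured in the file
```

If the two outputs differ, the file has been edited but not yet loaded.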
jmsanesteban.sgre
Posts: 51
Joined: Thu Apr 23, 2020 6:46 am

Re: Avoid insert host/service macros into DB

Post by jmsanesteban.sgre »

sar

Code: Select all

sar
Linux 3.10.0-1160.31.1.el7.x86_64 (Server)         07/09/2021      _x86_64_        (32 CPU)

12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all      2.58      0.00      0.90      0.02      0.00     96.49
12:20:01 AM     all      2.19      0.00      0.78      0.01      0.00     97.02
12:30:01 AM     all      2.17      0.00      0.78      0.01      0.00     97.04
12:40:01 AM     all      2.32      0.00      0.92      0.01      0.00     96.74
12:50:01 AM     all      2.22      0.00      0.81      0.01      0.00     96.96
01:00:01 AM     all      2.21      0.00      0.81      0.01      0.00     96.97
01:10:01 AM     all      2.24      0.00      0.83      0.02      0.00     96.92
01:20:01 AM     all      2.23      0.00      0.82      0.01      0.00     96.95
01:30:01 AM     all      2.24      0.00      0.83      0.01      0.00     96.92
01:40:01 AM     all      2.33      0.00      0.87      0.01      0.00     96.79
01:50:01 AM     all      2.24      0.00      0.83      0.01      0.00     96.92
02:00:01 AM     all      2.22      0.00      0.81      0.01      0.00     96.97
02:10:01 AM     all      2.25      0.00      0.83      0.02      0.00     96.90
02:20:01 AM     all      2.25      0.00      0.82      0.01      0.00     96.92
02:30:01 AM     all      2.22      0.00      0.81      0.01      0.00     96.96
02:40:01 AM     all      2.20      0.00      0.80      0.01      0.00     96.99
02:50:01 AM     all      2.62      0.00      1.13      0.01      0.00     96.25
03:00:01 AM     all      2.82      0.00      0.98      0.01      0.00     96.18
03:10:01 AM     all      2.65      0.00      0.95      0.01      0.00     96.39
03:20:01 AM     all      2.20      0.01      0.81      0.01      0.00     96.96
03:30:01 AM     all      2.19      0.00      0.79      0.01      0.00     97.01
03:40:01 AM     all      2.19      0.00      0.79      0.01      0.00     97.01
03:50:01 AM     all      2.19      0.00      0.79      0.01      0.00     97.01
04:00:01 AM     all      2.18      0.00      0.79      0.01      0.00     97.02
04:10:01 AM     all      2.17      0.00      0.78      0.01      0.00     97.03
04:20:01 AM     all      2.19      0.00      0.79      0.01      0.00     97.00
04:30:01 AM     all      2.18      0.00      0.78      0.01      0.00     97.03
04:40:01 AM     all      2.37      0.00      0.81      0.01      0.00     96.81
04:50:01 AM     all      2.47      0.00      0.81      0.01      0.00     96.71
05:00:01 AM     all      2.27      0.00      0.80      0.01      0.00     96.92
05:10:01 AM     all      2.23      0.00      0.78      0.02      0.00     96.97
05:20:01 AM     all      2.21      0.00      0.78      0.02      0.00     96.99
05:30:01 AM     all      2.23      0.00      0.79      0.01      0.00     96.97
05:40:01 AM     all      2.33      0.00      0.92      0.02      0.00     96.74
05:50:01 AM     all      2.21      0.00      0.78      0.01      0.00     96.99
06:00:01 AM     all      2.22      0.00      0.80      0.01      0.00     96.98
06:10:01 AM     all      2.14      0.00      0.79      0.02      0.00     97.06
06:20:01 AM     all      2.12      0.00      0.78      0.01      0.00     97.08
06:30:01 AM     all      2.16      0.00      0.80      0.01      0.00     97.02
06:40:01 AM     all      2.24      0.00      0.84      0.02      0.00     96.91
06:50:01 AM     all      2.13      0.00      0.80      0.01      0.00     97.06
07:00:01 AM     all      2.12      0.00      0.78      0.01      0.00     97.08
07:10:01 AM     all      2.15      0.00      0.81      0.02      0.00     97.02
07:20:01 AM     all      2.16      0.00      0.80      0.01      0.00     97.03
07:30:01 AM     all      2.13      0.00      0.80      0.01      0.00     97.06
07:40:01 AM     all      2.15      0.00      0.81      0.02      0.00     97.02
07:50:01 AM     all      2.13      0.00      0.80      0.01      0.00     97.06
08:00:01 AM     all      2.10      0.00      0.78      0.02      0.00     97.11
08:10:01 AM     all      2.09      0.00      0.78      0.03      0.00     97.11
08:20:01 AM     all      2.11      0.00      0.78      0.02      0.00     97.10
08:30:01 AM     all      2.13      0.00      0.77      0.02      0.00     97.08

08:30:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
08:40:01 AM     all      2.12      0.00      0.77      0.03      0.00     97.08
08:50:01 AM     all      2.26      0.00      0.79      0.01      0.00     96.94
09:00:01 AM     all      2.22      0.00      0.78      0.01      0.00     96.99
09:10:01 AM     all      2.42      0.00      0.80      0.02      0.00     96.75
Average:        all      2.24      0.00      0.82      0.01      0.00     96.93

Regarding the change to kernel.msgmni = 32768: if I remember correctly, there is a limit on this value:

https://access.redhat.com/solutions/4968021
kernel.msgmni.png
I thought the profile was attached, sorry. I will resend it :)

BR,
Juanma.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Avoid insert host/service macros into DB

Post by ssax »

Interesting, I didn't know that, thanks for linking the article.

If the kernel message queue isn't processing fast enough with the DB offloaded the only fix I've found is to move the DB back local to the XI server.

What is the output of this command?

Code: Select all

mysql -uroot -pnagiosxi -h X.X.X.X nagios -e "select count(*) from nagios_objects;"

Code: Select all

mysql -uroot -pnagiosxi -h X.X.X.X nagios -e "show full processlist;"
How does the load/wait look on the offloaded DB?

Code: Select all

sar
jmsanesteban.sgre
Posts: 51
Joined: Thu Apr 23, 2020 6:46 am

Re: Avoid insert host/service macros into DB

Post by jmsanesteban.sgre »

Hi all!

First of all, thanks for your comments.

select count(*) from nagios_objects;

Code: Select all

9053
show full processlist;

Code: Select all

'4521418', 'nagios', '$Jumphost:55843', NULL, 'Sleep', '294', '', NULL
'4521419', 'nagios', '$Jumphost:55844', NULL, 'Sleep', '294', '', NULL
'4522493', 'nagios', '$ServerIP:60728', 'nagios', 'Sleep', '6299', '', NULL
'4532006', 'nagios', '$ServerIP:44234', 'nagios', 'Sleep', '2229', '', NULL
'4532007', 'nagios', '$ServerIP:44236', 'nagios', 'Query', '0', 'query end', 'INSERT INtO nagios_customvariablestatus SET instance_id=\'1\', object_id=\'4408\',status_update_time=FROM_UNIXTIME(1626079597), has_been_modified=\'0\', varname=\'CREATION_DATE\', varvalue=\'2021-03-03\' ON DUPLICATE KEY UPDATE instance_id=\'1\', object_id=\'4408\',status_update_time=FROM_UNIXTIME(1626079597), has_been_modified=\'0\', varname=\'CREATION_DATE\', varvalue=\'2021-03-03\''
'4534275', 'nagios', '$Jumphost:34413', NULL, 'Sleep', '42', '', NULL
'4534276', 'nagios', '$Jumphost:34414', 'nagios', 'Query', '0', 'starting', 'show full processlist'
'4536033', 'nagios', '$ServerIP:50558', 'nagios', 'Sleep', '29', '', NULL
'4536050', 'nagios', '$ServerIP:50602', 'nagios', 'Sleep', '7', '', NULL
'4536054', 'nagios', '$ServerIP:50610', 'nagios', 'Sleep', '7', '', NULL
'4536086', 'nagios', '$ServerIP:50672', 'nagios', 'Sleep', '74', '', NULL
'4536097', 'nagios', '$ServerIP:50698', 'nagios', 'Sleep', '7', '', NULL
'4536135', 'nagios', '$ServerIP:50726', 'nagios', 'Sleep', '15', '', NULL
'4536146', 'nagios', '$ServerIP:50736', 'nagios', 'Sleep', '2', '', NULL
'4536232', 'nagios', '$ServerIP:50784', 'nagios', 'Sleep', '3', '', NULL
'4536244', 'nagios', '$ServerIP:50810', 'nagios', 'Sleep', '3', '', NULL
'4536247', 'nagios', '$ServerIP:50816', 'nagios', 'Sleep', '15', '', NULL
'4536262', 'nagios', '$ServerIP:50836', 'nagios', 'Sleep', '14', '', NULL
'4536267', 'nagios', '$ServerIP:50846', 'nagios', 'Sleep', '14', '', NULL
'4536268', 'nagios', '$ServerIP:50848', 'nagios', 'Sleep', '14', '', NULL
'4536270', 'nagios', '$ServerIP:50850', 'nagios', 'Sleep', '14', '', NULL
'4536274', 'nagios', '$ServerIP:50860', 'nagios', 'Sleep', '14', '', NULL
'4536278', 'nagios', '$ServerIP:50868', 'nagios', 'Sleep', '14', '', NULL
The sar output is going to take more time because it depends on another team; I will post it here ASAP.
Last edited by jmsanesteban.sgre on Wed Jul 14, 2021 12:43 am, edited 1 time in total.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Avoid insert host/service macros into DB

Post by ssax »

That output looks good. I don't see anything locking the DB tables or in the mysql process list.
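
For anyone who wants to repeat this check on a longer list, here is a minimal Python sketch (not an official Nagios tool) that reads rows in the quoted, comma-separated form pasted above and returns the threads that are not sleeping and have been running for at least a given number of seconds. The helper name `busy_threads`, the assumed column order, and the two sample rows (copied from the list above, with the INSERT statement shortened) are assumptions made for illustration.

```python
import csv

# Two rows in the same layout as the pasted process list:
# Id, User, Host, db, Command, Time, State, Info
SAMPLE = r"""'4522493', 'nagios', '$ServerIP:60728', 'nagios', 'Sleep', '6299', '', NULL
'4532007', 'nagios', '$ServerIP:44236', 'nagios', 'Query', '0', 'query end', 'INSERT INtO nagios_customvariablestatus SET instance_id=\'1\''
"""

def busy_threads(text, min_seconds=0):
    """Return (id, command, seconds, state) for every non-sleeping thread
    whose Time column is at least min_seconds."""
    rows = csv.reader(
        text.splitlines(),
        quotechar="'",          # fields are wrapped in single quotes
        escapechar="\\",        # embedded quotes appear as \'
        skipinitialspace=True,  # a space follows each comma
    )
    busy = []
    for row in rows:
        if len(row) < 8:
            continue  # skip anything that is not a full process-list row
        thread_id, command, seconds = row[0], row[4], int(row[5])
        if command != "Sleep" and seconds >= min_seconds:
            busy.append((thread_id, command, seconds, row[6]))
    return busy

print(busy_threads(SAMPLE))                 # the single running INSERT
print(busy_threads(SAMPLE, min_seconds=5))  # nothing has run for 5s or more
```

An empty result for a threshold of a few seconds is consistent with the reading above: queries are being executed, but none of them is stuck.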

If the sar output doesn't show any issues, please create a ticket for this and include a link back to this forum thread so we can get a remote session set up to debug further:

https://support.nagios.com/tickets/

There is a high probability that you will need to migrate the DB back local though (I've dealt with a bunch of these) as sometimes even having the extra layer of network is too much for the system to process the kernel message queue fast enough.

It's weird that this is happening with roughly 5500 total checks. Is that an accurate estimate of the total number of host/service checks combined?
jmsanesteban.sgre
Posts: 51
Joined: Thu Apr 23, 2020 6:46 am

Re: Avoid insert host/service macros into DB

Post by jmsanesteban.sgre »

Hi all!

Thanks for your input.


sar 3 8

Code: Select all

Linux 4.18.0-240.22.1.el8_3.x86_64 ($ServerDB)        07/12/2021      _x86_64_        (16 CPU)


11:44:32 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:44:35 AM     all      6.68      1.42      7.85      2.88      0.00     81.16
11:44:38 AM     all      8.16      1.23      8.10      2.82      0.00     79.69
11:44:41 AM     all      6.98      1.48      7.04      2.72      0.00     81.78
11:44:44 AM     all      7.77      0.96      7.10      2.65      0.00     81.52
11:44:47 AM     all      4.08      0.56      7.42      2.34      0.00     85.59
11:44:50 AM     all      1.34      0.67      8.19      2.40      0.00     87.40
11:44:53 AM     all      3.18      0.65      8.36      2.66      0.00     85.16
11:44:56 AM     all      1.08      0.48      5.71      2.54      0.00     90.18
Average:        all      4.91      0.93      7.47      2.63      0.00     84.06
[root@$ServerDB ~]#
I can't see any problems in that output.
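
As a quick way to read the same numbers programmatically, the following Python sketch pulls the `Average:` line out of `sar` CPU output and maps it to the column names. The function name `sar_average` is hypothetical, and the sample text is a shortened copy of the output above (header, one interval, and the average line).

```python
SAR_SAMPLE = """\
11:44:32 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:44:56 AM     all      1.08      0.48      5.71      2.54      0.00     90.18
Average:        all      4.91      0.93      7.47      2.63      0.00     84.06
"""

def sar_average(text):
    """Return {column name: value} for the 'Average:' row of sar CPU output."""
    header = None
    for line in text.splitlines():
        fields = line.split()
        if "CPU" in fields and "%idle" in fields:
            # Column names are everything after the CPU column.
            header = fields[fields.index("CPU") + 1:]
        elif fields and fields[0] == "Average:" and header:
            # fields[1] is the CPU selector ("all"); the rest are percentages.
            return dict(zip(header, (float(v) for v in fields[2:])))
    return {}

avg = sar_average(SAR_SAMPLE)
print(avg["%iowait"], avg["%idle"])  # 2.63 84.06
```

With about 84% idle and under 3% iowait on average, this sample does not point to CPU or disk saturation on the DB server.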

We have been suffering from this problem for a long time and have run several tests and checks: there are no deadlocks in the database, no long queries, and no performance problems on the DB or frontend server, but having the database offloaded is causing this delay in the insertion process. There is an open ticket without the expected resolution (https://support.nagios.com/tickets/tickets.php?id=13867)

I thought it could be a good idea to run several ndo2db processes on different servers, using ndo2file or something similar, just to split the workload, but it is not a supported solution.

The point is to keep the frontend and backend on different servers, which will allow us to grow according to our needs. The problem is that, by design, the product makes intensive use of the DB, for instance inserting host and service macros with one query per insert and with no option to disable them. To illustrate the problem: we are using host macros to store the asset's SNMP community, some ports, the creation date, and so on. I didn't check the cycle, but I suppose that information is inserted every 5 minutes. Those values won't change, so inserting them once per day would be enough: 288 insertions versus 1 per day, per variable and host.
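
As a rough illustration of that arithmetic, here is a small Python sketch. The 5-minute interval is the guess stated above, not a measured value, and the sizing of 5500 objects with 4 custom variables each is a placeholder chosen for illustration, not a figure taken from this installation. The function names are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60

def writes_per_day(interval_minutes):
    """Status writes per custom variable per day at a fixed update interval."""
    return MINUTES_PER_DAY // interval_minutes

def total_writes_per_day(objects, vars_per_object, interval_minutes):
    """Total custom-variable status writes per day across all objects."""
    return objects * vars_per_object * writes_per_day(interval_minutes)

# 5-minute cycle (assumed): writes per variable per day.
print(writes_per_day(5))                               # 288

# Placeholder sizing: 5500 objects with 4 custom variables each.
print(total_writes_per_day(5500, 4, 5))                # 6336000

# Same sizing if each value were written once per day instead.
print(total_writes_per_day(5500, 4, MINUTES_PER_DAY))  # 22000
```

Under these assumptions the write volume for static values differs by a factor of 288, which is the gap a once-per-day option would close.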


This is why I'm asking whether there is a way to disable the insertion of custom macros into the customvariablestatus table, because there are tons of insertions. I've been playing with the ndomod options without good results.

Based on that... are all medium to big installations running in local mode?

Thanks for your help.

BR,
Juanma.
Locked