
Problem with ndo2db (ndo2db updates nagios tables with delay)

Posted: Wed Jun 19, 2019 7:37 am
by HaaMeeD
Hello Dears,

We are working on a new platform and using Nagios XI for monitoring. Since we started testing with about 11K active checks, we have noticed that our host and service statuses update with a delay. Sometimes the delay reaches 12 minutes, while Nagios Core shows the latest, correct check result.
In the example below we are 622 seconds behind: the entry was logged at 1560945052 but carries a status_update_time of 1560944430 (this is part of the ndo2db debug log).

Code:

[1560945052.589490] [002.0] [pid=89705] INSERT INTO nagios_hoststatus SET instance_id='1', host_object_id='6458', status_update_time=FROM_UNIXTIME(1560944430), output='PING OK - Packet loss = 0%, RTA = 46\.94 ms', long_output='', perfdata='rta=46\.94ms;400\.000000;500\.000000;0\.000000 pl=0%;3;100;0 AVL=100%;0', current_state='0'
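The 622-second figure can be reproduced from the log line itself. The following is a minimal sketch, assuming the debug log keeps the format shown above (a leading `[epoch.microseconds]` stamp and a `status_update_time=FROM_UNIXTIME(...)` field); the function name `ndo2db_lag_seconds` is hypothetical and not part of ndo2db.

```python
import re

# Leading "[epoch.micros]" stamp: when ndo2db logged the statement.
WRITE_TS = re.compile(r"^\[(\d+)\.\d+\]")
# Check time carried inside the statement.
CHECK_TS = re.compile(r"status_update_time=FROM_UNIXTIME\((\d+)\)")


def ndo2db_lag_seconds(line):
    """Return seconds between the check time and the log stamp, or None."""
    written = WRITE_TS.search(line)
    checked = CHECK_TS.search(line)
    if not (written and checked):
        return None
    return int(written.group(1)) - int(checked.group(1))


sample = (
    "[1560945052.589490] [002.0] [pid=89705] INSERT INTO nagios_hoststatus SET "
    "instance_id='1', host_object_id='6458', "
    "status_update_time=FROM_UNIXTIME(1560944430), current_state='0'"
)
print(ndo2db_lag_seconds(sample))  # 622
```

Running this over the whole debug log would show whether the lag is constant or grows and shrinks over time.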
I've already read this article, which did not help us:
https://support.nagios.com/kb/article.php?id=139
We are using a ramdisk and gearmand with 9 workers.
Nagios Core version is 4.2.4
Nagios XI version is 5.5.11
Nagios server:
RAM= 20 GB
CPUs= 10 Core
SSD Disk
CentOS release 6.10 (Final)
The database is offloaded to a separate server with the following resources and configuration:
RAM=26 GB
CPUs= 16 Core
SSD disk
CentOS Linux release 7.5.1804 (Core)
my.cnf.txt
We tried to find the cause of the problem but found nothing.
The only errors in the messages log file are:

Code:

Jun 19 15:42:58 Nagios-XI nagios: job 127 (pid=63189): read() returned error 11
Jun 19 15:42:58 Nagios-XI nagios: job 128 (pid=63194): read() returned error 11
Jun 19 15:43:02 Nagios-XI nagios: job 128 (pid=63355): read() returned error 11
Jun 19 15:43:04 Nagios-XI nagios: job 129 (pid=63393): read() returned error 11
Jun 19 15:43:07 Nagios-XI nagios: job 129 (pid=63413): read() returned error 11
Jun 19 15:43:10 Nagios-XI nagios: job 130 (pid=63445): read() returned error 11
Jun 19 15:43:13 Nagios-XI nagios: job 131 (pid=63494): read() returned error 11
Jun 19 15:43:13 Nagios-XI nagios: job 132 (pid=63502): read() returned error 11
Jun 19 15:43:20 Nagios-XI nagios: job 134 (pid=63572): read() returned error 11
Jun 19 15:43:22 Nagios-XI nagios: job 134 (pid=63603): read() returned error 11
Jun 19 15:43:22 Nagios-XI nagios: job 135 (pid=63600): read() returned error 11
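For reference, on Linux errno 11 is EAGAIN ("Resource temporarily unavailable"), which a non-blocking read() returns when no data is ready yet. The sketch below tallies these messages by errno value so their frequency can be compared against the delay; it assumes the syslog format shown above, and `count_read_errors` is a hypothetical helper name.

```python
import errno
import re
from collections import Counter

# Matches e.g. "job 127 (pid=63189): read() returned error 11"
JOB_ERROR = re.compile(r"job (\d+) \(pid=(\d+)\): read\(\) returned error (\d+)")


def count_read_errors(lines):
    """Count 'read() returned error N' messages per errno value."""
    counts = Counter()
    for line in lines:
        match = JOB_ERROR.search(line)
        if match:
            counts[int(match.group(3))] += 1
    return counts


log_lines = [
    "Jun 19 15:42:58 Nagios-XI nagios: job 127 (pid=63189): read() returned error 11",
    "Jun 19 15:42:58 Nagios-XI nagios: job 128 (pid=63194): read() returned error 11",
    "Jun 19 15:43:02 Nagios-XI nagios: job 128 (pid=63355): read() returned error 11",
]
for code, total in count_read_errors(log_lines).items():
    # errno names are platform specific; the lookup reflects the machine running this.
    print(code, errno.errorcode.get(code, "unknown"), total)
```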
You can find my system profile in the attachment.
profile.zip

Re: Problem with ndo2db(nod2db updates nagios tables with de

Posted: Thu Jun 20, 2019 11:05 am
by cdienger
Try adjusting the mysql config and restarting the database. Here's an example from my lab machine:

Code:

query_cache_size=16M
query_cache_limit=4M
tmp_table_size=64M
max_heap_table_size=64M
key_buffer_size=32M
table_open_cache=32
innodb_file_per_table=1
Your config specifies query-cache-size - is this a typo?

Re: Problem with ndo2db(nod2db updates nagios tables with de

Posted: Sat Jun 22, 2019 4:43 am
by HaaMeeD
Thanks for your reply.

I've changed the values, and the incorrect variables are now fixed, but after restarting the mariadb service we still have the delay.
The number of messages in the ipcs output increases, then after a while it starts to decrease, and then it increases again.
This is the ipcs output at five-minute intervals:

Code:

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages    
0x0b000002 327680     nagios     600        243793920    238080      
0xffffffff 393217     nagios     600        370235392    361558      


------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages    
0x0b000002 327680     nagios     600        243793920    238080      
0xffffffff 393217     nagios     600        110864384    108266      

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages    
0x0b000002 327680     nagios     600        243793920    238080      
0xffffffff 393217     nagios     600        230972416    225559  


Is this a database issue? If it is related to the database, which variables are relevant?
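To follow the queue depth over time, the snapshots can be compared programmatically. This is a minimal sketch, assuming the six-column `ipcs -q` layout shown above; `parse_ipcs_queues` and `queue_deltas` are hypothetical helper names.

```python
def parse_ipcs_queues(text):
    """Map msqid -> (used_bytes, messages) from `ipcs -q` output."""
    queues = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have six columns and start with a hex key such as 0x0b000002.
        if len(fields) == 6 and fields[0].startswith("0x"):
            queues[int(fields[1])] = (int(fields[4]), int(fields[5]))
    return queues


def queue_deltas(before, after):
    """Change in message count per queue between two snapshots."""
    return {q: after[q][1] - before[q][1] for q in before if q in after}


first = """------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0x0b000002 327680     nagios     600        243793920    238080
0xffffffff 393217     nagios     600        370235392    361558
"""
second = """------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0x0b000002 327680     nagios     600        243793920    238080
0xffffffff 393217     nagios     600        110864384    108266
"""
print(queue_deltas(parse_ipcs_queues(first), parse_ipcs_queues(second)))
# {327680: 0, 393217: -253292}
```

In these snapshots queue 327680 does not change at all, while queue 393217 drains and refills.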

Re: Problem with ndo2db(nod2db updates nagios tables with de

Posted: Mon Jun 24, 2019 11:01 am
by cdienger
It's not able to insert the data quickly enough. Did you enable jumbo frames as described in https://assets.nagios.com/downloads/nag ... Server.pdf?

Also review pages 3 and 4 of https://assets.nagios.com/downloads/nag ... ios-XI.pdf to adjust the reaper settings.

Re: Problem with ndo2db(nod2db updates nagios tables with de

Posted: Wed Jun 26, 2019 7:22 am
by HaaMeeD
No, we don't use jumbo frames.
I've already read https://assets.nagios.com/downloads/nag ... 1552472961 and our settings are as follows:
check_result_reaper_frequency=3
max_check_result_reaper_time=10

Re: Problem with ndo2db(nod2db updates nagios tables with de

Posted: Wed Jun 26, 2019 4:58 pm
by cdienger
I would look into setting up jumbo frames.

11K is relatively small and I've seen bigger environments running with the local db. Do you plan on going larger? You may be better suited to switching back to the local database.

The performance guide covers other areas that can help improve performance. Implementing some of these may help.