Problem with ndo2db (ndo2db updates nagios tables with delay)

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
HaaMeeD
Posts: 7
Joined: Tue Dec 18, 2018 3:21 am

Problem with ndo2db (ndo2db updates nagios tables with delay)

Post by HaaMeeD »

Hello all,

We are working on a new platform and using Nagios XI for monitoring. Since we started testing with about 11K active checks, we have noticed that our host and service statuses update with a delay. Sometimes the delay reaches 12 minutes, while Nagios Core shows the latest, correct check results.
In the example below we are 622 seconds behind (this is part of the ndo2db debug log):

Code:

[1560945052.589490] [002.0] [pid=89705] INSERT INTO nagios_hoststatus SET instance_id='1', host_object_id='6458', status_update_time=FROM_UNIXTIME(1560944430), output='PING OK - Packet loss = 0%, RTA = 46\.94 ms', long_output='', perfdata='rta=46\.94ms;400\.000000;500\.000000;0\.000000 pl=0%;3;100;0 AVL=100%;0', current_state='0'
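
The 622-second figure is the gap between the timestamp at the start of the log line and the status_update_time value being written. Below is a minimal sketch of that arithmetic, assuming the debug log keeps the format shown above. The helper name ndo2db_lag_seconds and both regular expressions are illustrative and are not part of ndo2db.

```python
import re

# Hypothetical helper, not part of ndo2db: estimate how far behind a logged
# INSERT is by comparing the log timestamp with status_update_time.
LOG_TS = re.compile(r"^\[(\d+)\.\d+\]")
UPDATE_TS = re.compile(r"status_update_time=FROM_UNIXTIME\((\d+)\)")


def ndo2db_lag_seconds(line):
    """Return (log time - status_update_time) in whole seconds, or None."""
    written = LOG_TS.search(line)
    updated = UPDATE_TS.search(line)
    if not (written and updated):
        return None
    return int(written.group(1)) - int(updated.group(1))


# Shortened copy of the log line quoted above (output/perfdata fields dropped).
sample = (
    "[1560945052.589490] [002.0] [pid=89705] INSERT INTO nagios_hoststatus SET "
    "instance_id='1', host_object_id='6458', "
    "status_update_time=FROM_UNIXTIME(1560944430), current_state='0'"
)
print(ndo2db_lag_seconds(sample))  # 622
```

Running this over every INSERT line in the debug log would show whether the lag is steady or grows over time.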
I've already read this article, but it did not help us:
https://support.nagios.com/kb/article.php?id=139
We are using a ramdisk and gearmand with 9 workers.
Nagios Core version is 4.2.4
Nagios XI version is 5.5.11
Nagios server:
RAM=20 GB
CPUs= 10 Core
SSD Disk
CentOS release 6.10 (Final)
The database is offloaded to a separate server with these resources and this configuration:
RAM=26 GB
CPUs= 16 Core
SSD disk
CentOS Linux release 7.5.1804 (Core)
my.cnf.txt
We tried to find the cause of the problem but found nothing.
The only error in the messages log file is:

Code:

Jun 19 15:42:58 Nagios-XI nagios: job 127 (pid=63189): read() returned error 11
Jun 19 15:42:58 Nagios-XI nagios: job 128 (pid=63194): read() returned error 11
Jun 19 15:43:02 Nagios-XI nagios: job 128 (pid=63355): read() returned error 11
Jun 19 15:43:04 Nagios-XI nagios: job 129 (pid=63393): read() returned error 11
Jun 19 15:43:07 Nagios-XI nagios: job 129 (pid=63413): read() returned error 11
Jun 19 15:43:10 Nagios-XI nagios: job 130 (pid=63445): read() returned error 11
Jun 19 15:43:13 Nagios-XI nagios: job 131 (pid=63494): read() returned error 11
Jun 19 15:43:13 Nagios-XI nagios: job 132 (pid=63502): read() returned error 11
Jun 19 15:43:20 Nagios-XI nagios: job 134 (pid=63572): read() returned error 11
Jun 19 15:43:22 Nagios-XI nagios: job 134 (pid=63603): read() returned error 11
Jun 19 15:43:22 Nagios-XI nagios: job 135 (pid=63600): read() returned error 11
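
For reference, on Linux errno 11 is EAGAIN (the same value as EWOULDBLOCK), meaning a non-blocking read found no data ready. The sketch below pulls the job, pid and error number out of one of these lines and looks the number up. The regular expression is an assumption based only on the samples above, and the name returned depends on the platform the script runs on.

```python
import errno
import os
import re

# Pattern inferred from the syslog samples above; an assumption, not a
# documented Nagios log format.
READ_ERR = re.compile(r"job (\d+) \(pid=(\d+)\): read\(\) returned error (\d+)")


def describe_errno(code):
    """Return 'NAME: message' for an errno value on the current platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


def parse_read_error(line):
    """Return (job, pid, errno) as ints, or None if the line does not match."""
    match = READ_ERR.search(line)
    return tuple(int(g) for g in match.groups()) if match else None


line = "Jun 19 15:42:58 Nagios-XI nagios: job 127 (pid=63189): read() returned error 11"
job, pid, code = parse_read_error(line)
print(job, pid, describe_errno(code))
```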
You can find my system profile in the attachment.
profile.zip
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Problem with ndo2db(nod2db updates nagios tables with de

Post by cdienger »

Try adjusting the mysql config and restarting the database. Here's an example from my lab machine:

Code:

query_cache_size=16M
query_cache_limit=4M
tmp_table_size=64M
max_heap_table_size=64M
key_buffer_size=32M
table_open_cache=32
innodb_file_per_table=1
Your config specifies query-cache-size - is this a typo?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
HaaMeeD
Posts: 7
Joined: Tue Dec 18, 2018 3:21 am

Re: Problem with ndo2db(nod2db updates nagios tables with de

Post by HaaMeeD »

Thanks for your reply.

I've changed the values, and the incorrect variables are now fixed, but after restarting the mariadb service we still see the delay.
The number of messages in the ipcs output increases, then after a while it decreases, and then it increases again.
This is the ipcs output at five-minute intervals:

Code:

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages    
0x0b000002 327680     nagios     600        243793920    238080      
0xffffffff 393217     nagios     600        370235392    361558      


------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages    
0x0b000002 327680     nagios     600        243793920    238080      
0xffffffff 393217     nagios     600        110864384    108266      

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages    
0x0b000002 327680     nagios     600        243793920    238080      
0xffffffff 393217     nagios     600        230972416    225559  
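
To make the rise and fall easier to follow, the snapshots can be compared queue by queue. This is a rough sketch that assumes the six-column layout shown above. The function names parse_ipcs_queues and message_deltas are made up for illustration.

```python
def parse_ipcs_queues(text):
    """Map msqid -> (used_bytes, messages) from `ipcs -q` style output."""
    queues = {}
    for raw in text.splitlines():
        parts = raw.split()
        # Data rows have six columns and start with a hex key such as 0x0b000002.
        if len(parts) == 6 and parts[0].startswith("0x"):
            queues[int(parts[1])] = (int(parts[4]), int(parts[5]))
    return queues


def message_deltas(before, after):
    """Change in queued messages per msqid between two snapshots."""
    return {q: after[q][1] - before[q][1] for q in before if q in after}


# First two snapshots from the output above.
first = """key        msqid      owner      perms      used-bytes   messages
0x0b000002 327680     nagios     600        243793920    238080
0xffffffff 393217     nagios     600        370235392    361558"""
second = """key        msqid      owner      perms      used-bytes   messages
0x0b000002 327680     nagios     600        243793920    238080
0xffffffff 393217     nagios     600        110864384    108266"""

print(message_deltas(parse_ipcs_queues(first), parse_ipcs_queues(second)))
# {327680: 0, 393217: -253292}
```

In these two snapshots, queue 327680 does not move at all, while queue 393217 drops by 253292 messages.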


Is this a database issue? If so, which variables are relevant?
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Problem with ndo2db(nod2db updates nagios tables with de

Post by cdienger »

It's not able to insert the data quickly enough. Did you enable jumbo frames as described in https://assets.nagios.com/downloads/nag ... Server.pdf?

Also review pages 3 and 4 of https://assets.nagios.com/downloads/nag ... ios-XI.pdf to adjust the reaper settings.
HaaMeeD
Posts: 7
Joined: Tue Dec 18, 2018 3:21 am

Re: Problem with ndo2db(nod2db updates nagios tables with de

Post by HaaMeeD »

No, we don't use jumbo frames.
I've already read https://assets.nagios.com/downloads/nag ... 1552472961 and our settings are as follows:
check_result_reaper_frequency=3
max_check_result_reaper_time=10
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Problem with ndo2db(nod2db updates nagios tables with de

Post by cdienger »

I would look into setting up jumbo frames.

11K is relatively small and I've seen bigger environments running with the local db. Do you plan on going larger? You may be better suited to switching back to the local database.

The performance guide covers other areas that can help improve performance. Implementing some of these may help.