NDOUtils Not working after update to 5.7.2

danniiffxi
Posts: 121
Joined: Tue Jan 30, 2018 3:29 am
Location: UK

NDOUtils Not working after update to 5.7.2

Post by danniiffxi »

Hi Guys

This morning I updated our dev server from 5.6.14 to 5.7.2. When it came back up, I noticed that the server was not processing any checks.

Every host looks like this:
Image

Although it says the host is up, it has not been checked since I ran the update, and no future checks have been scheduled.
Image

Pretty much the same view in the Service Checks also.
Image

And as you can see my server stats dashboard is looking green, but nothing is in the processing queue.
Image

I did notice this during the upgrade, and I think it has something to do with the problem I am having.

Code: Select all

Performing upgrade...
 > Upgrading from version 2.1.3 (/tmp/nagiosxi/subcomponents/ndo/ndo-3.0.2/db/upgrade-from-2.1.3.sql)
I checked the service status and I got this response.

Code: Select all

[root@nagid01 services]# service ndo2db status
ndo2db: unrecognized service
Our dev server is a clone of prod, but the NDOUtils section of the main nagios.cfg now differs between the two.
I've checked, and the file ndomod.cfg does not exist in /usr/local/nagios/etc on the dev box anymore, although there is still a copy on the production server. This is the section as it now appears on the dev box:

Code: Select all

# NDOUtils module
# Commented out by NDO 'make install-broker-line' on Fri Jul 24 13:48:00 BST 2020
#broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
This is the nagios.cfg for our current prod server (running 5.6.14), which should match the dev box:

Code: Select all

# NDOUtils module
broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
This is the content of the /usr/local/nagios/etc/ndo.cfg file

Code: Select all

# Default NDO config for Nagios XI

db_user=ndoutils
db_pass=n@gweb
db_name=nagios
db_host=localhost
db_port=3306
#db_socket=/var/lib/mysql.sock
db_max_reconnect_attempts=5

acknowledgement_data=1
comment_data=1
contact_status_data=1
downtime_data=1
event_handler_data=1
external_command_data=1
flapping_data=1
host_check_data=1
host_status_data=1
log_data=1
main_config_data=1
notification_data=1
object_config_data=1
process_data=1
program_status_data=1
retention_data=1
service_check_data=1
service_status_data=1
state_change_data=1
system_command_data=1
timed_event_data=1

config_output_options=2

max_object_insert_count=250

mysql_set_charset_name=utf8
Last edited by danniiffxi on Fri Jul 24, 2020 9:01 am, edited 2 times in total.
drakedts
Posts: 43
Joined: Tue May 12, 2015 8:28 am

Re: NDOUtils Not working after update to 5.7.2

Post by drakedts »

Good timing; I have this same problem and was working on a post of my own! Here's what I was going to post:

My XI server runs on RHEL 7 and is installed using the yum repositories on repo.nagios.com. Version 5.6.14 (and earlier) of the nagiosxi-* packages work. The 5.7.x versions are broken though and cause this error in nagios.log:

Code: Select all

[1595596978] Error: Could not load module '/usr/local/nagios/bin/ndomod.o' -> /usr/local/nagios/bin/ndomod.o: cannot open shared object file: No such file or directory
[1595596978] Error: Failed to load module '/usr/local/nagios/bin/ndomod.o'.
[1595596978] Error: Module loading failed. Aborting.
It seems when upgrading to 5.7.x that the ndomod.o file is removed and Nagios breaks. Downgrading to a 5.6 version fixes it. Is this a packaging issue?
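
A minimal sketch for confirming this kind of failure before starting Nagios: it prints every module path named by an active broker_module line that does not exist on disk. The function name `missing_broker_modules` is hypothetical, and the sketch assumes the module path is the first space-separated word after the equals sign, as in the lines quoted in this thread.

```shell
# Print the module path of each active (uncommented) broker_module line
# whose file is missing. $1 = path to nagios.cfg.
missing_broker_modules() {
    grep -E '^[[:space:]]*broker_module=' "$1" | while IFS= read -r line; do
        spec=${line#*=}     # drop everything up to and including the first "="
        path=${spec%% *}    # keep only the first word, i.e. the module path
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}

# Usage (assumes the config lives in the default XI location):
#   missing_broker_modules /usr/local/nagios/etc/nagios.cfg
```

Any path it prints is one that Nagios would fail to load with the "cannot open shared object file" error shown above.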
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: NDOUtils Not working after update to 5.7.2

Post by cdienger »

ndo2db is no longer used and the nagios.cfg should contain this line now:

Code: Select all

broker_module=/usr/local/nagios/bin/ndo.so /usr/local/nagios/etc/ndo.cfg
Make sure that the line is there and that the following line is either commented out or removed:

Code: Select all

#broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
and then try restarting the service:

Code: Select all

systemctl restart nagios
Please provide the /usr/local/nagios/var/nagios.log if there are any further problems.
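
A minimal sketch of the check described above, assuming the only NDO-related broker_module lines are the ndomod.o and ndo.so ones quoted in this thread. The function name `broker_state` and its messages are hypothetical; it only reads the file and changes nothing.

```shell
# Report which NDO broker_module line, if any, is active in a nagios.cfg.
# $1 = path to nagios.cfg.
broker_state() {
    # Collect uncommented broker_module lines; "|| true" tolerates no match.
    active=$(grep -E '^[[:space:]]*broker_module=' "$1" || true)
    case $active in
        *ndomod.o*) echo "old ndomod.o line still active" ;;
        *ndo.so*)   echo "ndo.so line active" ;;
        *)          echo "no NDO broker_module line active" ;;
    esac
}

# Usage (assumes the default XI config location):
#   broker_state /usr/local/nagios/etc/nagios.cfg
```

After editing the file, the configuration can also be verified with `/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg` before restarting the service.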
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
danniiffxi

Re: NDOUtils Not working after update to 5.7.2

Post by danniiffxi »

Hi cdienger

I will send you the nagios.log in a PM as I am still having issues. I put the following line into the nagios.cfg as it was not in there.

Code: Select all

broker_module=/usr/local/nagios/bin/ndo.so /usr/local/nagios/etc/ndo.cfg
The nagios service starts but terminates shortly after with a memory access violation.

Code: Select all

[1595836384] NDO-3: Database initialized
[1595836384] NDO-3: Database initialized
[1595836384] NDO-3: Unable to prepare statement for query (1): Lost connection to MySQL server during query
[1595836384] Caught SIGSEGV, shutting down...
Our MySQL DB is on the same server as XI. The server in question is a VM with 32GB RAM, so it is not running out of memory.
cdienger

Re: NDOUtils Not working after update to 5.7.2

Post by cdienger »

Set the db connection limit and open file limit per https://support.nagios.com/kb/article.php?id=513 and let us know if that changes the message in nagios.log when the service is restarted.
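
A minimal sketch for checking what a my.cnf already sets before editing it. It assumes the two limits correspond to MySQL's `max_connections` and `open_files_limit` options in the `[mysqld]` section; the helper name `mycnf_value` is hypothetical, and unlike MySQL it does not treat dashes and underscores in option names as interchangeable.

```shell
# Print the value of one option from the [mysqld] section of a my.cnf.
# $1 = path to my.cnf, $2 = option name, e.g. max_connections.
# Prints nothing if the option is not set; the last occurrence wins.
mycnf_value() {
    awk -F= -v key="$2" '
        # A section header: remember whether we are now inside [mysqld].
        /^[ \t]*\[/ { in_mysqld = ($0 ~ /^[ \t]*\[mysqld\][ \t]*$/); next }
        # Inside [mysqld], skip comments and compare the trimmed option name.
        in_mysqld && $0 !~ /^[ \t]*[#;]/ {
            k = $1; gsub(/[ \t]/, "", k)
            if (k == key) { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); val = v }
        }
        END { if (val != "") print val }
    ' "$1"
}

# Usage:
#   mycnf_value /etc/my.cnf max_connections
#   mycnf_value /etc/my.cnf open_files_limit
```

The values the running server actually uses can be confirmed from a MySQL prompt with `SHOW VARIABLES LIKE 'max_connections';` and `SHOW VARIABLES LIKE 'open_files_limit';`, since a setting in the file only takes effect after the database service is restarted.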
danniiffxi

Re: NDOUtils Not working after update to 5.7.2

Post by danniiffxi »

cdienger wrote:Set the db connection limit and open file limit per https://support.nagios.com/kb/article.php?id=513 and let us know if that changes the message in nagios.log when the service is restarted.
This is the contents of my /etc/my.cnf file

Code: Select all

[mysqld]
max_allowed_packet=512M
innodb_file_per_table=1
max_connections=1000
open_files_limit = 4096

datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
query_cache_size=60M
query_cache_type = 1
query_cache_limit = 256K
query_cache_min_res_unit = 2k
join_buffer_size=256K
thread_cache_size=6

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
I had to add these lines from the guide and restart the relevant services.

Code: Select all

max_connections=1000
open_files_limit = 4096
After 3 attempts at starting the service and still seeing "Caught SIGSEGV, shutting down...", it finally 'somewhat' started, but it is still broken and it looks like NDO is disabled.
This is the tail output of /usr/local/nagios/var/nagios.log

Code: Select all

[1595938402] Nagios 4.4.6 starting... (PID=5278)
[1595938402] Local time is Tue Jul 28 13:13:22 BST 2020
[1595938402] LOG VERSION: 2.0
[1595938402] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1595938402] qh: core query handler registered
[1595938402] qh: echo service query handler registered
[1595938402] qh: help for the query handler registered
[1595938402] wproc: Successfully registered manager as @wproc with query handler
[1595938402] wproc: Registry request: name=Core Worker 5280;pid=5280
[1595938402] wproc: Registry request: name=Core Worker 5279;pid=5279
[1595938402] wproc: Registry request: name=Core Worker 5281;pid=5281
[1595938402] wproc: Registry request: name=Core Worker 5283;pid=5283
[1595938402] wproc: Registry request: name=Core Worker 5284;pid=5284
[1595938402] wproc: Registry request: name=Core Worker 5282;pid=5282
[1595938402] wproc: Registry request: name=Core Worker 5285;pid=5285
[1595938402] wproc: Registry request: name=Core Worker 5286;pid=5286
[1595938402] wproc: Registry request: name=Core Worker 5287;pid=5287
[1595938402] NDO-3: NDO 3.0.2 (c) Copyright 2009-2020 Nagios - Nagios Core Development Team
[1595938402] NDO-3: Database initialized
[1595938402] NDO-3: Database initialized
[1595938402] NDO-3: Callbacks registered
[1595938402] NDO-3: Callbacks registered
[1595938402] Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
[1595938402] NDO-3: NDO 3.0.2 (c) Copyright 2009-2020 Nagios - Nagios Core Development Team
[1595938402] NDO-3: Database initialized
[1595938402] NDO-3: Database initialized
[1595938402] NDO-3: Callbacks registered
[1595938402] NDO-3: Callbacks registered
[1595938402] Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
[1595938403] NDO-3: Database initialized
[1595938403] NDO-3: Database initialized
[1595938403] Caught SIGSEGV, shutting down...
[1595938420] Nagios 4.4.6 starting... (PID=5338)
[1595938420] Local time is Tue Jul 28 13:13:40 BST 2020
[1595938420] LOG VERSION: 2.0
[1595938420] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1595938420] qh: core query handler registered
[1595938420] qh: echo service query handler registered
[1595938420] qh: help for the query handler registered
[1595938420] wproc: Successfully registered manager as @wproc with query handler
[1595938420] wproc: Registry request: name=Core Worker 5339;pid=5339
[1595938420] wproc: Registry request: name=Core Worker 5340;pid=5340
[1595938420] wproc: Registry request: name=Core Worker 5342;pid=5342
[1595938420] wproc: Registry request: name=Core Worker 5343;pid=5343
[1595938420] wproc: Registry request: name=Core Worker 5347;pid=5347
[1595938420] wproc: Registry request: name=Core Worker 5341;pid=5341
[1595938420] wproc: Registry request: name=Core Worker 5345;pid=5345
[1595938420] wproc: Registry request: name=Core Worker 5344;pid=5344
[1595938420] wproc: Registry request: name=Core Worker 5346;pid=5346
[1595938420] NDO-3: NDO 3.0.2 (c) Copyright 2009-2020 Nagios - Nagios Core Development Team
[1595938420] NDO-3: Database initialized
[1595938420] NDO-3: Database initialized
[1595938420] NDO-3: Callbacks registered
[1595938420] NDO-3: Callbacks registered
[1595938420] Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
[1595938420] NDO-3: NDO 3.0.2 (c) Copyright 2009-2020 Nagios - Nagios Core Development Team
[1595938420] NDO-3: Database initialized
[1595938420] NDO-3: Database initialized
[1595938420] NDO-3: Callbacks registered
[1595938420] NDO-3: Callbacks registered
[1595938420] Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
[1595938420] NDO-3: Database initialized
[1595938420] NDO-3: Database initialized
[1595938421] Caught SIGSEGV, shutting down...
[1595938434] Nagios 4.4.6 starting... (PID=5459)
[1595938434] Local time is Tue Jul 28 13:13:54 BST 2020
[1595938434] LOG VERSION: 2.0
[1595938434] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1595938434] qh: core query handler registered
[1595938434] qh: echo service query handler registered
[1595938434] qh: help for the query handler registered
[1595938434] wproc: Successfully registered manager as @wproc with query handler
[1595938434] wproc: Registry request: name=Core Worker 5460;pid=5460
[1595938434] wproc: Registry request: name=Core Worker 5461;pid=5461
[1595938434] wproc: Registry request: name=Core Worker 5463;pid=5463
[1595938434] wproc: Registry request: name=Core Worker 5464;pid=5464
[1595938434] wproc: Registry request: name=Core Worker 5465;pid=5465
[1595938434] wproc: Registry request: name=Core Worker 5462;pid=5462
[1595938434] wproc: Registry request: name=Core Worker 5467;pid=5467
[1595938434] wproc: Registry request: name=Core Worker 5466;pid=5466
[1595938434] wproc: Registry request: name=Core Worker 5468;pid=5468
[1595938434] NDO-3: NDO 3.0.2 (c) Copyright 2009-2020 Nagios - Nagios Core Development Team
[1595938434] NDO-3: Database initialized
[1595938434] NDO-3: Database initialized
[1595938434] NDO-3: Callbacks registered
[1595938434] NDO-3: Callbacks registered
[1595938434] Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
[1595938434] NDO-3: NDO 3.0.2 (c) Copyright 2009-2020 Nagios - Nagios Core Development Team
[1595938434] NDO-3: Database initialized
[1595938434] NDO-3: Database initialized
[1595938434] NDO-3: Callbacks registered
[1595938434] NDO-3: Callbacks registered
[1595938434] Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
[1595938435] NDO-3: Database initialized
[1595938435] NDO-3: Database initialized
[1595938435] NDO-3: Unable to prepare statement for query (34): Malformed packet
[1595938435] NDO-3: Unable to prepare statement for query (35): Lost connection to MySQL server during query
[1595938435] Caught SIGSEGV, shutting down...
[1595938457] NDO-3: Error preparing statements
[1595938462] Nagios 4.4.6 starting... (PID=5795)
[1595938462] Local time is Tue Jul 28 13:14:22 BST 2020
[1595938462] LOG VERSION: 2.0
[1595938462] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1595938462] qh: core query handler registered
[1595938462] qh: echo service query handler registered
[1595938462] qh: help for the query handler registered
[1595938462] wproc: Successfully registered manager as @wproc with query handler
[1595938462] wproc: Registry request: name=Core Worker 5796;pid=5796
[1595938462] wproc: Registry request: name=Core Worker 5797;pid=5797
[1595938462] wproc: Registry request: name=Core Worker 5799;pid=5799
[1595938462] wproc: Registry request: name=Core Worker 5801;pid=5801
[1595938462] wproc: Registry request: name=Core Worker 5804;pid=5804
[1595938462] wproc: Registry request: name=Core Worker 5802;pid=5802
[1595938462] wproc: Registry request: name=Core Worker 5800;pid=5800
[1595938462] wproc: Registry request: name=Core Worker 5803;pid=5803
[1595938462] wproc: Registry request: name=Core Worker 5798;pid=5798
[1595938462] NDO-3: NDO 3.0.2 (c) Copyright 2009-2020 Nagios - Nagios Core Development Team
[1595938462] NDO-3: Database initialized
[1595938462] NDO-3: Database initialized
[1595938462] NDO-3: Callbacks registered
[1595938462] NDO-3: Callbacks registered
[1595938462] Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
[1595938462] NDO-3: NDO 3.0.2 (c) Copyright 2009-2020 Nagios - Nagios Core Development Team
[1595938462] NDO-3: Database initialized
[1595938462] NDO-3: Database initialized
[1595938462] NDO-3: Callbacks registered
[1595938462] NDO-3: Callbacks registered
[1595938462] Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
[1595938463] NDO-3: Database initialized
[1595938463] NDO-3: Database initialized
[1595938463] NDO-3: ndo_return = 1 (Commands out of sync; you can't run this command now)
[1595938463] NDO-3: ndo_get_object_id_name1(ndo.c:1135): Unable to store results
[1595938463] NDO-3: ndo_return = 1 (Statement not prepared)
[1595938463] NDO-3: ndo_write_commands(ndo-startup.c:480): Unable to bind parameters
[1595938463] NDO-3: ndo_write_commands() failed. Disabling NDO.
[1595938463] NDO-3: NDO startup thread failed at ndo_write_object_config() - disabling NDO.
[1595938464] Successfully launched command file worker with pid 5809


Although it says that the Monitoring Engine service is running, the Monitoring Engine Process still has a red mark against it and fails to start from the GUI, and I am not seeing any jobs being scheduled.

Code: Select all

[root@nagid01 ~]# service nagios status
nagios (pid 26450) is running...
Image
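
A minimal sketch for condensing a long nagios.log like the one above into the lines that matter: how many times the daemon segfaulted, and which distinct NDO-3 problems were logged. The function names `count_sigsegv` and `ndo_problems` are hypothetical, and the keyword list is an assumption drawn only from the messages that appear in this thread.

```shell
# Count how many times Nagios logged a segfault. $1 = path to nagios.log.
count_sigsegv() {
    grep -c 'Caught SIGSEGV' "$1" || true
}

# Print each distinct NDO-3 problem message once, without its [epoch] prefix.
# $1 = path to nagios.log.
ndo_problems() {
    grep -E 'NDO-3: .*(Unable|Error|failed|Could not|out of sync|not prepared)' "$1" \
        | sed -E 's/^\[[0-9]+\] //' \
        | sort -u
}

# Usage (assumes the default XI log location):
#   count_sigsegv /usr/local/nagios/var/nagios.log
#   ndo_problems  /usr/local/nagios/var/nagios.log
```

Stripping the timestamp first is what lets `sort -u` collapse the same message repeated across several restart attempts.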
drakedts

Re: NDOUtils Not working after update to 5.7.2

Post by drakedts »

I updated XI 5.6.14 to 5.7.2 and then made the change in nagios.cfg to the broker_module line. I rebooted the server to make sure everything got restarted cleanly. At first Nagios seemed OK, but after a short time the System Component Status showed that both "Monitoring Engine" and "Database Backend" were down. If I click in the UI to restart Monitoring Engine, it shows green but after a few minutes goes back to red.

In the nagios.log I see lines like this repeated:

Code: Select all

[1595949829] NDO-3: Error preparing statements
[1595949829] NDO-3: ndo_handle_service_status(ndo-handlers.c:953): Could not reconnect to MySQL database
[1595949829] NDO-3: Unable to prepare statement for query (27): Unknown column 'check_options' in 'field list'
[1595949829] NDO-3: Unable to prepare statement for query (28): Unknown column 'check_options' in 'field list'
For now I'm going to revert to a snapshot I made of the machine just before the upgrade, but I'm willing to test again and try different things.
cdienger

Re: NDOUtils Not working after update to 5.7.2

Post by cdienger »

It looks like we have two separate issues here.

@danniiffxi: Please open a ticket for this at support.nagios.com/tickets, and reference this forum thread. I have another customer running into this and I'm looking into it.

@drakedts: It seems like the check_options field is missing from the nagios_hoststatus and/or nagios_servicestatus tables of the nagios database. If you can attempt the upgrade again, I would give it a try and then open a new forum thread if the problem persists.
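
A minimal sketch for checking whether the check_options column is present, assuming the database is named `nagios` as in the default XI setup. The function name `check_options_query` is hypothetical, and the `root` account in the usage line is only an example of a MySQL user that can read `information_schema`.

```shell
# Emit a query that lists which of the two status tables have a
# check_options column. A table missing from the result lacks the column.
check_options_query() {
    cat <<'SQL'
SELECT TABLE_NAME, COLUMN_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'nagios'
  AND TABLE_NAME IN ('nagios_hoststatus', 'nagios_servicestatus')
  AND COLUMN_NAME = 'check_options';
SQL
}

# Usage (prompts for the MySQL password):
#   check_options_query | mysql -u root -p
```

If both tables come back, the schema has the column and the "Unknown column" errors point elsewhere; if one or both are missing, the schema upgrade did not complete for that table.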
danniiffxi

Re: NDOUtils Not working after update to 5.7.2

Post by danniiffxi »

Hi cdienger

Support ticket #928242 is now open. I guess this thread can be locked now.

Many thanks
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NDOUtils Not working after update to 5.7.2

Post by scottwilkerson »

Locking thread