Nagios XI - upg v5.6.14 to 5.8.6 issues

Interrex
Posts: 68
Joined: Thu May 19, 2016 8:42 am

Nagios XI - upg v5.6.14 to 5.8.6 issues

Post by Interrex »

Hi.

Server is RHEL7, vmware.

I upgraded from 5.6.14 to 5.8.6. Everything looked good at first glance, until I applied a new configuration.
Then the process state went from green to red and the monitoring engine failed to start. The web page is working, but none of the Monitoring Engine processes are running.


Nagios service output:
:/tmp/rpms/nagiosxi $ systemctl status nagios
● nagios.service - Nagios Core 4.4.6
Loaded: loaded (/usr/lib/systemd/system/nagios.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2021-09-22 12:51:04 CEST; 3min 15s ago
Docs: https://www.nagios.org/documentation
Process: 26138 ExecStopPost=/bin/rm -f /usr/local/nagios/var/rw/nagios.cmd (code=exited, status=0/SUCCESS)
Process: 23703 ExecStop=/bin/kill -s TERM ${MAINPID} (code=exited, status=1/FAILURE)
Process: 26111 ExecStart=/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Process: 26107 ExecStartPre=/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Main PID: 26113 (code=exited, status=1/FAILURE)

Sep 22 12:51:02 nagios[26107]: Checked 0 host dependencies
Sep 22 12:51:02 nagios[26107]: Checked 48 timeperiods
Sep 22 12:51:02 nagios[26107]: Checking global event handlers...
Sep 22 12:51:02 nagios[26107]: Checking obsessive compulsive processor commands...
Sep 22 12:51:02 nagios[26107]: Checking misc settings...
Sep 22 12:51:02 nagios[26107]: Total Warnings: 440
Sep 22 12:51:02 nagios[26107]: Total Errors: 0
Sep 22 12:51:02 nagios[26107]: Things look okay - No serious problems were detected during the pre-flight check
Sep 22 12:51:04 systemd[1]: Unit nagios.service entered failed state.
Sep 22 12:51:04 systemd[1]: nagios.service failed.


/usr/local/nagios/var/nagios.log
[1632307384] Caught SIGTERM, shutting down...
[1632307384] Caught SIGTERM, shutting down...
[1632307384] Caught SIGTERM, shutting down...
[1632307384] Successfully shutdown... (PID=6270)
[1632307384] ndomod: Shutdown complete.
[1632307384] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
[1632307384] Nagios 4.4.6 starting... (PID=22367)
[1632307384] Local time is Wed Sep 22 12:43:04 CEST 2021
[1632307384] LOG VERSION: 2.0
[1632307384] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1632307384] qh: core query handler registered
[1632307384] qh: echo service query handler registered
[1632307384] qh: help for the query handler registered
[1632307384] wproc: Successfully registered manager as @wproc with query handler
[1632307384] wproc: Registry request: name=Core Worker 22369;pid=22369
[1632307384] wproc: Registry request: name=Core Worker 22373;pid=22373
[1632307384] wproc: Registry request: name=Core Worker 22371;pid=22371
[1632307384] wproc: Registry request: name=Core Worker 22372;pid=22372
[1632307384] wproc: Registry request: name=Core Worker 22368;pid=22368
[1632307384] wproc: Registry request: name=Core Worker 22370;pid=22370
[1632307384] Error: Could not load module '/usr/local/nagios/bin/ndomod.o' -> /usr/local/nagios/bin/ndomod.o: cannot open shared object file: No such file or directory
[1632307384] Error: Failed to load module '/usr/local/nagios/bin/ndomod.o'.
[1632307384] Error: Module loading failed. Aborting.
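
The last three lines of that excerpt show where startup stops: Nagios cannot open the broker module /usr/local/nagios/bin/ndomod.o, so module loading aborts. Below is a minimal sketch for checking this. It assumes the module is referenced by a `broker_module=` line in /usr/local/nagios/etc/nagios.cfg (the config path shown in the service output above). The helper name `missing_broker_modules` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list broker_module entries whose module file does not exist on disk.
# Assumption: modules are declared as "broker_module=<path> [args]" in the main config.

missing_broker_modules() {
    local cfg="$1" line path
    # Only active (non-comment) broker_module lines are considered.
    grep -E '^[[:space:]]*broker_module[[:space:]]*=' "$cfg" | while IFS= read -r line; do
        # Strip everything up to the first "=", then keep the first word (the module path).
        read -r path _ <<<"${line#*=}"
        [ -e "$path" ] || printf 'missing: %s\n' "$path"
    done
}

# Only runs on a host where the config file is actually readable.
cfg=/usr/local/nagios/etc/nagios.cfg
if [ -r "$cfg" ]; then
    missing_broker_modules "$cfg"
fi
```

If this prints a line for ndomod.o, the config still references a module file that is no longer on disk, which would match the nagios.log error.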


And when I look at the snapshots, it says Diff Changes = -282 on the first reload, and then +36 on a later attempt.

Date Created        Config Status   Snapshot Name         Diff Changes
22.09.2021 12:49    OK              Snapshot 1632307777   36
22.09.2021 12:45    OK              Snapshot 1632307541   N/A
22.09.2021 12:43    OK              Snapshot 1632307385   -282
22.09.2021 12:27    OK              Snapshot 1632306449   N/A
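
The numbers in the snapshot names look like Unix timestamps. That is an inference from the values, not something the thread states. If so, they can be lined up with the epoch stamps in nagios.log. Below is a small sketch, assuming GNU `date` (as shipped with RHEL 7); the helper name `epoch_to_utc` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: convert epoch seconds (snapshot names, nagios.log stamps) to UTC.
# Assumption: GNU date, which accepts "-d @<seconds>".

epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

# Snapshot names from the table above, oldest first.
for ts in 1632306449 1632307385 1632307541 1632307777; do
    printf '%s  %s\n' "$ts" "$(epoch_to_utc "$ts")"
done
```

CEST is UTC+2, so 1632307385 (10:43:05 UTC) corresponds to 12:43:05 local time, one second after the 1632307384 entries in nagios.log where the module load fails.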

I took a snapshot before the upgrade, at 12:27.
Everything was OK after deploying version 5.8.6: under Monitoring Process -> Process Info + Performance, everything was green and running.

Any idea what went wrong?
Please let me know what additional info you need for support :)
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Nagios XI - upg v5.6.14 to 5.8.6 issues

Post by pbroste »

Hello @Interrex

Thanks for reaching out; let's start with the full journalctl output for nagios.service.

    journalctl -u nagios.service > /tmp/results.txt

Let's take a look at '/tmp/results.txt' to see what is going on.
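
Since the saved journal can run to thousands of lines, a quick filter helps find the failing start. This is a minimal sketch, not a Nagios tool: the helper name `journal_failures` and the choice of patterns are assumptions based on the messages quoted earlier in this thread.

```shell
#!/usr/bin/env bash
# Sketch: show numbered lines from a saved journal that report an error or a failed unit.
# "Error:" is matched with the colon so the "Total Errors: 0" summary line is not listed.

journal_failures() {
    grep -nE 'Error:|failed|FAILURE' "$1"
}

# Only runs if the file from the command above exists.
if [ -r /tmp/results.txt ]; then
    journal_failures /tmp/results.txt
fi
```

The line numbers make it easy to open the file at the first failure and read the surrounding context.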

Thanks,
Perry
Interrex
Posts: 68
Joined: Thu May 19, 2016 8:42 am

Re: Nagios XI - upg v5.6.14 to 5.8.6 issues

Post by Interrex »

pbroste wrote:Hello @Interrex

Thanks for reaching out; let's start with the full journalctl output for nagios.service.

Thanks Perry, I have sent you the output in a PM.

brgds
interrex
Interrex
Posts: 68
Joined: Thu May 19, 2016 8:42 am

Re: Nagios XI - upg v5.6.14 to 5.8.6 issues

Post by Interrex »

Here's part of the output that might be interesting. I can send the complete log files to another Nagios employee as well.

NagiosXI 5.6.14 - before upgrade
Sep 22 12:23:23 servername nagios[31737]: Successfully launched command file worker with pid 31915
Sep 22 12:24:30 servername check_nrpe[683]: Remote "IPADDRESS" does not support Version 3 Packets
Sep 22 12:24:30 servername check_nrpe[683]: Remote "IPADDRESS" accepted a Version 2 Packet
Sep 22 12:26:26 servername sudo[2802]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:26:26 servername sudo[2802]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:26:26 servername sudo[2802]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status httpd
Sep 22 12:26:31 servername sudo[2863]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:26:31 servername sudo[2863]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:26:31 servername sudo[2863]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status crond
Sep 22 12:26:31 servername sudo[2873]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:26:31 servername sudo[2873]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:26:31 servername sudo[2873]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status mysqld
Sep 22 12:26:31 servername check_nrpe[2888]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:26:31 servername check_nrpe[2888]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:26:48 servername sudo[3343]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:26:48 servername sudo[3343]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:26:48 servername sudo[3343]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status ndo2db
Sep 22 12:27:23 servername check_nrpe[5917]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:27:23 servername check_nrpe[5917]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:27:28 servername systemd[1]: Stopping Nagios Core 4.4.5...
Sep 22 12:27:28 servername nagios[31737]: Caught SIGTERM, shutting down...
Sep 22 12:27:28 servername nagios[31737]: Caught SIGTERM, shutting down...
Sep 22 12:27:28 servername nagios[31915]: Caught SIGTERM, shutting down...
Sep 22 12:27:28 servername nagios[31737]: Successfully shutdown... (PID=31737)
Sep 22 12:27:28 servername nagios[31737]: ndomod: Shutdown complete.
Sep 22 12:27:28 servername nagios[31737]: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Sep 22 12:27:28 servername systemd[1]: Stopped Nagios Core 4.4.5.
Sep 22 12:27:28 servername systemd[1]: Starting Nagios Core 4.4.5...
Sep 22 12:27:28 servername nagios[6265]: Nagios Core 4.4.6
Sep 22 12:27:28 servername nagios[6265]: Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Sep 22 12:27:28 servername nagios[6265]: Copyright (c) 1999-2009 Ethan Galstad
Sep 22 12:27:28 servername nagios[6265]: Last Modified: 2020-04-28
Sep 22 12:27:28 servername nagios[6265]: License: GPL
Sep 22 12:27:28 servername nagios[6265]: Website: https://www.nagios.org
Sep 22 12:27:28 servername nagios[6265]: Reading configuration data...
Sep 22 12:27:28 servername nagios[6265]: Read main config file okay...
Sep 22 12:27:28 servername nagios[6265]: Warning: Duplicate definition found for service 'SERVICE' on host 'HOST' (config file '/usr/local/nagios/etc/services/HOST.cfg', starting on line 16)
Sep 22 12:27:28 servername nagios[6265]: Read object config files okay...
Sep 22 12:27:28 servername nagios[6265]: Running pre-flight check on configuration data...
Sep 22 12:27:28 servername nagios[6265]: Checking objects...
Sep 22 12:27:28 servername systemd[1]: Started Nagios Core 4.4.5.
Sep 22 12:27:28 servername nagios[6270]: Nagios 4.4.6 starting... (PID=6270)
Sep 22 12:27:28 servername nagios[6270]: Local time is Wed Sep 22 12:27:28 CEST 2021
Sep 22 12:27:28 servername nagios[6270]: LOG VERSION: 2.0
Sep 22 12:27:28 servername nagios[6270]: qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
Sep 22 12:27:28 servername nagios[6270]: qh: core query handler registered
Sep 22 12:27:28 servername nagios[6270]: qh: echo service query handler registered
Sep 22 12:27:28 servername nagios[6270]: qh: help for the query handler registered
Sep 22 12:27:28 servername nagios[6270]: wproc: Successfully registered manager as @wproc with query handler
Sep 22 12:27:28 servername nagios[6270]: wproc: Registry request: name=Core Worker 6272;pid=6272
Sep 22 12:27:28 servername nagios[6270]: wproc: Registry request: name=Core Worker 6273;pid=6273
Sep 22 12:27:28 servername nagios[6270]: wproc: Registry request: name=Core Worker 6271;pid=6271
Sep 22 12:27:28 servername nagios[6265]: Checked 2049 services.
Sep 22 12:27:28 servername nagios[6270]: wproc: Registry request: name=Core Worker 6274;pid=6274
Sep 22 12:27:28 servername nagios[6270]: wproc: Registry request: name=Core Worker 6275;pid=6275
Sep 22 12:27:28 servername nagios[6270]: wproc: Registry request: name=Core Worker 6276;pid=6276
Sep 22 12:27:28 servername nagios[6270]: ndomod: NDOMOD 2.1.3 (2017-04-13) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Sep 22 12:27:28 servername nagios[6270]: ndomod: Successfully connected to data sink. 0 queued items to flush.
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for process data
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for log data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for system command data'
Sep 22 12:27:28 servername nagios[6265]: Checked 822 hosts.
Sep 22 12:27:28 servername nagios[6265]: Checked 33 host groups.
Sep 22 12:27:28 servername nagios[6265]: Checked 25 service groups.
Sep 22 12:27:28 servername nagios[6265]: Checked 40 contacts.
Sep 22 12:27:28 servername nagios[6265]: Checked 60 contact groups.
Sep 22 12:27:28 servername nagios[6265]: Checked 151 commands.
Sep 22 12:27:28 servername nagios[6265]: Checked 48 time periods.
Sep 22 12:27:28 servername nagios[6265]: Checked 0 host escalations.
Sep 22 12:27:28 servername nagios[6265]: Checked 0 service escalations.
Sep 22 12:27:28 servername nagios[6265]: Checking for circular paths...
Sep 22 12:27:28 servername nagios[6265]: Checked 822 hosts
Sep 22 12:27:28 servername nagios[6265]: Checked 0 service dependencies
Sep 22 12:27:28 servername nagios[6265]: Checked 0 host dependencies
Sep 22 12:27:28 servername nagios[6265]: Checked 48 timeperiods
Sep 22 12:27:28 servername nagios[6265]: Checking global event handlers...
Sep 22 12:27:28 servername nagios[6265]: Checking obsessive compulsive processor commands...
Sep 22 12:27:28 servername nagios[6265]: Checking misc settings...
Sep 22 12:27:28 servername nagios[6265]: Total Warnings: 440
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for event handler data'
Sep 22 12:27:28 servername nagios[6265]: Total Errors: 0
Sep 22 12:27:28 servername nagios[6265]: Things look okay - No serious problems were detected during the pre-flight check
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for notification data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for comment data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for downtime data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for flapping data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for program status data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for host status data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for service status data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for adaptive program data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for adaptive host data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for adaptive service data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for external command data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for aggregated status data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for retention data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for contact data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for contact notification data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for acknowledgement data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for state change data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for contact status data'
Sep 22 12:27:28 servername nagios[6270]: ndomod registered for adaptive contact data'
Sep 22 12:27:28 servername nagios[6270]: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Sep 22 12:27:43 servername nagios[6270]: Successfully launched command file worker with pid 6623
Sep 22 12:27:50 servername nagios[6270]: SERVICE ALERT: localhost;Current Load;WARNING;SOFT;1;WARNING - load average: 8.13, 2.53, 1.23
Sep 22 12:28:49 servername nagios[6270]: SERVICE ALERT: localhost;Current Load;OK;SOFT;2;OK - load average: 3.34, 2.14, 1.17

Below, the nagios service fails to start. I believe this is during or after the upgrade? Note that Nagios Core 4.4.6 already starts at timestamp 12:27:28.

Sep 22 12:29:28 servername check_nrpe[8517]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:29:28 servername check_nrpe[8517]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:31:25 servername sudo[10471]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:31:25 servername sudo[10471]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:31:25 servername sudo[10471]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status httpd
Sep 22 12:31:28 servername check_nrpe[10541]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:31:28 servername check_nrpe[10541]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:31:28 servername sudo[10545]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:31:28 servername sudo[10545]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:31:28 servername sudo[10545]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status crond
Sep 22 12:31:28 servername sudo[10555]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:31:28 servername sudo[10555]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:31:28 servername sudo[10555]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status mysqld
Sep 22 12:31:46 servername sudo[10822]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:31:46 servername sudo[10822]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:31:46 servername sudo[10822]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status ndo2db
Sep 22 12:32:30 servername check_nrpe[11510]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:32:30 servername check_nrpe[11510]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:34:26 servername check_nrpe[13529]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:34:26 servername check_nrpe[13529]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:36:23 servername sudo[15488]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:36:23 servername sudo[15488]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:36:23 servername sudo[15488]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status httpd
Sep 22 12:36:27 servername check_nrpe[15549]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:36:27 servername check_nrpe[15549]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:36:27 servername sudo[15553]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:36:27 servername sudo[15553]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:36:27 servername sudo[15553]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status crond
Sep 22 12:36:28 servername sudo[15558]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:36:28 servername sudo[15558]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:36:28 servername sudo[15558]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status mysqld
Sep 22 12:36:47 servername sudo[15835]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:36:47 servername sudo[15835]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:36:47 servername sudo[15835]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status ndo2db
Sep 22 12:37:26 servername check_nrpe[16476]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:37:26 servername check_nrpe[16476]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:39:25 servername check_nrpe[18515]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:39:25 servername check_nrpe[18515]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:41:22 servername sudo[20515]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:41:22 servername sudo[20515]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:41:22 servername sudo[20515]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status httpd
Sep 22 12:41:26 servername sudo[20566]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:41:26 servername sudo[20566]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:41:26 servername sudo[20566]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status crond
Sep 22 12:41:27 servername check_nrpe[20580]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:41:27 servername check_nrpe[20580]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:41:27 servername sudo[20591]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:41:27 servername sudo[20591]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:41:27 servername sudo[20591]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status mysqld
Sep 22 12:41:46 servername sudo[20866]: PAM unable to dlopen(/usr/lib64/security/pam_krb5.so): /usr/lib64/security/pam_krb5.so: cannot open shared object file: No such file or directory
Sep 22 12:41:46 servername sudo[20866]: PAM adding faulty module: /usr/lib64/security/pam_krb5.so
Sep 22 12:41:46 servername sudo[20866]: nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/local/nagiosxi/scripts/manage_services.sh status ndo2db
Sep 22 12:42:23 servername check_nrpe[21523]: Remote "IPADDRESS" does not support version 3/4 packets
Sep 22 12:42:23 servername check_nrpe[21523]: Remote "IPADDRESS" accepted a version 2 packet
Sep 22 12:43:04 servername systemd[1]: Stopping Nagios Core 4.4.6...
Sep 22 12:43:04 servername nagios[6270]: Caught SIGTERM, shutting down...
Sep 22 12:43:04 servername nagios[6270]: Caught SIGTERM, shutting down...
Sep 22 12:43:04 servername nagios[6623]: Caught SIGTERM, shutting down...
Sep 22 12:43:04 servername nagios[6270]: Successfully shutdown... (PID=6270)
Sep 22 12:43:04 servername nagios[6270]: ndomod: Shutdown complete.
Sep 22 12:43:04 servername nagios[6270]: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Sep 22 12:43:04 servername systemd[1]: Stopped Nagios Core 4.4.6.
Sep 22 12:43:04 servername systemd[1]: Starting Nagios Core 4.4.6...
Sep 22 12:43:04 servername nagios[22363]: Nagios Core 4.4.6
Sep 22 12:43:04 servername nagios[22363]: Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Sep 22 12:43:04 servername nagios[22363]: Copyright (c) 1999-2009 Ethan Galstad
Sep 22 12:43:04 servername nagios[22363]: Last Modified: 2020-04-28
Sep 22 12:43:04 servername nagios[22363]: License: GPL
Sep 22 12:43:04 servername nagios[22363]: Website: https://www.nagios.org
Sep 22 12:43:04 servername nagios[22363]: Reading configuration data...
Sep 22 12:43:04 servername nagios[22363]: Read main config file okay...


Sep 22 12:47:15 servername nagios[24331]: Checked 822 hosts.
Sep 22 12:47:15 servername nagios[24331]: Checked 33 host groups.
Sep 22 12:47:15 servername nagios[24331]: Checked 25 service groups.
Sep 22 12:47:15 servername nagios[24331]: Checked 40 contacts.
Sep 22 12:47:15 servername nagios[24331]: Checked 60 contact groups.
Sep 22 12:47:15 servername nagios[24331]: Checked 151 commands.
Sep 22 12:47:15 servername nagios[24331]: Checked 48 time periods.
Sep 22 12:47:15 servername nagios[24331]: Checked 0 host escalations.
Sep 22 12:47:15 servername nagios[24331]: Checked 0 service escalations.
Sep 22 12:47:15 servername nagios[24331]: Checking for circular paths...
Sep 22 12:47:15 servername nagios[24331]: Checked 822 hosts
Sep 22 12:47:15 servername nagios[24331]: Checked 0 service dependencies
Sep 22 12:47:15 servername nagios[24331]: Checked 0 host dependencies
Sep 22 12:47:15 servername nagios[24331]: Checked 48 timeperiods
Sep 22 12:47:15 servername nagios[24331]: Checking global event handlers...
Sep 22 12:47:15 servername nagios[24331]: Checking obsessive compulsive processor commands...
Sep 22 12:47:15 servername nagios[24331]: Checking misc settings...
Sep 22 12:47:15 servername nagios[24331]: Total Warnings: 440
Sep 22 12:47:15 servername nagios[24331]: Total Errors: 0
Sep 22 12:47:15 servername nagios[24331]: Things look okay - No serious problems were detected during the pre-flight check
Sep 22 12:47:17 servername systemd[1]: Unit nagios.service entered failed state.
Sep 22 12:47:17 servername systemd[1]: nagios.service failed.
Sep 22 12:49:36 servername systemd[1]: Starting Nagios Core 4.4.6...
Sep 22 12:49:36 servername nagios[25308]: Nagios Core 4.4.6
Sep 22 12:49:36 servername nagios[25308]: Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Sep 22 12:49:36 servername nagios[25308]: Copyright (c) 1999-2009 Ethan Galstad
Sep 22 12:49:36 servername nagios[25308]: Last Modified: 2020-04-28
Sep 22 12:49:36 servername nagios[25308]: License: GPL
Sep 22 12:49:36 servername nagios[25308]: Website: https://www.nagios.org
Sep 22 12:49:36 servername nagios[25308]: Reading configuration data...
Sep 22 12:49:36 servername nagios[25308]: Read main config file okay...
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Nagios XI - upg v5.6.14 to 5.8.6 issues

Post by pbroste »

Hello @Interrex

Thanks for reaching out. It sounds like there is something wonky going on with the Core config, so let's start by going through the database repair and then a Core reindex.

Reindex the Core Configuration Manager (CCM) configs:
  • rm -rf /usr/local/nagios/etc/import/*
  • 1: Terminal command to list all running /bin/nagios processes -> ps -aux | grep -E '/bin/nagios'
  • 2: Terminal command -> killall -9 nagios (or pkill nagios)
  • 3: Terminal command: check that the /bin/nagios processes are stopped
  • 4: Restart nagios.service by terminal command: systemctl restart nagios
  • 5: Head over to the Nagios XI web console ==> Core Configuration Manager (CCM) ==> Config File Management ==> [Delete Files] ==> [Write Files] ==> [Verify Files]
  • 6: Core Configuration Manager (CCM) ==> Under Quick Tools ==> "Apply Configuration"
  • 7: Restart nagios.service by terminal command:

    systemctl restart nagios

Verify that the hosts and services look good and that there are no errors by running the pre-flight check:

    /usr/local/nagios/bin/nagios -vvv /usr/local/nagios/etc/nagios.cfg
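
Steps 1 to 4 above can be sketched as a small script. This is only an illustration: it swaps `ps -aux | grep` for `pgrep -fl`, and the helper names `run` and `restart_core` and the `DRY_RUN` switch are made up here. By default it only prints the commands; set `DRY_RUN=0` and run it as root to execute them.

```shell
#!/usr/bin/env bash
# Sketch of steps 1-4: list Core processes, stop them, confirm, restart the service.
# Dry run by default, so nothing is killed or restarted unless DRY_RUN=0 is set.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"      # show the command instead of running it
    else
        "$@"
    fi
}

restart_core() {
    run pgrep -fl /bin/nagios      # step 1: list running /bin/nagios processes
    run pkill nagios               # step 2: stop any leftover processes
    run pgrep -fl /bin/nagios      # step 3: should now list nothing
    run systemctl restart nagios   # step 4: start the service again
}

restart_core
```

Steps 5 and 6 are done in the web console, so they are not covered here.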

If issues persist please PM your updated system profile for us to review.

To send us your system profile:
  • Login to the Nagios XI GUI using a web browser.
  • Click the "Admin" > "System Profile" Menu
  • Click the "Download Profile" button
  • Save the profile.zip file and send via Private Message
Thanks,
Perry
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Nagios XI - upg v5.6.14 to 5.8.6 issues

Post by pbroste »

Hello @Interrex

To follow up please run this as well:

    /usr/local/nagiosxi/scripts/reconfigure_nagios.sh

Thanks,
Perry
Interrex
Posts: 68
Joined: Thu May 19, 2016 8:42 am

Re: Nagios XI - upg v5.6.14 to 5.8.6 issues

Post by Interrex »

Hi Perry.

Agreed, something went really wrong somewhere after the upgrade, when I applied the config...
I'm still stuck with the same result after following your tips.

I also tried to send you my profile file, but it failed with this error:

PROFILE BUILD FAILED
Array
(
)
CODE: 1

Is there any other way to create or collect the files you need?
I PM'd you a text file with details from View System Info, in case you can use that.

And here are the results from following your tips.

/usr/local/nagiosxi/scripts/repair_databases.sh

The repair ran successfully:
nagios database repair succeeded
nagiosql database repair succeeded
nagiosxi database repair succeeded

Reindex the Core Configuration Manager (CCM) configs

rm -rf /usr/local/nagios/etc/import/*

There were 0 files under /usr/local/nagios/etc/import/

1: Terminal command list all running /bin/nagios -> ps -aux | grep -E '/bin/nagios'

0 processes running


2: Terminal command -> killall -9 nagios (or pkill nagios)

Did a pkill nagios just in case, killall did not work.


3: Terminal command check to see if /bin/nagios processes are stopped

0 processes running

4: Restart nagios.service by terminal command: systemctl restart nagios

Sep 23 23:07:21 servername systemd[1]: nagios.service: main process exited, code=exited, status=1/FAILURE
Sep 23 23:07:23 servername systemd[1]: Unit nagios.service entered failed state.
Sep 23 23:07:23 servername systemd[1]: nagios.service failed.

5: Head over to the Nagios XI web console ==> Core Configuration Manager (CCM) ==> Config File Management ==> [Delete Files] ==> [Write Files] ==> [Verify Files]

Did all steps, all OK

6: Core Configuration Manager (CCM) ==> Under Quick Tools ==> "Apply Configuration"

Configuration Applied.

7: Restart nagios.service by terminal command: systemctl restart nagios


root@servername:/tmp/rpms/nagiosxi $ systemctl restart nagios
root@servername:/tmp/rpms/nagiosxi $ systemctl status nagios
● nagios.service - Nagios Core 4.4.6
Loaded: loaded (/usr/lib/systemd/system/nagios.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2021-09-23 23:07:23 CEST; 6s ago
Docs: https://www.nagios.org/documentation
Process: 12565 ExecStopPost=/bin/rm -f /usr/local/nagios/var/rw/nagios.cmd (code=exited, status=0/SUCCESS)
Process: 23703 ExecStop=/bin/kill -s TERM ${MAINPID} (code=exited, status=1/FAILURE)
Process: 12503 ExecStart=/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Process: 12500 ExecStartPre=/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Main PID: 12505 (code=exited, status=1/FAILURE)

Sep 23 23:07:21 servername nagios[12505]: wproc: Successfully registered manager as @wproc with query handler
Sep 23 23:07:21 servername nagios[12505]: wproc: Registry request: name=Core Worker 12506;pid=12506
Sep 23 23:07:21 servername nagios[12505]: wproc: Registry request: name=Core Worker 12507;pid=12507
Sep 23 23:07:21 servername nagios[12505]: wproc: Registry request: name=Core Worker 12508;pid=12508
Sep 23 23:07:21 servername nagios[12505]: wproc: Registry request: name=Core Worker 12509;pid=12509
Sep 23 23:07:21 servername nagios[12505]: wproc: Registry request: name=Core Worker 12510;pid=12510
Sep 23 23:07:21 servername nagios[12505]: wproc: Registry request: name=Core Worker 12511;pid=12511
Sep 23 23:07:21 servername systemd[1]: nagios.service: main process exited, code=exited, status=1/FAILURE
Sep 23 23:07:23 servername systemd[1]: Unit nagios.service entered failed state.
Sep 23 23:07:23 servername systemd[1]: nagios.service failed.


verify that the host and services look good and verify that there are no errors by running the pre-flight check:

Checked 822 hosts.
Checked 33 host groups.
Checked 25 service groups.
Checked 40 contacts.
Checked 60 contact groups.
Checked 151 commands.
Checked 48 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 822 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 48 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 777
Total Errors: 0

Things look okay - No serious problems were detected during the pre-flight check
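
Note that a clean pre-flight check does not mean the daemon will start. In this thread the `-v` run in ExecStartPre exits with status 0 while the main process still exits with status 1, and the warning count has gone from 440 to 777. Below is a small sketch for pulling the totals out of a verification run so they can be compared between attempts; the helper name `preflight_totals` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read `nagios -v` output on stdin and print the warning and error totals.
# Also works on journal lines, because only the last ":"-separated field is used.

preflight_totals() {
    awk -F': *' '
        BEGIN { w = "?"; e = "?" }
        /Total Warnings:/ { w = $NF + 0 }
        /Total Errors:/   { e = $NF + 0 }
        END { printf "warnings=%s errors=%s\n", w, e }
    '
}

# Only runs on a host where Nagios Core is installed at the path used in this thread.
if [ -x /usr/local/nagios/bin/nagios ]; then
    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_totals
fi
```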

Output from: /usr/local/nagiosxi/scripts/reconfigure_nagios.sh

--- reset_config_perms.sh ------------
> Setting script permissions
> Setting CCM script permissions
> Setting special script permissions
> Setting special component script permissions
> Setting migrate permissions
> Setting configuration file/directory permissions
> Setting perfdata directory and RRD permissions
> Setting libexec directory permissions
> Setting Nagios XI config permissions
> Setting NOM checkpoint user:group permissions
> + Setting Nagios Core corelog.newobjects user:group permissions
> + Setting CCM configuration file user:group permissions
> + Setting Recurring Downtime file user:group permissions
> + Setting BPI configuration file user:group permissions
--------------------------------------

--- ccm_import.php -------------------
> Setting import directory: /usr/local/nagios/etc/import/
> Importing config files into the CCM
No files to import
--------------------------------------

--- ccm_export.php -------------------
> Writing CCM configuration to Nagios files
Finished writing out configuraton
--------------------------------------

--------------------------------------
> Verifying configuration with Nagios Core
> Output:
Nagios Core 4.4.6
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2020-04-28
License: GPL

Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Warning: Duplicate definition found for service 'SERVICE' on host 'HOST1' (config file '/usr/local/nagios/etc/services/HOST1.cfg', starting on line 16)

Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
Warning: Service 'Agent Versjon' on host 'HOST2' has no default contacts or contactgroups defined!

Checked 822 hosts.
Checked 33 host groups.
Checked 25 service groups.
Checked 40 contacts.
Checked 60 contact groups.
Checked 151 commands.
Checked 48 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 822 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 48 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 440
Total Errors: 0

Things look okay - No serious problems were detected during the pre-flight check
> Return Code: 0
--------------------------------------

brgds
Interrex
Interrex
Posts: 68
Joined: Thu May 19, 2016 8:42 am

Re: Nagios XI - upg v5.6.14 to 5.8.6 issues

Post by Interrex »

Hi.

Still stuck on the same problem.

Are there any file or folder permissions I should check?

I see this in the log:
[1632307384] Error: Could not load module '/usr/local/nagios/bin/ndomod.o' -> /usr/local/nagios/bin/ndomod.o: cannot open shared object file: No such file or directory
[1632307384] Error: Failed to load module '/usr/local/nagios/bin/ndomod.o'.
[1632307384] Error: Module loading failed. Aborting.
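
The bracketed number at the start of each log line is a Unix epoch timestamp; 1632307384 is 2021-09-22 10:43:04 UTC. A minimal sketch for making such lines readable, assuming a `date` that accepts `-d @SECONDS` (GNU coreutils does); the function name humanize_nagios_log is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: rewrite "[epoch] message" lines with a UTC timestamp.
# Lines that do not match, or whose epoch cannot be converted, pass through.
humanize_nagios_log() {
    while IFS= read -r line; do
        case $line in
            "["[0-9]*"] "*)
                epoch=${line#"["}          # drop the leading "["
                epoch=${epoch%%"]"*}       # keep the text before the first "]"
                rest=${line#*"] "}         # message after "] "
                if stamp=$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S' 2>/dev/null); then
                    printf '[%s UTC] %s\n' "$stamp" "$rest"
                else
                    printf '%s\n' "$line"
                fi
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

For example, `grep 'Error:' <logfile> | humanize_nagios_log`, where <logfile> stands for whichever Nagios log the lines above came from.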

The relevant line in /usr/local/nagios/etc/nagios.cfg:

Code:

broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg

Code:

ls -ll /usr/local/nagios/bin/
total 2028
-rwxrwxr-- 1 nagios nagios 713168 Sep  8 23:45 nagios
-rwxrwxr-- 1 nagios nagios  39640 Sep  8 23:45 nagiostats
-rwxr-xr-x 1 root   root   890904 Sep  8 23:46 ndo.so
-rwxr-xr-x 1 root   root     1083 Sep  8 23:46 ndo-startup-hash.sh
-rwxr-xr-- 1 nagios nagios  27560 Sep  8 23:46 npcd
-rwxr-xr-- 1 nagios nagios  14696 Sep  8 23:46 npcdmod.o
-rwxr-xr-x 1 root   root   231368 Sep  8 23:47 nrpe
-rwxr-xr-x 1 root   root    10663 Sep  8 23:47 nrpe-uninstall
-rwxr-xr-x 1 root   root   129272 Sep  8 23:47 nsca

I cannot find any files related to ndomod.
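
The directory listing contains ndo.so but no ndomod.o, while nagios.cfg still points broker_module at ndomod.o. A quick way to confirm that kind of mismatch is to check every broker_module path in the config against the filesystem. A minimal sketch, assuming one broker_module directive per line; the function name check_broker_modules is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: for each broker_module line in the given nagios.cfg,
# print "OK <path>" if the module file exists, otherwise "MISSING <path>".
check_broker_modules() {
    cfg=$1
    # Keep only the value after "broker_module="; commented lines do not match.
    sed -n 's/^[[:space:]]*broker_module=[[:space:]]*//p' "$cfg" |
    while read -r module _args; do
        # The first word is the module path; the rest are module arguments.
        if [ -e "$module" ]; then
            printf 'OK %s\n' "$module"
        else
            printf 'MISSING %s\n' "$module"
        fi
    done
}
```

Run as `check_broker_modules /usr/local/nagios/etc/nagios.cfg`; with the configuration shown above it would be expected to flag /usr/local/nagios/bin/ndomod.o as missing.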

- interrex
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Nagios XI - upg v5.6.14 to 5.8.6 issues

Post by pbroste »

Hello @Interrex

These appear to be NDO errors, so let's run the NDO upgrade:

Code:

wget http://assets.nagios.com/downloads/nagiosxi/xi-latest.tar.gz   #get latest version
tar zxf xi-latest.tar.gz      #extracting
cd nagiosxi/subcomponents/ndo   #go to ndo directory inside installer
./upgrade   #run upgrade

If the component is missing, run ./install.

Then restart nagios.service:

Code:

systemctl restart nagios.service

Let me know the results,
Perry
Interrex
Posts: 68
Joined: Thu May 19, 2016 8:42 am

Re: Nagios XI - upg v5.6.14 to 5.8.6 issues

Post by Interrex »

Hi.

Keep in mind that my system is in an offline environment with no internet connection.

I downloaded the xi-latest tarball elsewhere and copied it over.

I ran the upgrade and install scripts as root:

Code:

:/tmp/rpms/nagiosxi/subcomponents/ndo $ ./upgrade
UPGRADE: NDO is being upgraded...
checking for gcc... no
checking for cc... no
checking for cl.exe... no
configure: error: in `/tmp/rpms/nagiosxi/subcomponents/ndo/ndo-3.0.7':
configure: error: no acceptable C compiler found in $PATH
See `config.log' for more details

Code:

/tmp/rpms/nagiosxi/subcomponents/ndo $ ./install
INSTALL: NDO is being installed...
checking for gcc... no
checking for cc... no
checking for cl.exe... no
configure: error: in `/tmp/rpms/nagiosxi/subcomponents/ndo/ndo-3.0.7':
configure: error: no acceptable C compiler found in $PATH
See `config.log' for more details

I have PM'd you my config.log for NDO and the xi-upgrade.log from my upgrade to 5.8.6.
I also managed to generate the profile.zip via the command line; it is attached in the PM.
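
Both scripts stop at the same point: configure finds no C compiler on the PATH, so NDO is never built. A pre-check before running ./upgrade makes that failure explicit. A minimal sketch that tries the gcc and cc names from the configure output; the function name find_c_compiler is hypothetical, and installing a compiler on an offline host is not covered here:

```shell
#!/bin/sh
# Hypothetical helper: print the first C compiler found on PATH (gcc, then cc).
# Returns 1 and writes a message to stderr if neither is available.
find_c_compiler() {
    for candidate in gcc cc; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    echo "no C compiler found in PATH" >&2
    return 1
}
```

For example, `find_c_compiler && ./upgrade` runs the upgrade only when a compiler is present.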