XI upgrade failed

Posted: Tue Jan 03, 2017 3:23 pm
by tboyer
RHEL6.8 XI upgrade.

Upgraded from 5.2 to 5.4 this afternoon, and it looks like it's not starting. No error messages in upgrade; no error messages in nagios.log - service nagios start returns nothing.

Code:

[root@boy-adams ~]# service nagios start
Starting nagios: done.
[root@boy-adams ~]# service nagios status
nagios is not running
[root@boy-adams ~]# 
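The session above shows the init script printing "done." even though the daemon is gone moments later, so that message alone does not prove Nagios stayed up. A minimal sketch of an independent check, assuming the daemon records its PID in a lock file (the helper name is hypothetical, and the real path is whatever `lock_file` points to in nagios.cfg):

```shell
# Return 0 if the PID stored in the given file belongs to a live process.
# "pid_alive" is a hypothetical helper name, not part of Nagios.
# Run it as root or as the process owner; otherwise kill -0 can fail
# with a permission error even though the process exists.
pid_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1          # no PID file: treat as not running
    pid=$(head -n 1 "$pidfile" | tr -cd '0-9')
    [ -n "$pid" ] || return 1              # empty or non-numeric PID file
    kill -0 "$pid" 2>/dev/null             # signal 0 only tests for existence
}
```

For example, `pid_alive /usr/local/nagios/var/nagios.lock && echo running || echo "not running"` (the lock file path here is an assumption). A PID file that names a dead process suggests the daemon started and then exited.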

I noticed that there were no nagios processes running, so, as user nagios, I tried starting it in non-daemon mode:

Code:

[root@boy-adams var]# su - nagios
[nagios@boy-adams ~]$ /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.2.4
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 12-07-2016
License: GPL

Website: https://www.nagios.org
Nagios 4.2.4 starting... (PID=19309)
Local time is Tue Jan 03 15:16:33 EST 2017
nerd: Channel hostchecks registered successfully
nerd: Channel servicechecks registered successfully
nerd: Channel opathchecks registered successfully
nerd: Fully initialized and ready to rock!
wproc: Successfully registered manager as @wproc with query handler
wproc: Registry request: name=Core Worker 19310;pid=19310
wproc: Registry request: name=Core Worker 19311;pid=19311
wproc: Registry request: name=Core Worker 19312;pid=19312
wproc: Registry request: name=Core Worker 19313;pid=19313
wproc: Registry request: name=Core Worker 19314;pid=19314
wproc: Registry request: name=Core Worker 19318;pid=19318
wproc: Registry request: name=Core Worker 19315;pid=19315
wproc: Registry request: name=Core Worker 19317;pid=19317
wproc: Registry request: name=Core Worker 19316;pid=19316
Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
WARNING: Extinfo objects are deprecated and will be removed in future versions
Segmentation fault (core dumped)
[nagios@boy-adams ~]$

_Something_ is segfaulting, but I've got no idea what, or how to find out. upgrade.log is attached, and any pointers are appreciated.
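One common way to find out what is segfaulting is to load the core file into gdb and print a backtrace. A minimal sketch, assuming gdb is installed and that a core file was actually written somewhere readable (where it ends up depends on the system's core-dump settings; the helper name is hypothetical):

```shell
# Print a backtrace for a crashed binary from its core file.
# "core_backtrace" is a hypothetical helper name.
core_backtrace() {
    binary=$1
    core=$2
    if [ ! -r "$core" ]; then
        echo "no readable core file: $core" >&2
        return 1
    fi
    # -batch runs the given commands and exits; "bt" prints the call stack.
    gdb -batch -ex bt "$binary" "$core"
}
```

It would be called as `core_backtrace /usr/local/nagios/bin/nagios /path/to/core`, where the core path is a placeholder. The top frames often hint at whether the crash is in the nagios binary itself or in a loaded module such as ndomod.o.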

Re: XI upgrade failed

Posted: Tue Jan 03, 2017 3:30 pm
by avandemore
What is the output from:

Code:

$ /usr/local/nagios/bin/nagios -vvv /usr/local/nagios/etc/nagios.cfg

Re: XI upgrade failed

Posted: Tue Jan 03, 2017 3:33 pm
by tboyer
<snip lots of deprecated>

Warning: Host 'localhost' has no services associated with it!
Checked 823 hosts.
Checked 213 host groups.
Checked 16 service groups.
Checked 57 contacts.
Checked 25 contact groups.
Checked 312 commands.
Checked 66 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 823 hosts
Checked 2022 service dependencies
Checked 0 host dependencies
Checked 66 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 30
Total Errors: 0

Things look okay - No serious problems were detected during the pre-flight check
[root@boy-adams /]#

Re: XI upgrade failed

Posted: Tue Jan 03, 2017 3:59 pm
by tgriep
Can you try starting the nagios process the normal way:

Code:

service nagios start

Then check the following log files to see if there are more details on why it is not starting:

Code:

/var/log/messages
/usr/local/nagios/var/nagios.log

Can you also post your /usr/local/nagios/etc/nagios.cfg file so we can view it?

Re: XI upgrade failed

Posted: Tue Jan 03, 2017 4:04 pm
by tboyer
Nothing that I can see:

[root@boy-adams ~]# service nagios start
Starting nagios: done.

Messages:

Jan 3 16:02:05 boy-adams ndo2db: Successfully connected to MySQL database
Jan 3 16:02:05 boy-adams ndo2db: Successfully connected to MySQL database
Jan 3 16:02:05 boy-adams snmptt-sys[19709]: SNMPTT v1.4beta2 shutdown
Jan 3 16:02:05 boy-adams snmptt-sys[19709]: Total traps received=0,Total traps translated=0,Total traps ignored=0,Total unknown traps=0
Jan 3 16:02:05 boy-adams ndo2db: Trimming timedevents.
Jan 3 16:02:05 boy-adams ndo2db: Trimming systemcommands.
Jan 3 16:02:05 boy-adams ndo2db: Trimming servicechecks.
Jan 3 16:02:05 boy-adams ndo2db: Trimming hostchecks.
Jan 3 16:02:05 boy-adams ndo2db: Trimming eventhandlers.
Jan 3 16:02:06 boy-adams snmptt-sys[643]: SNMPTT v1.4beta2 started
Jan 3 16:02:06 boy-adams snmptt-sys[643]: Loading /etc/snmp/snmptt.conf
Jan 3 16:02:06 boy-adams snmptt-sys[643]: Finished loading 1868 lines from /etc/snmp/snmptt.conf
Jan 3 16:02:06 boy-adams snmptt-sys[647]: Changing to UID: snmptt (494)
Jan 3 16:02:07 boy-adams ndo2db: Successfully disconnected from MySQL database

Jan 3 16:02:05 boy-adams nagios: Nagios 4.2.4 starting... (PID=604)
Jan 3 16:02:05 boy-adams nagios: Local time is Tue Jan 03 16:02:05 EST 2017
Jan 3 16:02:05 boy-adams nagios: LOG VERSION: 2.0
Jan 3 16:02:05 boy-adams nagios: qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
Jan 3 16:02:05 boy-adams nagios: qh: core query handler registered
Jan 3 16:02:05 boy-adams nagios: nerd: Channel hostchecks registered successfully
Jan 3 16:02:05 boy-adams nagios: nerd: Channel servicechecks registered successfully
Jan 3 16:02:05 boy-adams nagios: nerd: Channel opathchecks registered successfully
Jan 3 16:02:05 boy-adams nagios: nerd: Fully initialized and ready to rock!
Jan 3 16:02:05 boy-adams nagios: wproc: Successfully registered manager as @wproc with query handler
Jan 3 16:02:05 boy-adams nagios: wproc: Registry request: name=Core Worker 605;pid=605
Jan 3 16:02:05 boy-adams nagios: wproc: Registry request: name=Core Worker 606;pid=606
Jan 3 16:02:05 boy-adams nagios: wproc: Registry request: name=Core Worker 607;pid=607
Jan 3 16:02:05 boy-adams nagios: wproc: Registry request: name=Core Worker 609;pid=609
Jan 3 16:02:05 boy-adams nagios: wproc: Registry request: name=Core Worker 610;pid=610
Jan 3 16:02:05 boy-adams nagios: wproc: Registry request: name=Core Worker 613;pid=613
Jan 3 16:02:05 boy-adams nagios: wproc: Registry request: name=Core Worker 614;pid=614
Jan 3 16:02:05 boy-adams nagios: wproc: Registry request: name=Core Worker 611;pid=611
Jan 3 16:02:05 boy-adams nagios: wproc: Registry request: name=Core Worker 608;pid=608
Jan 3 16:02:05 boy-adams nagios: ndomod: NDOMOD 2.1.2 (11-14-2016) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jan 3 16:02:05 boy-adams nagios: ndomod: Successfully connected to data sink. 0 queued items to flush.
Jan 3 16:02:05 boy-adams nagios: ndomod registered for process data
Jan 3 16:02:05 boy-adams nagios: ndomod registered for log data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for system command data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for event handler data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for notification data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for comment data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for downtime data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for flapping data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for program status data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for host status data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for service status data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for adaptive program data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for adaptive host data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for adaptive service data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for external command data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for aggregated status data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for retention data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for contact data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for contact notification data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for acknowledgement data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for state change data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for contact status data'
Jan 3 16:02:05 boy-adams nagios: ndomod registered for adaptive contact data'
Jan 3 16:02:05 boy-adams nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions
Jan 3 16:02:05 boy-adams nagios: WARNING: Extinfo objects are deprecated and will be removed in future versions

Re: XI upgrade failed

Posted: Tue Jan 03, 2017 4:16 pm
by ssax
Please run through the upgrade again; it doesn't look like it completed successfully. You should see this at the bottom:

Code:

Nagios XI Upgrade Complete!
---------------------------


You can access the Nagios XI web interface by visiting:
    http://X.X.X.X/nagiosxi/

Please also send us any errors that you receive (some don't get put into the log file), along with the upgrade.log file again.


Thank you

Re: XI upgrade failed

Posted: Wed Jan 04, 2017 8:55 am
by tboyer
So first, script bug. Upgrade hit this:

chown: cannot access `/usr/local/nagios/etc/services/*.cfg': No such file or directory

and died. I 'fixed' with this:

[root@boy-adams nagiosxi]# mkdir -p /usr/local/nagios/etc/services/
[root@boy-adams nagiosxi]# touch /usr/local/nagios/etc/services/dummy.cfg


and re-ran the upgrade:

Things look okay - No serious problems were detected during the pre-flight check
RET: 0
Running configuration check...
Stopping nagios:/etc/init.d/nagios: line 143: kill: (22300) - No such process
done.
Starting nagios: done.
Fixing php-mcrypt bug...
Stopping httpd: [ OK ]
Starting httpd: [ OK ]
connect() timed out!

Nagios XI Upgrade Complete!
---------------------------


You can access the Nagios XI web interface by visiting:
http://172.27.17.133/nagiosxi/


However, same issue - it's immediately failing.

[root@boy-adams nagiosxi]# service nagios start
Starting nagios: done.
[root@boy-adams nagiosxi]# service nagios status
nagios is not running
[root@boy-adams nagiosxi]#

[root@boy-adams nagiosxi]# /etc/init.d/nagios restart
Running configuration check...
Stopping nagios:/etc/init.d/nagios: line 143: kill: (27047) - No such process
done.
Starting nagios: done.
[root@boy-adams nagiosxi]# /etc/init.d/nagios status
nagios is not running
[root@boy-adams nagiosxi]#



No errors in either messages or nagios.log.
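
The chown failure earlier in this post happens because the shell passes an unmatched `*.cfg` pattern through literally when the directory is missing or empty. A minimal sketch of a guard that skips the call when nothing matches, as an alternative to creating a dummy file; the helper name is hypothetical, and this is not a claim about how the upgrade script itself is written:

```shell
# Change ownership of every *.cfg file in a directory, doing nothing
# (and returning success) when the directory is missing or has no matches.
# "chown_cfgs" is a hypothetical helper name.
chown_cfgs() {
    owner=$1
    dir=$2
    for f in "$dir"/*.cfg; do
        # With no match the pattern stays literal, so test each path first.
        [ -e "$f" ] || continue
        chown "$owner" "$f" || return 1
    done
    return 0
}
```

A call might look like `chown_cfgs nagios:nagios /usr/local/nagios/etc/services`, where the owner and group are assumptions.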

Re: XI upgrade failed

Posted: Wed Jan 04, 2017 12:01 pm
by tboyer
Cheated, and it's working. Copied all of the files in /usr/local/nagios/bin back in. So working, but not healthy, I suspect.
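
To see which files in the live directory still match a saved copy after a restore like this, the two directories can be compared file by file. A minimal sketch, assuming a second copy of the directory exists somewhere (the helper name and any backup path are hypothetical):

```shell
# List files in the first directory that are missing from, or differ in,
# the second directory. "diff_bins" is a hypothetical helper name.
diff_bins() {
    a=$1
    b=$2
    for f in "$a"/*; do
        [ -f "$f" ] || continue            # skip subdirectories and non-files
        name=${f##*/}
        if [ ! -f "$b/$name" ]; then
            echo "only in $a: $name"
        elif ! cmp -s "$f" "$b/$name"; then
            echo "differs: $name"
        fi
    done
}
```

For example, `diff_bins /usr/local/nagios/bin /path/to/other/copy` prints one line per file that is missing or different; no output means every regular file in the first directory has an identical counterpart.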

Re: XI upgrade failed

Posted: Wed Jan 04, 2017 3:02 pm
by dwhitfield
Any modifications will be overwritten the next time you upgrade, but plenty of people mod their XI and just script redoing the changes each time they upgrade.

We can dig a little bit more now, or we can just leave this open in case something crops up that seems to be related.

If you want to go the digging route, can you PM me your Profile? You can download it by going to Admin > System Config > System Profile and clicking the Download Profile button towards the top. If for whatever reason you *cannot* download the profile, please put the output of View System Info (5.3.4+, Show Profile if older) in the thread (that will at least get us some info).

After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.

Re: XI upgrade failed

Posted: Thu Jan 05, 2017 6:18 pm
by tboyer
Thanks - for now, we'll just go with this. I'll be prepared when I do the next upgrade, though. :)