NagiosXI keeps crashing

Posted: Thu Mar 17, 2016 10:34 pm
by spcmidrange
Hey All

Long-time Nagios user, but this one has been driving me crazy. I was previously running Nagios XI 2012R2 (I believe), and recently the Nagios Core portion has been shutting down every night with "Caught SIGTERM", as you can see below:

Code:

Mar 17 21:00:21 srv-reg-nagxi-01 nagios: Caught SIGTERM, shutting down...
Mar 17 21:00:21 srv-reg-nagxi-01 nagios: Successfully shutdown... (PID=18889)
Mar 17 21:00:21 srv-reg-nagxi-01 nagios: Event broker module 'NERD' deinitialized successfully.
Mar 17 21:00:21 srv-reg-nagxi-01 nagios: ndomod: Shutdown complete.
Mar 17 21:00:21 srv-reg-nagxi-01 nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Mar 17 21:00:23 srv-reg-nagxi-01 nagios: Nagios 4.1.1 starting... (PID=28835)
Mar 17 21:00:23 srv-reg-nagxi-01 nagios: Local time is Thu Mar 17 21:00:23 CST 2016
Mar 17 21:00:23 srv-reg-nagxi-01 nagios: LOG VERSION: 2.0
Mar 17 21:00:23 srv-reg-nagxi-01 nagios: qh: Failed to register socket with io broker: Invalid argument; errno=22: Invalid argument
Mar 17 21:00:23 srv-reg-nagxi-01 nagios: Error: Failed to initialize query handler. Aborting

I upgraded to Nagios XI 5 and the issue hasn't gone away. I always have to go in and do an Apply Configuration to get it going again. I enabled debugging and collected the debug file as well as the Nagios log, but neither points to anything obvious to me.

Server stats:

Code:

[root@srv-reg-nagxi-01 var]# uname -a
Linux srv-reg-nagxi-01 2.6.32-220.17.1.el6.x86_64 #1 SMP Thu Apr 26 13:37:13 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@srv-reg-nagxi-01 var]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.6 (Santiago)

Manual XI install.

I checked /var/log/messages and dmesg, and nothing stands out as out of the ordinary around the times of the crash.

If you need any more info or files, please let me know!

Re: NagiosXI keeps crashing

Posted: Fri Mar 18, 2016 12:21 pm
by spcmidrange
Hi All

Looks like we have narrowed down the problem, but we just don't have a solution. Every night we run a backup via the NetBackup system. When a backup starts, it calls a script called bpstart_notify, which is just a wrapper for running scripts prior to the backup. In there we run /usr/local/nagiosxi/scripts/backup_xi.sh so that XI does its own backup before the entire OS is backed up. Since the update, the new backup_xi.sh script restarts Nagios, and on start it fails. The main error is:

Code:

Mar 17 21:00:23 srv-reg-nagxi-01 nagios: qh: Failed to register socket with io broker: Invalid argument; errno=22: Invalid argument
Mar 17 21:00:23 srv-reg-nagxi-01 nagios: Error: Failed to initialize query handler. Aborting

Running /usr/local/nagiosxi/scripts/backup_xi.sh via crontab or by hand has no issue restarting Nagios. Running it as root or as the nagios user isn't a problem either. But running it via bpstart_notify gives us the above error.
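
Since the same script behaves differently only when launched from bpstart_notify, one way to narrow this down is to record the execution environment in each case and compare the results. The snippet below is a minimal sketch, not part of Nagios XI or NetBackup: the function name `snapshot_env` and the output path under /tmp are made up for illustration. It captures the user, the resource limits (including the open-file limit), the shell's open file descriptors, and the environment variables. Whether any of these is the actual cause of the query handler error is an assumption to be checked, not a confirmed diagnosis.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Nagios XI or NetBackup):
# dump the environment this script runs in, so a run started by
# bpstart_notify can be compared with a cron or manual run.
snapshot_env() {
    out="$1"
    {
        echo "== date ==";  date
        echo "== user ==";  id
        echo "== ulimit -n (open file limit) =="; ulimit -n
        echo "== ulimit -a (all limits) ==";      ulimit -a
        echo "== open file descriptors of this shell =="
        ls -l "/proc/$$/fd" 2>/dev/null
        echo "== environment =="
        env | sort
    } > "$out" 2>&1
}

# Example call, e.g. near the top of backup_xi.sh or in bpstart_notify.
# The path is an assumption; any writable location works.
snapshot_env "/tmp/nagios_env.$(date +%s).$$.txt"
```

After one run from bpstart_notify and one from cron, running `diff` on the two files shows what differs between the two launch contexts (the date and file descriptor listings will always differ slightly).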

If you want to mark this as closed go ahead, as the problem is not a Nagios issue, but rather how the script is called via NetBackup. Our workaround right now is to remove the Nagios restart from the backup_xi.sh script and schedule it in crontab 1 minute before the backup. If you have any suggestions on how to get this working properly, I would be glad to hear them!

Cheers!

Re: NagiosXI keeps crashing

Posted: Fri Mar 18, 2016 12:33 pm
by lmiltchev
If you want to mark this as closed go ahead, as the problem is not a Nagios issue...
Done.