Nagios suddenly refuses to start

Mortus
Posts: 27
Joined: Tue Nov 15, 2016 10:34 am

Re: Nagios suddenly refuses to start

Post by Mortus »

The command gives no output on either box.

As for files, as far as I know the only thing not on the prod box is scite. But I'll have to double-check.
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Nagios suddenly refuses to start

Post by dwhitfield »

The lack of anything from rpm suggests it was compiled, but that init script is non-standard. It's possible it's a different package name. What's the output of yum repolist?

I think cloning prod back to dev is probably the easiest thing to do at this point. I can send you the default init script, and you can try that, but with all those changes, I really doubt that it's going to work in prod.

Please let us know how you'd like to proceed.

Post by Mortus »

Code:

[root@devnagios ~]# yum repolist
base                                                                                                                      | 1.1 kB     00:00
extras                                                                                                                    | 2.1 kB     00:00
updates                                                                                                                   | 1.9 kB     00:00
updates/primary_db                                                                                                        | 916 kB     00:00
http://mirror.centos.org/centos/5/updates/x86_64/repodata/primary.sqlite.bz2: [Errno -1] Metadata file does not match checksum
Trying other mirror.
repo id                                                         repo name                                                                  status
base                                                            CentOS-5 - Base                                                            3,667
extras                                                          CentOS-5 - Extras                                                            266
updates                                                         CentOS-5 - Updates                                                           875
repolist: 4,808

Here's an interesting thing: I rolled back the dev server to a snapshot taken before I upgraded (Nov. 10th). It's back on Nagios 3.5.1, but it exhibits the same issue. The problem definitely did not exist that far back.

The problem with cloning the prod server is that the dev server is configured slightly differently, and I'm not sure exactly how. I'd have to track down any configs that differ and restore them afterwards.

Post by dwhitfield »

Could you explain the login process a bit more, and how you are using Scite? We use Putty here. It's possible there's some sort of script running from the client side.

Do you have physical access to this machine running dev? What happens if you log in directly to the machine?

Post by Mortus »

I currently use MobaXTerm; prior to this I was using Putty. Scite is the text editor that the person who set this up installed on the dev server as a way to edit config files without having to manipulate them in vi. It opens the config file I request in scite, I make my changes, and then I save it (directly to the server). Then I push it up to the git repository and run an alias that syncs all changes to the prod server and restarts the Nagios services on it.

I do have access to the VM directly and logging in on it has the same outputs and effects as using SSH.

Post by dwhitfield »

What's the output of find / -name nagios? Can you PM me all the files that show up? Since it's a non-default setup, I don't know what we'll see. If you want to scrub them and then post them in the forum, that will likely lead to faster resolution (but, of course, more work on your part).

I suppose you might as well go ahead and take a look at find / -name scite. If anything shows up when searching for scite, please rm -rf it. If you don't want to use vi, nano is a good replacement.
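
Before removing anything, it may help to review exactly what find matched. A minimal sketch, assuming a POSIX shell; the function name list_matches is hypothetical and not part of any Nagios tooling:

```shell
# Hypothetical helper: list every path under ROOT whose basename is NAME,
# one per line and sorted, so the matches can be reviewed before deletion.
list_matches() {
    root=$1
    name=$2
    # 2>/dev/null hides "Permission denied" noise from unreadable directories.
    find "$root" -name "$name" 2>/dev/null | sort
}

# Example (not run here): list_matches / scite
```

Only once the list looks right would each path be passed to rm -rf by hand.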

Post by Mortus »

Code:

[root@devnagios ~]# find / -name nagios
/etc/rc.d/init.d/nagios
/var/lock/subsys/nagios
/var/spool/cron/nagios
/var/spool/mail/nagios
/var/run/sudo/nagios
/usr/local/nagios
/usr/local/nagios/bin/nagios
/home/nagios
/home/nagios/check_mk/check_mk-1.2.0p4/livestatus.src/nagios
/home/nagios/nagios
/home/nagios/nagios/nagios-4.2.2/base/nagios
/home/nagios/nagios/nagios
/home/nagios/nagios/nagios/base/nagios
/dev/shm/nagios

Post by dwhitfield »

That find output is very non-standard, and we do not support check_mk.

That said, can you post the output of tail -100 /usr/local/nagios/var/nagios.log?

Also, https://assets.nagios.com/downloads/nag ... ptions.pdf may be of use for your own troubleshooting.
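
Since nagios.log is mostly host and service alerts, filtering it down to the error lines can make the tail output easier to read. A minimal sketch, assuming GNU date (for the -d @SECONDS form) and the bracketed epoch prefix that Nagios writes at the start of each log line; the function name nagios_errors is hypothetical:

```shell
# Hypothetical helper: print only the "Error:" lines from a Nagios log,
# replacing the leading [epoch] stamp with a human-readable UTC time.
# Assumes GNU date, which accepts -d @SECONDS.
nagios_errors() {
    logfile=$1
    grep -E '^\[[0-9]+\] Error:' "$logfile" | while IFS= read -r line; do
        epoch=${line#\[}        # drop the leading "["
        epoch=${epoch%%\]*}     # keep only the digits before the first "]"
        msg=${line#*\] }        # message text after the "] " separator
        stamp=$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')
        printf '%s UTC  %s\n' "$stamp" "$msg"
    done
}

# Example (not run here): nagios_errors /usr/local/nagios/var/nagios.log
```

The output is in UTC rather than the server's local time zone, so it will be offset from any "Local time is ..." line in the same log.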

Post by Mortus »

I trimmed off the service and host alerts from the output:

Code:

[1480615079] Warning: use_embedded_perl_implicitly is deprecated and will be removed.
[1480615079] Warning: enable_embedded_perl is deprecated and will be removed.
[1480615079] Warning: p1_file is deprecated and will be removed.
[1480615079] Warning: sleep_time is deprecated and will be removed.
[1480615079] Warning: external_command_buffer_slots is deprecated and will be removed. All commands are always processed upon arrival
[1480615079] Warning: command_check_interval is deprecated and will be removed. Commands are always handled on arrival
[1480615079] Nagios 4.2.2 starting... (PID=4979)
[1480615079] Local time is Thu Dec 01 12:57:59 EST 2016
[1480615079] LOG VERSION: 2.0
[1480615079] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1480615079] qh: core query handler registered
[1480615079] nerd: Channel hostchecks registered successfully
[1480615079] nerd: Channel servicechecks registered successfully
[1480615079] nerd: Channel opathchecks registered successfully
[1480615079] nerd: Fully initialized and ready to rock!
[1480615079] wproc: Successfully registered manager as @wproc with query handler
[1480615079] wproc: Registry request: name=Core Worker 4981;pid=4981
[1480615079] wproc: Registry request: name=Core Worker 4982;pid=4982
[1480615079] wproc: Registry request: name=Core Worker 4983;pid=4983
[1480615079] wproc: Registry request: name=Core Worker 4985;pid=4985
[1480615079] wproc: Registry request: name=Core Worker 4984;pid=4984
[1480615079] wproc: Registry request: name=Core Worker 4986;pid=4986
[1480615079] Error: Could not load module '/usr/lib/check_mk/livestatus.o' -> /usr/lib/check_mk/livestatus.o: undefined symbol: last_command_check
[1480615079] Error: Failed to load module '/usr/lib/check_mk/livestatus.o'.
[1480615079] Error: Module loading failed. Aborting.
[1480615231] Warning: use_embedded_perl_implicitly is deprecated and will be removed.
[1480615231] Warning: enable_embedded_perl is deprecated and will be removed.
[1480615231] Warning: p1_file is deprecated and will be removed.
[1480615231] Warning: sleep_time is deprecated and will be removed.
[1480615231] Warning: external_command_buffer_slots is deprecated and will be removed. All commands are always processed upon arrival
[1480615231] Warning: command_check_interval is deprecated and will be removed. Commands are always handled on arrival
[1480615231] Nagios 4.2.2 starting... (PID=5221)
[1480615231] Local time is Thu Dec 01 13:00:31 EST 2016
[1480615231] LOG VERSION: 2.0
[1480615231] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1480615231] qh: core query handler registered
[1480615231] nerd: Channel hostchecks registered successfully
[1480615231] nerd: Channel servicechecks registered successfully
[1480615231] nerd: Channel opathchecks registered successfully
[1480615231] nerd: Fully initialized and ready to rock!
[1480615231] wproc: Successfully registered manager as @wproc with query handler
[1480615231] wproc: Registry request: name=Core Worker 5225;pid=5225
[1480615231] wproc: Registry request: name=Core Worker 5222;pid=5222
[1480615231] wproc: Registry request: name=Core Worker 5226;pid=5226
[1480615231] wproc: Registry request: name=Core Worker 5223;pid=5223
[1480615231] wproc: Registry request: name=Core Worker 5227;pid=5227
[1480615231] wproc: Registry request: name=Core Worker 5228;pid=5228
[1480615231] Error: Could not load module '/usr/lib/check_mk/livestatus.o' -> /usr/lib/check_mk/livestatus.o: undefined symbol: last_command_check
[1480615231] Error: Failed to load module '/usr/lib/check_mk/livestatus.o'.
[1480615231] Error: Module loading failed. Aborting.

Post by dwhitfield »

Really sorry, but check_mk is not our project and we can't provide support for it. Any help we would give would basically just be googling and reading their forums - you would most likely have better luck asking them directly.

I did try to find uninstall instructions so we could get you back on track. You'll either need to uninstall on your own and submit a new log, or see if you can get help from them.

mod_gearman, though also not our product, is where we point people for process offloading.
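
For anyone hitting the same "Module loading failed. Aborting." message: Nagios Core loads event broker modules from broker_module lines in its main config file, so listing those lines shows which module is being requested at startup. A minimal sketch; the function name list_broker_modules is hypothetical, and the path in the usage comment assumes a default source install under /usr/local/nagios, which may not match this setup:

```shell
# Hypothetical helper: show active (uncommented) broker_module directives
# in a Nagios main config file, prefixed with their line numbers.
list_broker_modules() {
    cfg=$1
    grep -nE '^[[:space:]]*broker_module[[:space:]]*=' "$cfg"
}

# Example (path is an assumption): list_broker_modules /usr/local/nagios/etc/nagios.cfg
```

Commenting out the line that names livestatus.o and starting Nagios again would show whether that module is the only thing blocking startup. Treat that as a diagnostic step, not as supported guidance.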