Nagios stuck, won't check devices in queue

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Nagios stuck, won't check devices in queue

Post by scottwilkerson »

cwscribner wrote: In theory, if I deleted ALL of the host configuration files and then did an apply configuration, the database would propagate all of the proper config files, right?
This should be correct for everything in the hosts / services sub-folders, but NOT the static sub-folder.
Former Nagios employee
cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Post by cwscribner »

Okay, I think we finally have a plan of attack.

- Make a copy of the hosts directory
- Create a .txt file listing the outstanding hosts to be removed
- Run the attached Perl script to remove the devices on the list
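
A minimal sketch of the first two steps, in Python rather than the attached Perl script (which is not reproduced here). It copies the hosts directory to a timestamped folder and reads the removal list, reporting which listed hosts have a config file. The function names are hypothetical, and the one-file-per-host `<host>.cfg` naming is an assumption; the sketch deletes nothing.

```python
import shutil
import time
from pathlib import Path


def snapshot_hosts_dir(hosts_dir, backup_root):
    """Copy hosts_dir into a new timestamped folder under backup_root."""
    dest = Path(backup_root) / ("hosts-" + time.strftime("%Y%m%d-%H%M%S"))
    shutil.copytree(hosts_dir, dest)  # raises if dest already exists
    return dest


def read_host_list(list_file):
    """Return host names from a text file, one per line, skipping blanks."""
    lines = Path(list_file).read_text().splitlines()
    return [ln.strip() for ln in lines if ln.strip()]


def split_by_config(hosts, hosts_dir):
    """Split hosts into those with and without a <host>.cfg file.

    Assumption: each host has its own file named <host>.cfg in hosts_dir.
    """
    found, missing = [], []
    for host in hosts:
        if (Path(hosts_dir) / f"{host}.cfg").is_file():
            found.append(host)
        else:
            missing.append(host)
    return found, missing
```

Calling `snapshot_hosts_dir("/usr/local/nagios/etc/hosts", "/root/backups")` before any removal gives a copy to fall back on; the backup location is only an example.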

If that works and things start processing again, we're done. Otherwise, we'll:

- Do a full backup via xi-backup.sh
- Remove all config files in the hosts directory
- Do an apply configuration via CCM

Any thoughts/concerns on this plan?

Post by scottwilkerson »

This sounds like a good plan. You have left yourself several "outs," so at worst you can get back to where you are now.

Post by cwscribner »

The config removal went fine. We cut 1,148 devices from monitoring, and there haven't been any errors when applying the configuration.

Any thoughts on where to look for hints about this disconnect between the database and the config files?
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am


Post by mguthrie »

I would suggest starting with small-scale tests: delete a few hosts, write out the changes with the Write Config Tool, and see whether any issues arise. Make sure the host files under /usr/local/nagios/etc/hosts are all owned by apache:nagios.
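
The ownership check can be scripted. Below is a minimal sketch in Python that lists any .cfg file under the hosts directory not owned by apache:nagios. The function names are hypothetical, and the sketch only reports; it changes nothing.

```python
import grp
import os
import pwd
from pathlib import Path


def owner_and_group(path):
    """Return (user, group) names for path, falling back to numeric ids."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return user, group


def find_misowned(hosts_dir, user="apache", group="nagios"):
    """List .cfg files directly under hosts_dir not owned by user:group."""
    bad = []
    for cfg in sorted(Path(hosts_dir).glob("*.cfg")):
        actual = owner_and_group(cfg)
        if actual != (user, group):
            bad.append((str(cfg), "%s:%s" % actual))
    return bad
```

Running `find_misowned("/usr/local/nagios/etc/hosts")` returns a list of (path, owner:group) pairs; an empty list means every host file has the expected ownership.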

Post by cwscribner »

This seems to be running normally now. I've added and removed a few hosts without problems.