Configuration Discrepancies

chrisp
Posts: 71
Joined: Fri Dec 28, 2012 11:35 am

Configuration Discrepancies

Post by chrisp »

I'm struggling to get our Nagios system to apply config changes & it appears to be hanging on to servers which have been removed...

For example, I deleted several switches from the Nagios config via the XI GUI. They're no longer searchable in the CCM, but they still show up in the running system, mostly going critical/down because we moved them to internal IP ranges and Nagios can no longer see them.
[Attachment: 2014-07-30 12_26_43-Nagios XI.png]
[Attachment: 2014-07-30 12_33_08-Nagios XI - Configuration.png]
There's also a server which I deleted but which remains doggedly in the running config and is not manageable via the GUI. It has 24 services, all critical, so it is very annoying to our out-of-hours guys, who keep seeing all the red and panicking.

How do I get around this sort of thing?

Due to another weird configuration verification error, I just checked the services in the CCM and found several copies of the same service with no hosts assigned, which I don't remember putting there. We like to have a single global service check that can be applied to a hostgroup or to individual hosts, not a separate service for each host.
[Attachment: 2014-07-30 12_39_53-Nagios XI - Configuration.png]
The CoreDNS_53 service shown there was what was causing my config verification to bork. I had to edit and save it to get the verification to finally go green, but that's when I started to poke around and found the multiple identical service checks shown above.

I am a bit worried that it's got itself (or been helped) into a mess that I don't know how to untangle. Currently, when I apply the config, it works (goes green and applies), but I still have the legacy hosts lingering. Any ideas?
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Configuration Discrepancies

Post by tmcdonald »

What version of XI are you on?

For the lingering hosts/services, they are called Ghost Hosts and are pretty easy to remove:

http://support.nagios.com/wiki/index.ph ... t_Hosts.29

For the duplicate services with no assigned hosts, are there hosts assigned via templates applied to the services?
Former Nagios employee
chrisp
Posts: 71
Joined: Fri Dec 28, 2012 11:35 am

Re: Configuration Discrepancies

Post by chrisp »

We're on: Nagios XI 2012R2.9

This issue has survived the several reboots I've performed and I just did "killall nagios ; killall nagios" (to be sure), then "service nagios start" and it continues to be an issue.

I can also see the config files for the deleted hosts in /usr/local/nagios/etc/

Code:

# ll -R /usr/local/nagios/etc/|grep -i server03.hq      
-rw-rw-r-- 1 apache nagios 1237 Jul 24 10:36 server03.hq.inty.net.cfg
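
A minimal sketch of the same search using find instead of ll -R piped through grep, so the full path of each match is printed and the subdirectory holding the leftover file is visible. The function name find_host_cfg is hypothetical, and the default root just reuses the /usr/local/nagios/etc path from the listing above.

```shell
# List every .cfg file under a config root whose name contains a pattern.
# Usage: find_host_cfg <pattern> [config_root]
# find_host_cfg is a hypothetical helper name, not part of Nagios XI.
find_host_cfg() {
    pattern=$1
    root=${2:-/usr/local/nagios/etc}
    # -iname matches case-insensitively, like "grep -i" in the original command.
    find "$root" -type f -iname "*${pattern}*.cfg" | sort
}
```

For example, find_host_cfg server03.hq would list each matching file with its full path, one per line.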
chrisp
Posts: 71
Joined: Fri Dec 28, 2012 11:35 am

Re: Configuration Discrepancies

Post by chrisp »

OK, so I first dumped the file's contents (to give me the chance to roll back my next crazy action, just in case of borkage...): -

Code:

root@Nagios:/usr/local/nagios/etc/hosts
# cat server03.hq.inty.net.cfg 
Then I did this: -

Code:

root@Nagios:/usr/local/nagios/etc/hosts
# rm -vf server03.hq.inty.net.cfg
# killall nagios && service nagios restart
I no longer see the "ghost host" that's in that particular config file.

In fact, I just hunted out the config files for the switches I had removed via the GUI and binned them too. Now I am ghostless & when I attempt to verify and update the config via the GUI, it's all green.
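
For anyone repeating this, a slightly safer variant is to move each stale file aside rather than rm it, so it can be restored if something breaks. The sketch below is only that: retire_cfg and the default backup location /root/nagios-cfg-backup are hypothetical names chosen for illustration.

```shell
# Move a config file into a timestamped backup directory instead of deleting it.
# Usage: retire_cfg <cfg_file> [backup_root]
# retire_cfg and the default backup_root are hypothetical, not part of Nagios XI.
retire_cfg() {
    cfg=$1
    backup_root=${2:-/root/nagios-cfg-backup}
    # Refuse to continue if the file is not there, so typos are noticed.
    [ -f "$cfg" ] || { echo "no such file: $cfg" >&2; return 1; }
    dest="$backup_root/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" && mv -v "$cfg" "$dest/"
}
```

The backup directory is deliberately outside the Nagios config tree, since Nagios reads every .cfg file under a configured cfg_dir and would otherwise load the retired file again. On a default install the result can then be checked with /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg before restarting; those paths are assumed here rather than confirmed by this thread.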

I'm not sure if what I have done is right. I think Whitney sang a song about it...
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Configuration Discrepancies

Post by tmcdonald »

chrisp wrote:I'm not sure if what I have done is right. I think Whitney sang a song about it...
You handled this situation appropriately, unlike the unsavory characters Whitney describes in her song.
Former Nagios employee
chrisp
Posts: 71
Joined: Fri Dec 28, 2012 11:35 am

Re: Configuration Discrepancies

Post by chrisp »

Good news about the ghost hosts, but I am increasingly worried about some strange service relationships and wonder if the config is somehow corrupted. Here's the latest thing to catch my eye, while diagnosing unexpected Nagios Notification behaviour (I'm not even sure where to start diagnosing and correcting this): -
[Attachment: NagiosXI_Config_Borked.png]
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Configuration Discrepancies

Post by tmcdonald »

The host you have blurred out simply has some pings attached to it (quite a few, oddly enough). You cannot delete a host that has services still associated with it, so you will need to either delete or deactivate the services before you can delete the host.
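
To see from the command line which services still name a host, something like the following sketch could be run against the generated service files. It is a rough illustration only: services_for_host is a hypothetical helper, it looks only at explicit host_name lists (services attached through hostgroups or templates will not show up), and the /usr/local/nagios/etc/services location in the usage note is an assumption about the default layout.

```shell
# Print "file<TAB>service_description" for each service whose host_name list names the host.
# Usage: services_for_host <host> <file.cfg>...
# services_for_host is a hypothetical helper, not part of Nagios XI.
services_for_host() {
    target=$1
    shift
    awk -v target="$target" '
        # Return everything after the directive name, trimmed.
        function value(   v) {
            v = $0
            sub(/^[ \t]*[^ \t]+[ \t]+/, "", v)
            sub(/[ \t]+$/, "", v)
            return v
        }
        $1 == "define" && $2 ~ /^service[{]?$/ { in_svc = 1; hosts = ""; desc = ""; next }
        in_svc && $1 == "host_name"           { hosts = value() }
        in_svc && $1 == "service_description" { desc = value() }
        in_svc && $1 == "}" {
            # host_name is a comma-separated list; compare each entry exactly.
            n = split(hosts, list, /[ \t]*,[ \t]*/)
            for (i = 1; i <= n; i++)
                if (list[i] == target) print FILENAME "\t" desc
            in_svc = 0
        }
    ' "$@"
}
```

A call might look like services_for_host myswitch.example /usr/local/nagios/etc/services/*.cfg, where the host name is a placeholder.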
Former Nagios employee
chrisp
Posts: 71
Joined: Fri Dec 28, 2012 11:35 am

Re: Configuration Discrepancies

Post by chrisp »

So sorry, I wasn't as clear as I could have been (my car's brakes had failed earlier in the day, leaving me a bit... spaced out).

I was looking at the host and saw that it had a bunch of pings associated, but we (whenever possible) take advantage of Nagios template inheritance, so there is actually only 1 legitimate "Ping" service in our config, which is assigned mostly by means of HostGroup membership and, on the odd occasion, to individual hosts.

I was just using the Database Relationships information button as a little window on the config, so I could show you the oddness in a single screenshot. I am now becoming aware of many more services (but not all) that have multiple duplicates like this: where there used to be a single service, there are now 8 identical services!

I'm not sure how to proceed. Do I delete all but one of the services and then try to confirm that all the host-service relationships are as they should be? It's going to be a bit of a slog :(
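
One way to shorten the slog might be to list the duplicates from the generated files before touching anything in the CCM. The sketch below counts service definitions that share the same host_name list and service_description. find_duplicate_services is a hypothetical helper, the comparison is on the literal host_name string (so two copies with differently ordered host lists would not be matched), and it only sees what has been written to disk, not what is in the CCM database.

```shell
# Print "count<TAB>host_name<TAB>service_description" for every
# (host_name, service_description) pair that is defined more than once.
# Usage: find_duplicate_services <file.cfg>...
# find_duplicate_services is a hypothetical helper, not part of Nagios XI.
find_duplicate_services() {
    awk '
        # Return everything after the directive name, trimmed.
        function value(   v) {
            v = $0
            sub(/^[ \t]*[^ \t]+[ \t]+/, "", v)
            sub(/[ \t]+$/, "", v)
            return v
        }
        $1 == "define" && $2 ~ /^service[{]?$/ { in_svc = 1; hosts = "(none)"; desc = ""; next }
        in_svc && $1 == "host_name"           { hosts = value() }
        in_svc && $1 == "service_description" { desc = value() }
        in_svc && $1 == "}"                   { seen[hosts "\t" desc]++; in_svc = 0 }
        END {
            for (key in seen)
                if (seen[key] > 1) print seen[key] "\t" key
        }
    ' "$@" | sort
}
```

Definitions with no host_name at all are reported with (none) in the host column, which should make the "no hosts assigned" copies easy to spot.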
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Configuration Discrepancies

Post by abrist »

Can you post the config (save button next to service config) for the service Ping-Ping?
Former Nagios employee
chrisp
Posts: 71
Joined: Fri Dec 28, 2012 11:35 am

Re: Configuration Discrepancies

Post by chrisp »

Do you mean this?

Code:

###############################################################################
#
# Service configuration file
#
# Created by: Nagios QL Version 3.0.3
# Date:	      2014-08-07 22:09:43
# Version:    Nagios 3.x config file
#
# --- DO NOT EDIT THIS FILE BY HAND --- 
# Nagios QL will overwite all manual settings during the next update
#
###############################################################################

define service {
	host_name			parkgroup.nagios.inty.com,sven-birmingham.nagios.inty.com,sven-guildford-internet.nagios.inty.com,sven-guildford-vpn.nagios.inty.com,sven-london.nagios.inty.com
	service_description		Ping
	display_name			Ping
	check_command			check_ping!200.0,5%!1000.0,80%!!!!!!
	max_check_attempts		5
	check_interval			2
	retry_interval			2
	active_checks_enabled		1
	check_period			xi_timeperiod_24x7
	check_freshness			1
	process_perf_data		1
	retain_status_information	1
	retain_nonstatus_information	1
	notification_interval		0
	notification_period		xi_timeperiod_24x7
	notification_options		c,r,s,
	notifications_enabled		1
	contact_groups			CG_Automated_Ticketing,CG_Infrastructure_Team,CG_Operations_Team,CG_Out_Of_Hours
	register			1
	}	

###############################################################################
#
# Service configuration file
#
# END OF FILE
#
###############################################################################
BTW, for this "Ping" service, I have deleted the 7 duplicates, to see if anything exploded... AFAIK, it didn't.