Configuration Discrepancies
I'm struggling to get our Nagios system to apply config changes, and it appears to be hanging on to servers which have been removed...
For example, I deleted several switches from the Nagios config via the XI GUI and they're no longer searchable in the CCM, but they still show up in the running system, mostly going critical/down because we moved them to internal IP ranges and Nagios can no longer see them.
There's also a server which I deleted that remains doggedly in the config but is not manageable via the GUI. It's got 24 services, all critical, so it is very annoying to our out-of-hours guys, who keep seeing all the red and panicking.
How do I get around this sort of thing?
Due to another weird configuration verification error, I just checked the services in CCM and there are several copies of the same service, with no hosts assigned, and I don't remember putting them there. We like to have a single global service check that can be applied to a hostgroup or to individual hosts, not a service for each host.
Looking at the CoreDNS_53 service there, that was what was causing my config verification to bork, and I had to edit and save it in order to get the config verification to finally go green. That's when I started to poke around and found the multiple identical service checks shown above.
I am a bit worried that it's got itself (or been helped) into a mess that I don't know how to untangle. Currently, when I apply the config, it works (goes green and applies), but I still have the legacy hosts lingering. Any ideas?
[Attached screenshots not available.]
Re: Configuration Discrepancies
What version of XI are you on?
The lingering hosts/services are called Ghost Hosts and are pretty easy to remove:
http://support.nagios.com/wiki/index.ph ... t_Hosts.29
For the duplicate services with no assigned hosts, are there hosts assigned via templates applied to the services?
Former Nagios employee
Re: Configuration Discrepancies
We're on: Nagios XI 2012R2.9
This issue has survived the several reboots I've performed and I just did "killall nagios ; killall nagios" (to be sure), then "service nagios start" and it continues to be an issue.
I can also see the config files for the deleted hosts in /usr/local/nagios/etc/
Code: Select all
# ll -R /usr/local/nagios/etc/|grep -i server03.hq
-rw-rw-r-- 1 apache nagios 1237 Jul 24 10:36 server03.hq.inty.net.cfg
Re: Configuration Discrepancies
OK, so I just did this (for the chance to roll back my next crazy action, just in case of borkage...):
Code: Select all
root@Nagios:/usr/local/nagios/etc/hosts
# cat server03.hq.inty.net.cfg
Then I did this:
Code: Select all
root@Nagios:/usr/local/nagios/etc/hosts
# rm -vf server03.hq.inty.net.cfg
# killall nagios && service nagios restart
I no longer see the "ghost host" that's in that particular config file.
In fact, I just hunted out the config files for the switches I had removed via the GUI and binned them too. Now I am ghostless & when I attempt to verify and update the config via the GUI, it's all green.
I'm not sure if what I have done is right. I think Whitney sang a song about it...
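For anyone repeating this cleanup, here is a minimal sketch of the same step with a file copy kept for rollback instead of relying on terminal scrollback. The function name `retire_cfg` and the backup directory are hypothetical, not from the thread, and the verification command in the comment assumes the default install paths.

```shell
#!/bin/sh
# Sketch: retire a stale host config file, keeping a copy for rollback.
# retire_cfg and the backup directory are illustrative names (assumptions).

retire_cfg() {
    cfg_file=$1      # e.g. /usr/local/nagios/etc/hosts/server03.hq.inty.net.cfg
    backup_dir=$2    # hypothetical directory for rollback copies

    # Refuse to continue if the file is not there.
    [ -f "$cfg_file" ] || { echo "no such file: $cfg_file" >&2; return 1; }

    mkdir -p "$backup_dir" || return 1
    # -p keeps permissions and timestamps so the file can be restored unchanged.
    cp -p "$cfg_file" "$backup_dir/" || return 1
    rm -f "$cfg_file"
}

# After removing files, check the config before restarting
# (default install paths assumed):
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
```

To roll back, copy the file from the backup directory to its original location and restart Nagios.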
Re: Configuration Discrepancies
chrisp wrote: I'm not sure if what I have done is right. I think Whitney sang a song about it...
You handled this situation appropriately, unlike the unsavory characters Whitney describes in her song.
Former Nagios employee
Re: Configuration Discrepancies
Good news about the ghost hosts, but I am increasingly worried about some strange service relationships and wonder if the config is somehow corrupted. Here's the latest thing to catch my eye, while diagnosing unexpected Nagios Notification behaviour (I'm not even sure where to start diagnosing and correcting this): -
[Attached screenshot not available.]
Re: Configuration Discrepancies
The host you have blurred out simply has some pings attached to it (quite a few, oddly enough). You cannot delete a host that has services still associated with it, so you will need to either delete or deactivate the services before you can delete the host.
Former Nagios employee
Re: Configuration Discrepancies
So sorry, I wasn't as clear as I could have been (my car's brakes had failed earlier in the day, leaving me a bit... spaced out).
I was looking at the host and saw that it had a bunch of pings associated, but we (whenever possible) take advantage of Nagios template inheritance, so there actually is only 1 legitimate "Ping" service in our config, which is assigned mostly by means of HostGroup membership and, on the odd occasion, to individual hosts.
I was just using the Database Relationships information button like a little window on the config, so I could show you the oddness in a single screenshot. I am now becoming aware of many more (but not all) services that have multiple duplicates like this: where there used to be a single service, there are now 8 identical services!
I'm not sure how to proceed. Do I delete all but one of the services and then try to confirm that all the host-service relationships are as they should be? It's going to be a bit of a slog :(
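Before deleting duplicates by hand, it may help to see how widespread they are. The sketch below counts how many times each `service_description` is defined across a directory of `.cfg` files and prints only those defined more than once. It assumes the service files sit in a `services` directory alongside the `hosts` directory seen earlier (not confirmed in the thread), and it only reflects what was last written to disk, not the CCM database. A description repeated for different hosts can be legitimate, so treat the output as a list of candidates rather than proof of corruption.

```shell
#!/bin/sh
# Sketch: report service descriptions that are defined more than once.
# count_service_definitions is an illustrative name (assumption).

count_service_definitions() {
    cfg_dir=$1    # e.g. /usr/local/nagios/etc/services (assumed location)

    find "$cfg_dir" -name '*.cfg' -type f -exec awk '
        $1 == "service_description" {
            # Rebuild the value, since descriptions may contain spaces.
            desc = $2
            for (i = 3; i <= NF; i++) desc = desc " " $i
            print desc
        }' {} + |
        sort | uniq -c |
        # Keep only descriptions seen more than once; trim leading padding.
        awk '$1 > 1 { sub(/^ +/, ""); print }'
}
```

Each output line is a count followed by the description, so eight identical Ping definitions would show up as a single line starting with 8.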
Re: Configuration Discrepancies
Can you post the config (save button next to service config) for the service Ping-Ping?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Configuration Discrepancies
Do you mean this?
BTW, for this "Ping" service, I have deleted the 7 duplicates, to see if anything exploded... AFAIK, it didn't.
Code: Select all
###############################################################################
#
# Service configuration file
#
# Created by: Nagios QL Version 3.0.3
# Date: 2014-08-07 22:09:43
# Version: Nagios 3.x config file
#
# --- DO NOT EDIT THIS FILE BY HAND ---
# Nagios QL will overwite all manual settings during the next update
#
###############################################################################
define service {
    host_name                       parkgroup.nagios.inty.com,sven-birmingham.nagios.inty.com,sven-guildford-internet.nagios.inty.com,sven-guildford-vpn.nagios.inty.com,sven-london.nagios.inty.com
    service_description             Ping
    display_name                    Ping
    check_command                   check_ping!200.0,5%!1000.0,80%!!!!!!
    max_check_attempts              5
    check_interval                  2
    retry_interval                  2
    active_checks_enabled           1
    check_period                    xi_timeperiod_24x7
    check_freshness                 1
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    notification_interval           0
    notification_period             xi_timeperiod_24x7
    notification_options            c,r,s,
    notifications_enabled           1
    contact_groups                  CG_Automated_Ticketing,CG_Infrastructure_Team,CG_Operations_Team,CG_Out_Of_Hours
    register                        1
}
###############################################################################
#
# Service configuration file
#
# END OF FILE
#
###############################################################################
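To confirm the host-service relationships after pruning duplicates, one option is to list the hosts a written service file assigns directly. This is a rough sketch, with `list_service_hosts` as an illustrative name. It only reads `host_name` lines, so hosts that receive the service through hostgroup membership or templates will not appear.

```shell
#!/bin/sh
# Sketch: print one host per line from the host_name directive(s) in a
# service config file. list_service_hosts is an illustrative name (assumption).

list_service_hosts() {
    awk '$1 == "host_name" {
        # The value is a comma-separated list with no spaces.
        n = split($2, hosts, ",")
        for (i = 1; i <= n; i++)
            if (hosts[i] != "") print hosts[i]
    }' "$1"
}
```

Run against the Ping file posted above, this should print the five hosts on its `host_name` line, one per line, which can then be compared with what the CCM shows for the service.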