Ghost host and services issue

Locked
bcamp
Posts: 30
Joined: Thu May 23, 2013 2:54 pm

Ghost host and services issue

Post by bcamp »

Good morning, guys. I'm having a really hard time with two files being recreated whenever I click apply configuration. I've dealt with ghost hosts and services before, but it's never given me this much trouble.

A host was added a while back by somebody else on our team. I had given him a quick crash course on how to add hosts, and he added a few switches with the snmpwalk wizard. (It was our network supervisor, a very competent individual, not just some noob I was trusting not to mess up.) He added a few switches and applied the config, but now whenever I try to save new config changes, I get the following error:

Code: Select all

Error: Service has no hosts and/or service_description (config file '/usr/local/nagios/etc/services/10.228.251.1.cfg', starting on line 32)
If I comment out the service starting on line 32, the error just moves down about 16 lines to the next service entry. There are 77 services in this file, so there are a lot of lines to go through. I waded through a dozen or so, commenting them out, only to have the error move down to the next service each time. This by itself isn't a big deal; I don't mind deleting and re-adding the host. Here's where it becomes a problem, though.
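
Rather than commenting out services one at a time, the whole file can be scanned for definitions that would trigger this error. The following is a minimal sketch, not an official Nagios tool: the function name `find_incomplete_services` is made up for illustration, it assumes each directive sits on its own line, and it does not resolve values inherited through `use` templates, so a service that gets its host from a template will show up as a false positive.

```shell
#!/bin/sh
# find_incomplete_services FILE   (hypothetical helper, not part of Nagios)
# Prints the starting line of every "define service" block that lacks a host
# assignment (host_name or hostgroup_name) or a service_description.
# Blocks marked "register 0" (templates) are skipped.
find_incomplete_services() {
    awk '
        /^[ \t]*define[ \t]+service[ \t]*[{]/ {
            in_block = 1; start = NR
            has_host = 0; has_desc = 0; is_template = 0
            next
        }
        in_block && /^[ \t]*(host_name|hostgroup_name)[ \t]+[^ \t]/ { has_host = 1 }
        in_block && /^[ \t]*service_description[ \t]+[^ \t]/        { has_desc = 1 }
        in_block && /^[ \t]*register[ \t]+0/                        { is_template = 1 }
        in_block && /^[ \t]*[}]/ {
            if (!is_template && !(has_host && has_desc))
                printf "%s: incomplete service definition starting on line %d\n", FILENAME, start
            in_block = 0
        }
    ' "$1"
}

# Example (path taken from the error message above):
#   find_incomplete_services /usr/local/nagios/etc/services/10.228.251.1.cfg
```

If every service in the file is reported, that points at the data being written rather than at a single bad entry.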

In XI, I can delete all the services for this host, and I can delete the host itself. Applying the config causes the files to get re-written, and the apply fails with the above error. Following the FAQ entries for ghost hosts, I did the following:

Code: Select all

killall nagios
sudo rm /usr/local/nagios/etc/hosts/10.228.251.1.cfg
sudo rm /usr/local/nagios/etc/services/10.228.251.1.cfg
sudo service nagios start
After I delete the files from the CLI, I go into the CCM, and under services and hosts I delete everything there regarding this host. 77 services and 1 host are successfully deleted. After I've done that, I go into CCM -> Write Config Tool and manually write the data to file. That returns successfully, with no mention of the host 10.228.251.1. I verify my config, and that also comes back OK. I click Restart Nagios, and it completes without error. At this point, if I click Apply Configuration, I get the error I mentioned above, and those two files I deleted via the CLI are both back.

I've even tried removing the physical files via cli, removing the entries in XI via CCM, write configs to file under CCM->Tools, complete server reboot, then once the server is back up, I've used a different browser than the one I removed everything in. Just trying to make sure there's nothing in cache anywhere causing things to get re-written. No matter what I do, these two files keep coming back when I click Apply Configuration.

I'm about to the point of making sure I've got current backups of all configs, deleting all host/service configuration files under CCM->Write Config Files, and then importing all of the other configs. I really don't want to go that route if it can be avoided. Until this incident, Nagios has been working extremely well with minimal fuss, and I'm afraid doing that may just cause other issues. I've restored my oldest backup already, but either the files are just that persistent, or this guy was already added by the time that backup was made. I didn't realize there was a problem with this host until I tried to do some end-of-summer cleanup in Nagios a few weeks later. Is there anything else I can try instead?

CentOS 6.6
64 bit
Physical server, manual install
No special configs, nothing else running on this machine
I'm seeing two errors in error_log. I don't think they're related to this issue, but I'll let you guys determine that.

Code: Select all

[Fri Sep 11 08:28:17 2015] [error] [client 10.0.0.109] PHP Notice:  Undefined variable: sync_table_status in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 196, referer: https://nagios.psd.ms/nagiosxi/includes/components/ccm/xi-index.php
and

Code: Select all

[Fri Sep 11 08:46:36 2015] [error] [client ::1] PHP Notice:  Undefined index: language in /usr/local/nagiosxi/html/includes/components/ccm/includes/common_functions.inc.php on line 711
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Ghost host and services issue

Post by tmcdonald »

Please run the following command as root from the Nagios server command line and post the results:

Code: Select all

cd /usr/local/nagiosxi/scripts
./reconfigure_nagios.sh | tail -40
Former Nagios employee
bcamp
Posts: 30
Joined: Thu May 23, 2013 2:54 pm

Re: Ghost host and services issue

Post by bcamp »

Thank you for your help. Here you go.

Code: Select all

[brian@nagios scripts]$ sudo ./reconfigure_nagios.sh | tail -40
[sudo] password for brian:
--2015-09-11 12:34:22--  http://localhost/nagiosxi/includes/components/ccm/
Resolving localhost... ::1, 127.0.0.1
Connecting to localhost|::1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: “nagiosql.login”

    [ <=>                                                                                                                           ] 10,377      --.-K/s   in 0.04s

2015-09-11 12:34:23 (277 KB/s) - “nagiosql.login” saved [10377]

--2015-09-11 12:34:24--  http://localhost/nagiosxi/includes/components/ccm/
Resolving localhost... ::1, 127.0.0.1
Connecting to localhost|::1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: “nagiosql.import.monitoring”

    [         <=>                                                                                                                   ] 49,187      38.9K/s   in 7.2s

2015-09-11 12:34:31 (6.64 KB/s) - “nagiosql.import.monitoring” saved [49187]

--2015-09-11 12:34:31--  http://localhost/nagiosxi/includes/components/ccm/
Resolving localhost... ::1, 127.0.0.1
Connecting to localhost|::1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: “nagiosql.import.monitoring”

    [  <=>                                                                                                                          ] 49,071       109K/s   in 0.4s

2015-09-11 12:34:32 (109 KB/s) - “nagiosql.import.monitoring” saved [49071]

--2015-09-11 12:34:32--  http://localhost/nagiosxi/includes/components/ccm/
Resolving localhost... ::1, 127.0.0.1
Connecting to localhost|::1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: “nagiosql.login”

    [ <=>                                                                                                                           ] 10,377      --.-K/s   in 0.03s

2015-09-11 12:34:33 (304 KB/s) - “nagiosql.login” saved [10377]

--2015-09-11 12:34:33--  http://localhost/nagiosxi/includes/components/ccm/
Resolving localhost... ::1, 127.0.0.1
Connecting to localhost|::1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: “nagiosql.export.monitoring”

    [     <=>                                                                                                                       ] 13,391      3.82K/s   in 3.4s

2015-09-11 12:34:37 (3.82 KB/s) - “nagiosql.export.monitoring” saved [13391]

tar: Removing leading `/' from member names
RESETTING PERMS
SETUID ROOT OK
URL: http://localhost/nagiosxi/includes/components/ccm/
CMDLINE
/usr/bin/wget --save-cookies nagiosql.cookies --keep-session-cookies http://localhost/nagiosxi/includes/components/ccm/ --no-check-certificate --post-data 'submit=Login&hidelog=true&loginSubmitted=true&username=nagiosxi&password=0brdgt' -O nagiosql.loginLOGIN SUCCESSFUL!
URL: http://localhost/nagiosxi/includes/components/ccm/
CMDLINE:
/usr/bin/wget --load-cookies=nagiosql.cookies http://localhost/nagiosxi/includes/components/ccm/ --no-check-certificate --post-data 'cmd=apply&type=writeConfig' -O nagiosql.export.monitoring
WRITE CONFIGS SUCCESSFUL!
OUTPUT:
Nagios Core 4.0.8
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2014
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
Error: Service has no hosts and/or service_description (config file '/usr/local/nagios/etc/services/10.228.251.1.cfg', starting on line 48)
   Error processing object config files!


***> One or more problems was encountered while processing the config files...

     Check your configuration file(s) to ensure that they contain valid
     directives and data defintions.  If you are upgrading from a previous
     version of Nagios, you should be aware that some variables/definitions
     may have been removed or modified in this version.  Make sure to read
     the HTML documentation regarding the config files, as well as the
     'Whats New' section to find out what has changed.
RET: 1
/usr/local/nagiosxi/nom/checkpoints/nagioscore/errors /usr/local/nagiosxi/scripts
/usr/local/nagiosxi/scripts
LATEST NOM SNAPSHOT: /usr/local/nagiosxi/nom/checkpoints/nagioscore/1441974482.tar.gz
/ /usr/local/nagiosxi/scripts
RESTORING NOM SNAPSHOT : /usr/local/nagiosxi/nom/checkpoints/nagioscore/1441974482.tar.gz
/usr/local/nagiosxi/scripts
RESETTING PERMS
SETUID ROOT OK
[brian@nagios scripts]$
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Ghost host and services issue

Post by ssax »

Please go into the CCM and deactivate all the services associated with that host, then deactivate the host.

Then go to Configure > Core Config Manager > Tools > Write Config Files
- Click Delete (don't worry, it's safe, they get re-written)
- Click Write
- Now click Verify and see whether it verifies properly. If it does, try to apply configuration (do not try to apply config if it fails verification).

If it fails verification, go look at the file and see which services are listed there, then search for those services (maybe you have duplicates, or services with multiple hosts?).

Let us know what you find.
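
The same verify-before-apply gate can be sketched from the command line. This assumes the Nagios binary lives at /usr/local/nagios/bin/nagios and the main config at /usr/local/nagios/etc/nagios.cfg (typical for a source install, but neither path is confirmed in this thread), and the function names are illustrative only. `nagios -v` runs the pre-flight check and exits non-zero when it finds errors.

```shell
#!/bin/sh
# Assumed paths; override via the environment if your install differs.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Run the Nagios Core pre-flight check against the main config file.
verify_config() {
    "$NAGIOS_BIN" -v "$NAGIOS_CFG"
}

# Report whether it is reasonable to move on to Apply Configuration.
# Returns 0 when verification passes, 1 otherwise.
verify_gate() {
    if verify_config >/dev/null 2>&1; then
        echo "verification passed: safe to try Apply Configuration"
    else
        echo "verification failed: fix the reported file before applying"
        return 1
    fi
}

# Example: verify_gate || verify_config | grep '^Error:'
```

Running `verify_config` on its own prints the full output, including the file and line number of the first failing definition.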
bcamp
Posts: 30
Joined: Thu May 23, 2013 2:54 pm

Re: Ghost host and services issue

Post by bcamp »

Hi ssax,

We might be getting somewhere. I followed your instructions, and when I got down to verifying after deleting and writing the files, the verification did fail. It was different this time, though: verification failed for a different host that the same guy had set up. I went through the process of deactivating all of that host's services and then the host itself, and I repeated the whole thing a handful of times. He had added about ten hosts with services tied to each, and each of those hosts needed this done to it.

I got through deactivating all the services/hosts that were throwing out errors when I clicked verify. After I deactivated each host, I finally got a good verification to return.

Code: Select all

Nagios Core 4.0.8
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2014
License: GPL

Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
Checked 1948 services.
Checked 47 hosts.
Checked 10 host groups.
Checked 9 service groups.
Checked 11 contacts.
Checked 7 contact groups.
Checked 125 commands.
Checked 20 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 47 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 20 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors: 0

Things look okay - No serious problems were detected during the pre-flight check 
When the good verification comes back, I try to apply configuration, but I continue to get the same error.

Code: Select all

Error: Service has no hosts and/or service_description (config file '/usr/local/nagios/etc/services/10.228.251.1.cfg', starting on line 16)
bcamp
Posts: 30
Joined: Thu May 23, 2013 2:54 pm

Re: Ghost host and services issue

Post by bcamp »

I just realized I posted this in the General Support forums, I meant for it to go in Customer Support. Any chance this can get moved? We do have a current support contract.

Mod Edit: Topic has been moved
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Ghost host and services issue

Post by ssax »

Is 10.228.251.1 one of the hosts you deactivated, or is it a different one?

Try doing the delete, write, and apply and see if that resolves it.

If that doesn't work, shoot an email to [email protected] with a descriptive subject and a detailed body with a link back to this thread in it so that we can set up a remote session.

Thank you
bcamp
Posts: 30
Joined: Thu May 23, 2013 2:54 pm

Re: Ghost host and services issue

Post by bcamp »

It is one of the ones I deactivated. The 251.1 host is the one that has a .cfg file under /hosts and /services that keeps reappearing when I try to apply configuration.

I just tried the delete, write, verify, apply process again with the same results. It's weird: the files for 251.1 do not get written when I press Write. Even doing an ls on the CLI shows the physical files are still gone. They only reappear after I click Apply Configuration.
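
To pin down which step recreates the files, one option is to snapshot the list of .cfg files before and after each action and compare the two. A rough sketch follows; the helper names `snapshot_cfgs` and `new_since` are hypothetical, and the snapshot file locations in the example are arbitrary.

```shell
#!/bin/sh
# snapshot_cfgs DIR OUTFILE
# Write a sorted list of every .cfg file under DIR to OUTFILE.
snapshot_cfgs() {
    find "$1" -type f -name '*.cfg' | LC_ALL=C sort > "$2"
}

# new_since BEFORE AFTER
# Print the paths that appear only in the AFTER snapshot.
new_since() {
    LC_ALL=C comm -13 "$1" "$2"
}

# Example:
#   snapshot_cfgs /usr/local/nagios/etc /tmp/before.txt
#   ... click Write, or Apply Configuration ...
#   snapshot_cfgs /usr/local/nagios/etc /tmp/after.txt
#   new_since /tmp/before.txt /tmp/after.txt
```

Because the snapshot covers the whole etc tree rather than only hosts/ and services/, files turning up in other subdirectories would be caught as well.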

I will go ahead and send an email to xisupport and let them take a look. Thank you both for your help, ssax and tmcdonald.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Ghost host and services issue

Post by tmcdonald »

bcamp wrote: I will go ahead and send an email to xisupport and let them take a look. Thank you both for your help, ssax and tmcdonald.
We'll continue this over email, but just so you are aware, @ssax and I are members of the Support team and will see your email come in :)
Former Nagios employee
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Ghost host and services issue

Post by ssax »

Locking because this was moved to a ticket.

This turned out to be a couple of files in the /usr/local/nagios/etc/import directory that kept getting re-added during every Apply Configuration.
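
For anyone hitting the same symptom, a quick way to look for leftover files is to search the import directory for the ghost host's address. This is a sketch under assumptions: `find_import_leftovers` is a hypothetical helper name, and the match is a plain fixed-string search, so 10.228.251.1 would also match 10.228.251.10.

```shell
#!/bin/sh
# find_import_leftovers DIR PATTERN   (hypothetical helper)
# List files under DIR that mention PATTERN.
# Returns 0 if any file matches, 1 if none do, 2 if DIR is missing.
find_import_leftovers() {
    dir=$1
    pattern=$2
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 2; }
    # -r: recurse, -l: print file names only, -F: treat the pattern literally
    grep -rlF -- "$pattern" "$dir"
}

# Example (directory and address taken from this thread):
#   find_import_leftovers /usr/local/nagios/etc/import 10.228.251.1
```

Any file it lists is worth reviewing, and backing up, before it is moved out of the import directory.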