Host and services still showing but don't exist in CCM - av6

av6233
Posts: 6
Joined: Tue May 26, 2015 11:26 am

Host and services still showing but don't exist in CCM - av6

Post by av6233 »

I'm currently running XI version 5.2.0 and started experiencing the same behavior after data in /usr/local/nagios/spool/perfdata filled the disk in my XI VM. After moving the data to /var and dropping in a symlink to make things match up, I was able to get things going again and rolled back to a previous config snapshot.
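The relocation described above can be sketched roughly as follows. This is only a minimal sketch, not the exact commands used: `relocate_dir` is a hypothetical helper, the `/var/nagios-perfdata` target in the comments is an assumed path, and Nagios should be stopped while the directory is being moved.

```shell
# Hypothetical helper: move a directory to a roomier filesystem and
# leave a symlink at the old path so existing configs keep working.
relocate_dir() {
    src="$1"   # e.g. /usr/local/nagios/spool/perfdata
    dst="$2"   # e.g. /var/nagios-perfdata (assumed target, pick your own)
    # Refuse to run if the source is missing or is already a symlink.
    [ -d "$src" ] && [ ! -L "$src" ] || { echo "not a real directory: $src" >&2; return 1; }
    # Refuse to overwrite an existing target.
    [ ! -e "$dst" ] || { echo "target already exists: $dst" >&2; return 1; }
    mv "$src" "$dst" && ln -s "$dst" "$src"
}
```

Running it a second time on the same path is refused, because the source is by then a symlink rather than a real directory.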

When I tried to delete a switch I had retired after the snapshot, I saw the behavior Louis described. I deleted the device from CCM, validated the config and restarted. However, while the switch was gone from CCM, flat files, and Nagios Core, XI still saw the switch and generated host down notifications. Likewise, a possibly unrelated NRPE problem on the host itself generated notifications from Core, but XI showed everything normal.

The wiki suggests trying this: https://support.nagios.com/wiki/index.p ... _Hosts.29. While the process lets me confirm the device has been removed from Core, it doesn't clean up XI in my case.

In an attempt to clear the switch from XI, I tweaked the deadpool settings to make sure it was included for cleanup. When deadpool.php runs, it skips the switch and logs this message:

Code: Select all

PROCESSING HOSTS...
Processing host 'edge-2-sw.xxx.xxx.com' in stage 2
Error: Could not get ID for host 'edge-2-sw.xxx.xxx.com' - skipping
PROCESSED HOSTS:
I don't market myself as a DBA, but it looks like the host was only partially deleted: the main record (and with it the primary key, or "ID") was wiped out before all of its relationships were cleaned up, leaving the ghost. And, since I'm not a DBA, I'm not going to dig into the schema and drop every row in every table that includes a reference to the retired switch. I'm not sure where to go from here. I could probably reinitialize the whole works and import my flat files, but I'd like to keep my RRDs for history.

Moderator Edit: Link to original thread: https://support.nagios.com/forum/viewto ... 373#158001
I'm going to split this post in two so we can address the two of you separately. In the future, if you are having a similar issue, please create a new thread and link to the related one instead of posting directly in the related thread.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Host and services still showing but don't exist in CCM

Post by tgriep »

Let's repair the MySQL database and restart the services to see if that resolves the problem. Run the following in a shell as root on the XI server.

Code: Select all

cd /usr/local/nagiosxi/scripts
./repair_databases.sh
service nagios stop
killall -9 nagios
service ndo2db stop
service mysqld restart
service ndo2db start
service nagios start
Check and see if the hosts and services are in the CCM and if they are, delete the services and then the hosts and see if the apply config works to remove them.
Be sure to check out our Knowledgebase for helpful articles and solutions!
av6233
Posts: 6
Joined: Tue May 26, 2015 11:26 am

Re: Host and services still showing but don't exist in CCM

Post by av6233 »

tgriep wrote:Check and see if the hosts and services are in the CCM and if they are, delete the services and then the hosts and see if the apply config works to remove them.
No change. The host is not present in CCM, Core or flat files, but still exists in XI's host detail lists. Oddly, both the last check and next scheduled check times reflect the date when the switch was powered off.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Host and services still showing but don't exist in CCM

Post by Box293 »

Honestly, I'm pretty sure upgrading to XI 5 (or 2014R2.7) should fix your issue. XI 2012 uses Core 3.x and it wasn't as strict with its configs.

Is there any possibility to upgrade?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
av6233
Posts: 6
Joined: Tue May 26, 2015 11:26 am

Re: Host and services still showing but don't exist in CCM

Post by av6233 »

Box293 wrote:Is there any possibility to upgrade?
I'm already on XI 5.2.0 with Core 4.1.1. I upgraded weeks ago.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Host and services still showing but don't exist in CCM

Post by lmiltchev »

Go to CCM -> Tools -> Write Config Files, then click the "Delete", "Write", and "Verify" buttons (in exactly that order!), and then Apply Configuration. Check to see if the host/services still show in the GUI.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Host and services still showing but don't exist in CCM

Post by tgriep »

Another place to look for that host is in the retention.dat file. It may have been corrupted when the system's drive filled up.
Here is where the file is located on the system.

Code: Select all

/usr/local/nagios/var/retention.dat
Open it up and search for that host. If it is in there, stop the nagios process:

Code: Select all

service nagios stop
Remove the entries from that file and save it.
Restart nagios and see if the host is gone.

Code: Select all

service nagios start
Let us know what you find.
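
If the host does turn up, the manual edit could also be done with a small filter. The following is a minimal sketch, assuming retention.dat is made up of `type {` ... `}` blocks that carry a `host_name=` line; `strip_host_blocks` is a hypothetical helper, so keep a backup and only run it while nagios is stopped.

```shell
# Hypothetical helper: print a retention file without any block
# (host, service, comment, downtime, ...) that belongs to the host.
strip_host_blocks() {
    host="$1"; file="$2"
    awk -v host="$host" '
        # A line such as "host {" opens a new block: start buffering.
        /^[ \t]*[a-z]+ [{][ \t]*$/ { inblock = 1; buf = ""; drop = 0 }
        inblock {
            buf = buf $0 "\n"
            line = $0
            sub(/^[ \t]+/, "", line)
            # Exact match only, so "sw1" does not also remove "sw10".
            if (line == "host_name=" host) drop = 1
            # A closing brace ends the block: emit it unless marked.
            if (line ~ /^[}]/) { if (!drop) printf "%s", buf; inblock = 0 }
            next
        }
        { print }
        # Keep an unterminated trailing block rather than losing it.
        END { if (inblock && !drop) printf "%s", buf }
    ' "$file"
}

# Example call, run from /usr/local/nagios/var with nagios stopped:
# cp -p retention.dat retention.dat.bak
# strip_host_blocks 'edge-2-sw.xxx.xxx.com' retention.dat.bak > retention.dat
```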
av6233
Posts: 6
Joined: Tue May 26, 2015 11:26 am

Re: Host and services still showing but don't exist in CCM

Post by av6233 »

Rewriting the config files had no effect.

There were no references to the host in retention.dat.

The host no longer appears in Nagios Core; it only hangs around in the XI frontend (i.e., /nagiosxi/includes/components/xicore/status.php).
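
For anyone who wants to repeat the flat-file check, a minimal sketch is below. `find_host_refs` is a hypothetical helper, and the `/usr/local/nagios/etc` path in the example call assumes a default XI layout; the retention.dat path is the one given earlier in this thread.

```shell
# Hypothetical helper: list every file under the given paths that
# mentions the host name as a literal string.
find_host_refs() {
    host="$1"; shift
    # -r recurse, -l print file names only, -F treat the name literally
    grep -rlF -- "$host" "$@" 2>/dev/null
}

# Example call (paths assume a default XI layout):
# find_host_refs 'edge-2-sw.xxx.xxx.com' /usr/local/nagios/etc /usr/local/nagios/var/retention.dat
```

No output and a non-zero exit status mean that none of the searched files mention the host.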
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Host and services still showing but don't exist in CCM

Post by Box293 »

Let's look in the database. Using this command I can find the host with the alias win2008r2-01. Can you run it for your host name and see if it appears? Make sure the cAsE is correct. You may need to use display_name instead of alias.

Code: Select all

echo "select * from nagios_hosts where alias like 'win2008r2-01' \G;" | mysql -pnagiosxi nagios
Please post the output.
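
To check alias and display_name in one go, something like the following could be used. It is only a sketch: `host_lookup_sql` is a hypothetical helper that prints a read-only query, the column names are the ones that appear in the nagios_hosts output posted later in this thread, and the mysql client's `--vertical` option stands in for `\G`.

```shell
# Hypothetical helper: print a read-only query that matches a host in
# nagios_hosts by either alias or display_name.
host_lookup_sql() {
    # Double single quotes so the name is safe inside a SQL string literal
    # (backslashes are not handled; host names should not contain them).
    name=$(printf '%s' "$1" | sed "s/'/''/g")
    printf "SELECT host_id, host_object_id, alias, display_name FROM nagios_hosts WHERE alias = '%s' OR display_name = '%s';\n" "$name" "$name"
}

# Example call, using the same credentials as the command above:
# host_lookup_sql 'win2008r2-01' | mysql --vertical -pnagiosxi nagios
```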
Louis
Posts: 11
Joined: Wed Jan 15, 2014 7:23 pm
Location: Perth, Australia

Re: Host and services still showing but don't exist in CCM

Post by Louis »

tgriep wrote:Another place to look for that host is in the retention.dat file. It may have been corrupted when the system's drive filled up.
Here is where the file is located on the system.

Code: Select all

/usr/local/nagios/var/retention.dat
Open it up and search for that host. If it is in there, stop the nagios process:

Code: Select all

service nagios stop
Remove the entries from that file and save it.
Restart nagios and see if the host is gone.

Code: Select all

service nagios start
Let us know what you find.
So that worked, but the moment I applied config in CCM it came back.
Box293 wrote: Let's look in the database. Using this command I can find the host with the alias win2008r2-01. Can you run it for your host name and see if it appears? Make sure the cAsE is correct. You may need to use display_name instead of alias.

Code: Select all

echo "select * from nagios_hosts where alias like 'win2008r2-01' \G;" | mysql -pnagiosxi nagios
Please post the output.
[root@nagios ~]# echo "select * from nagios_hosts where alias like 'Brisbane CORE Router' \G;" | mysql -pnagiosxi nagios
*************************** 1. row ***************************
host_id: 91939
instance_id: 1
config_type: 1
host_object_id: 227
alias: Brisbane CORE Router
display_name: Brisbane CORE Router
address: <redacted>
check_command_object_id: 54
check_command_args: 3000.0!80%!5000.0!100%
eventhandler_command_object_id: 0
eventhandler_command_args:
notification_timeperiod_object_id: 115
check_timeperiod_object_id: 115
failure_prediction_options:
check_interval: 2
retry_interval: 1
max_check_attempts: 3
first_notification_delay: 0
notification_interval: 60
notify_on_down: 1
notify_on_unreachable: 1
notify_on_recovery: 1
notify_on_flapping: 1
notify_on_downtime: 1
stalk_on_up: 0
stalk_on_down: 0
stalk_on_unreachable: 0
flap_detection_enabled: 1
flap_detection_on_up: 1
flap_detection_on_down: 1
flap_detection_on_unreachable: 1
low_flap_threshold: 0
high_flap_threshold: 0
process_performance_data: 1
freshness_checks_enabled: 0
freshness_threshold: 0
passive_checks_enabled: 1
event_handler_enabled: 1
active_checks_enabled: 1
retain_status_information: 1
retain_nonstatus_information: 1
notifications_enabled: 1
obsess_over_host: 1
failure_prediction_enabled: 1
notes:
notes_url:
action_url:
icon_image: switch.png
icon_image_alt:
vrml_image:
statusmap_image: switch.png
have_2d_coords: 0
x_2d: -1
y_2d: 0
have_3d_coords: 0
x_3d: 0
y_3d: 0
z_3d: 0
Locked