DB not synced issue

Posted: Fri Apr 29, 2011 12:21 pm
by niebais
OS: Centos 5.5
Nagios XI version: Nagios XI 2009R1.4B
Mysql Version: mysql Ver 14.14 Distrib 5.1.52, for redhat-linux-gnu (i686) using readline 5.1

Ok,
Here's a bug we found on our systems. A while back, we went through and made about 250 simultaneous changes and then clicked "Apply configuration". Since then, some hosts have mysteriously kept parts of their old names, like this:

Old name: australia test.ourdomain.com Ping Check
New Name: test.ourdomain.com

Several people have complained about getting notifications with the old name, so I ran a query against the nagios database to see what was happening:
select * from nagios_hosts where display_name like '%Ping%'\G
Sure enough, about 230 hosts still have the words "Ping Check" in them, even though those words have all been removed.

Is there some way I can clean this up easily, or some way I can resync?

Re: DB not synced issue

Posted: Fri Apr 29, 2011 2:01 pm
by admin
A few questions that could help us troubleshoot this...

1. Do these hosts show up in the CCM?

2. Are there multiple (parent) Nagios processes running?

3. Even if hosts/services were deleted from the CCM, the config files that were generated from the CCM database may not have been removed. If that's the case, you should find config files containing the old host/service names in the following directories:

/usr/local/nagios/etc/hosts
/usr/local/nagios/etc/services

If the hosts/services are indeed removed from the CCM database, you should be able to safely remove the "orphaned" config files in these directories and restart Nagios Core.
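
A minimal sketch of how such orphaned files could be located, assuming GNU grep and the two directories listed above. The helper name find_stale_configs and the "Ping Check" search string are only illustrative and are not part of Nagios XI. It only lists matching files, so they can be reviewed before anything is deleted.

```shell
# List .cfg files that still mention an old host/service name.
# find_stale_configs is a hypothetical helper name used for illustration.
# Arguments: the text to search for, then one or more directories.
find_stale_configs() {
    pattern=$1
    shift
    # -r: recurse into the directories
    # -l: print only the names of matching files
    # -F: treat the pattern as a fixed string, not a regex
    # --include: restrict the search to *.cfg files (GNU grep option)
    grep -rlF --include='*.cfg' -- "$pattern" "$@" 2>/dev/null | sort
}

# Example call (paths taken from the post above):
# find_stale_configs "Ping Check" /usr/local/nagios/etc/hosts /usr/local/nagios/etc/services
```

If the call prints nothing, no .cfg file in those directories contains the old text, which points to the stale names living somewhere other than the generated config files.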

Re: DB not synced issue

Posted: Fri Apr 29, 2011 2:18 pm
by niebais
admin wrote: A few questions that could help us troubleshoot this...
1. Do these hosts show up in the CCM?
Yes, but they don't have the "Ping Check" associated with them.
admin wrote: 2. Are there multiple (parent) Nagios processes running?
No, there are not. Also, this problem has been around for about 2 months now.
admin wrote: 3. Even if hosts/services were deleted from the CCM, the config files that were generated from the CCM database may not have been removed. If that's the case, you should find config files containing the old host/service names in the following directories:

/usr/local/nagios/etc/hosts
/usr/local/nagios/etc/services
I removed all the orphaned config files prior to posting this. In fact, I removed all the services and hosts (using rm *.cfg in both directories) and applied our configuration again. I'm posting because the database still has a ton of "Ping Check" entries even though none of our .cfg files contain them.

The ping checks should have all been removed the day we changed 275 hosts and then clicked "Apply configuration".

One clarification: we're not trying to delete the ping checks. We're trying to clean up the "Ping Check" text from our host names, but somehow it's still there.

Re: DB not synced issue

Posted: Fri Apr 29, 2011 4:21 pm
by mguthrie
Which database are you seeing these removed checks in? The 'nagiosql' database is managed by the Core Config Manager, while the 'nagios' database stores status info and is managed by ndoutils.

Re: DB not synced issue

Posted: Mon May 02, 2011 9:48 am
by niebais
mguthrie wrote: Which database are you seeing these removed checks in? The 'nagiosql' database is managed by the Core Config Manager, while the 'nagios' database stores status info and is managed by ndoutils.
It's the "nagios" database. It's not that they were removed; the problem is that the aliases were never changed when we did a massive "apply configuration".

These two fields still contain old values in the database, but not in files:
alias: myserver ping check
display_name: myserver ping check

When we run queries through the API, it returns these values, and notifications use them too. It's a problem because we can't figure out how to remove them either.

Re: DB not synced issue

Posted: Mon May 02, 2011 4:03 pm
by mguthrie
Could you run the following query to see if there are duplicate host entries with those names? If you're able to dump your output to a text file and PM or email it to us that would be helpful as well.

select display_name from nagios_hosts;


I'll have to do some digging on why that didn't update in ndoutils correctly...
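
Once the output is in a text file, duplicates can be picked out with standard tools. This is a rough sketch that assumes the file holds one display_name per line, for example from running the mysql client in batch mode without column headers. The helper name list_duplicate_names and the file name names.txt are made up for illustration.

```shell
# Print every display_name that appears more than once, prefixed by its count.
# list_duplicate_names is a hypothetical helper; input is one name per line.
# A dump could be produced with something like (credentials are assumptions):
#   mysql -N -B -u USER -p -e "select display_name from nagios_hosts" nagios > names.txt
list_duplicate_names() {
    # sort groups identical names together, uniq -c counts each group,
    # awk keeps groups with a count above 1 and prints "count<TAB>name".
    sort -- "$1" | uniq -c | awk '$1 > 1 { count = $1; sub(/^ *[0-9]+ /, ""); print count "\t" $0 }'
}

# Example call:
# list_duplicate_names names.txt
```

Empty output means every display_name in the dump is unique, which would rule out duplicate host rows as the cause.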

Re: DB not synced issue

Posted: Thu May 26, 2011 11:27 am
by admin
Can you provide us with the variables you're using for your notifications?

The display_name field in NDOUtils is currently not exposed to notification variables. However, the host name and alias/description are exposed, through the %hostname% and %hostalias% notification variables.

We'll modify the default notification commands to allow for the %hostdisplayname% variable to be used in future releases.

Re: DB not synced issue

Posted: Fri May 27, 2011 9:59 am
by niebais
We haven't changed any of the notification variables yet; it should all be the standard setup.

Re: DB not synced issue

Posted: Fri May 27, 2011 10:26 am
by admin
Understood - the standard setup doesn't even support referencing the display_name in notification messages. You can use the host alias field in notification messages. We're updating XI to include support for using the display_name variable in notification messages in the next release.

For the current issue you're having, make sure an old alias/description isn't set for the problematic hosts in the CCM.

Re: DB not synced issue

Posted: Tue May 31, 2011 11:55 am
by niebais
admin wrote: Understood - the standard setup doesn't even support referencing the display_name in notification messages. You can use the host alias field in notification messages. We're updating XI to include support for using the display_name variable in notification messages in the next release.

For the current issue you're having, make sure an old alias/description isn't set for the problematic hosts in the CCM.
Yeah, there are no old aliases or descriptions. I've been over that. I sent the configuration up and there's no indication of a problem in that area.