Nagios Hostgroup lost half of its devices?

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Nagios Hostgroup lost half of its devices?

Post by benhank »

Perplexing issues.
1. I had a hostgroup that contained 514 objects. Suddenly it contained only 129. I can verify that nobody made any changes to the system, and I know the time period in which the issue must have happened.
I have done the following:
Got a list of the 514 objects, re-added them to hostgroup.cfg via the command line, and did a

Code: Select all

service nagios restart
The objects then showed up in the hostgroup, but when I did an "Apply Configuration" the numbers dropped again.
So I tried again, this time doing a database repair first. This time a whole bunch of objects I'd deleted came back in the CCM as "unsync'd objects", and the issue persisted.
I then deleted the hostgroup, repeated the database repair, and re-added the objects via the command line. This time I got only 124 objects BEFORE applying the configuration.
So I recreated the hostgroup file and tried to import it in ccm\tools\import.
Nothing was imported.
2. While trying to fix this, I noticed that in the legacy CCM, if I looked at a host's properties, the host would show up as not being a member of the hostgroup. When I looked at the same host in the new CCM, it was listed as a member of the hostgroup. This was the case for about sixty of my hosts.


My concern is that even if I manually re-add all 514 hosts, I don't know how the problem happened in the first place or whether it will happen to any other hostgroups.
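One way to track the member count at each step is to read it straight from the config file rather than from the UI. The following is only a minimal sketch: the function name list_hostgroup_members is hypothetical, and it assumes the standard Nagios object syntax with the members directive on a single line inside the define hostgroup block.

```shell
#!/bin/sh
# list_hostgroup_members FILE GROUP   (hypothetical helper name)
# Prints one member host name per line for the named hostgroup.
# Assumption: "members" is a single comma-separated line inside
# a "define hostgroup { ... }" block.
list_hostgroup_members() {
    awk -v group="$2" '
        # Start of a hostgroup block: reset per-block state.
        /^[ \t]*define[ \t]+hostgroup[ \t]*\{/ {
            in_block = 1; name = ""; members = ""; next
        }
        in_block && $1 == "hostgroup_name" { name = $2 }
        in_block && $1 == "members" {
            members = $0
            sub(/^[ \t]*members[ \t]+/, "", members)   # drop the directive name
            sub(/[ \t\r]+$/, "", members)              # drop trailing whitespace
        }
        # End of block: print the members if this is the requested group.
        in_block && /^[ \t]*\}/ {
            if (name == group) {
                n = split(members, hosts, /[ \t]*,[ \t]*/)
                for (i = 1; i <= n; i++)
                    if (hosts[i] != "") print hosts[i]
            }
            in_block = 0
        }
    ' "$1"
}
```

For example, `list_hostgroup_members hostgroup.cfg my-hostgroup | wc -l` (with a placeholder group name) gives the member count, which can be recorded after the restart and again after Apply Configuration.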
Proudly running:
NagiosXI 5.4.12 2 node Prod Env 2500 hosts, 13,000 services
Nagiosxi 5.5.7(test env) 2500 hosts, 13,000 services
Nagios Logserver 2 node Prod Env 500 objects sending
Nagios Network Analyser
Nagios Fusion
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios Hostgroup lost half of its devices?

Post by mguthrie »

benhank wrote: Gotten a list of the 514 objects, re added them to the hostgroup.cfg via command line and did a : service nagios restart
Note the bold print at the top of the .cfg files that says "Do not manually edit this config file, your changes will be overwritten by NagiosQL" ;)

You have to make the changes in the CCM and then Apply Configuration; otherwise, hand-edits to the config files will be overwritten every time you Apply Configuration. The only place where it's safe to edit configuration files by hand is the /usr/local/nagios/etc/static directory.
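To see exactly which hosts an Apply Configuration drops, it can help to compare a member list saved before the apply with one saved after it. This is a rough sketch, assuming two plain-text files with one host name per line; the function name dropped_hosts and both file names are hypothetical.

```shell
#!/bin/sh
# dropped_hosts BEFORE AFTER   (hypothetical helper name)
# Prints host names that appear in BEFORE but not in AFTER.
# Both inputs are plain lists, one host name per line.
dropped_hosts() {
    before_sorted=$(mktemp)
    after_sorted=$(mktemp)
    # comm needs sorted input; -u also removes duplicate names.
    sort -u "$1" > "$before_sorted"
    sort -u "$2" > "$after_sorted"
    # -23 suppresses lines unique to AFTER and lines common to both.
    comm -23 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}
```

Running `dropped_hosts members_before.txt members_after.txt` would list the hosts that were in the hand-edited file but are missing from the file that the CCM regenerated.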
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: Nagios Hostgroup lost half of its devices?

Post by benhank »

Even if I import the files?
I can take the hit and re-add all the hosts, but I am really concerned about what caused this.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios Hostgroup lost half of its devices?

Post by mguthrie »

You can probably import those configs; that should work. I don't recommend doing it that way on a regular basis for config changes, but it should save you from having to redo that work in the CCM.
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: Nagios Hostgroup lost half of its devices?

Post by benhank »

Now, when I do an "Apply Configuration" it finishes, but when I go back to ccm/hostgroups I am still prompted to apply the configuration for changes to take effect.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Nagios Hostgroup lost half of its devices?

Post by scottwilkerson »

If you go to Admin -> System Profile, are the system time and PHP time synced and correct?
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: Nagios Hostgroup lost half of its devices?

Post by benhank »

Now that you mention it, no. The system and server times are synced, but both are off by 5 minutes.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Nagios Hostgroup lost half of its devices?

Post by scottwilkerson »

Do the timezones and times match in Admin -> System Profile?
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: Nagios Hostgroup lost half of its devices?

Post by benhank »

Code: Select all

Server Port: 80
Date/Time
PHP Timezone: America/New_York
PHP Time: Mon, 10 Dec 2012 13:27:15 -0500
System Time: Mon, 10 Dec 2012 13:27:15 -0500
That's what I have.
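The PHP Time and System Time strings from the System Profile can also be compared mechanically instead of by eye. Below is a small sketch, assuming GNU date (its -d option parses timestamps in the format shown above); the function name clock_skew_seconds is hypothetical. The two values posted here are identical, so they differ by zero seconds; drift against real time would only show up when comparing with a timestamp from an external reference.

```shell
#!/bin/sh
# clock_skew_seconds TIME_A TIME_B   (hypothetical helper name)
# Prints the absolute difference between two timestamps, in seconds.
# Assumption: GNU date is available, since -d is a GNU extension.
clock_skew_seconds() {
    a=$(date -d "$1" +%s) || return 1   # convert to epoch seconds
    b=$(date -d "$2" +%s) || return 1
    diff=$((a - b))
    if [ "$diff" -lt 0 ]; then
        diff=$((-diff))
    fi
    echo "$diff"
}

# Example with the two values from the System Profile output:
# clock_skew_seconds "Mon, 10 Dec 2012 13:27:15 -0500" \
#                    "Mon, 10 Dec 2012 13:27:15 -0500"
```

Because both strings are converted to epoch seconds, a timezone mismatch alone does not count as skew; only a difference in the actual instant does.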
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Nagios Hostgroup lost half of its devices?

Post by scottwilkerson »

So after you Apply Configuration from the CCM, it still says that you need to apply configuration for changes to take effect?

Can you post a screenshot?
Locked