Re: Query on fix [TPS#13871] applied in 5.8.8
Posted: Fri Jan 04, 2019 11:23 am
by bomahony
define service {
    host_name !my-dns02
    service_description DNS: Bind DNS Throughput
    use scv-service
    hostgroup_name dns
    check_command check_nrpe!check_bind_stats!!!!!!!!
    register 1
}
define service {
    service_description HAProxy
    use app-service
    hostgroup_name HAProxy
    check_command check_nrpe!check_haproxy!!!!!!!
    register 1
}
OK, so the DNS one looks like it fails because of the !host exclusion [that will have to be removed, I guess]. I cannot see anything wrong with the other one.
Re: Query on fix [TPS#13871] applied in 5.8.8
Posted: Fri Jan 04, 2019 2:05 pm
by lmiltchev
The exclusions ("!") usually cause the import to fail. We have it mentioned in our official documentation:
example01.PNG
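Since exclusions are a known cause of failed imports, the config files can be scanned for them before they are dropped into the import directory. The following is a minimal sketch, not part of Nagios XI: the function name find_exclusions is hypothetical, and it assumes exclusions only need to be caught in the host_name and hostgroup_name directives.

```python
# Directives assumed (for this sketch) to be the ones that may carry "!" exclusions.
EXCLUDABLE = ("host_name", "hostgroup_name")


def find_exclusions(config_text):
    """Return (line_number, directive, value) for every "!"-prefixed entry."""
    hits = []
    for lineno, raw in enumerate(config_text.splitlines(), start=1):
        # Drop trailing ";" comments and surrounding whitespace.
        line = raw.split(";", 1)[0].strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2 or parts[0] not in EXCLUDABLE:
            continue
        # Values may be comma-separated lists; check each entry on its own.
        for value in parts[1].split(","):
            value = value.strip()
            if value.startswith("!"):
                hits.append((lineno, parts[0], value))
    return hits
```

Run against the DNS definition from the earlier post, this would flag the host_name line only; the "!" separators in check_command are ignored because that directive is not in the checked list.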
As far as importing the second config goes, I haven't been able to recreate the issue in-house. I imported a similar config, but the service didn't get duplicated. Have you checked the app-service template config to make sure that there is nothing weird, e.g. exclusions?
How are you importing the config - via the GUI, API, or just dropping it in the import directory, and running the reconfigure_nagios.sh script?
Re: Query on fix [TPS#13871] applied in 5.8.8
Posted: Mon Jan 07, 2019 8:32 am
by bomahony
Removed the exclusion and that check imports fine.
The app-service template is the same as 70% of the other checks [we only use 3 templates currently] which are all working fine.
I am importing via an ansible play that copies over the configs and runs the import script.
I actually just deleted the whole lot of the HAProxy checks including the original, and ran the import. First one populated. Then we got a second one again on the second import. This is the only check still banjaxxed.
I am going to have a look at another environment later today and see if it also happens there.
Re: Query on fix [TPS#13871] applied in 5.8.8
Posted: Mon Jan 07, 2019 10:49 am
by lmiltchev
"I am importing via an ansible play that copies over the configs and runs the import script."
Can you be more specific? Where are you copying the configs? Which import script are you running? Can you show us an example of the commands that you run from the CLI?
"I am going to have a look at another environment later today and see if it also happens there."
Let us know if you are having the same issue on the second server too.
Re: Query on fix [TPS#13871] applied in 5.8.8
Posted: Mon Jan 07, 2019 11:18 am
by bomahony
Code: Select all
---
- name: Copy Configs for import
  copy:
    src: "{{ item }}"
    dest: /usr/local/nagios/etc/import/
    owner: root
    group: root
    mode: 0755
  with_fileglob: "files/nagiosxi-server/configs/*"

- name: Run config import
  shell: /usr/local/nagiosxi/scripts/reconfigure_nagios.sh
Same issue on the second node [I copied the files over and ran the import the same way, although it did have the HAProxy check already in there too].
Re: Query on fix [TPS#13871] applied in 5.8.8
Posted: Mon Jan 07, 2019 11:33 am
by bomahony
Ok now I am getting a second check duplicating on the first server.
1. Added a hostgroup that existed on the server [called SSH], but not in my configs. This hostgroup had a single check associated with it.
2. Ran import. That check duplicated.
3. Frowning, realised my mistake. Kicked myself.
4. Removed the HG from the configs. Check is still getting duplicated.
5. Ran import to see what happens. Check still duplicates.
The only thing that I think would have been different would be the Hostgroup Alias. This now shows as what I set in the config file, but perhaps there is something different in the DB? I could quite easily have changed that in the HAProxy group a while back also....
Re: Query on fix [TPS#13871] applied in 5.8.8
Posted: Mon Jan 07, 2019 11:43 am
by bomahony
Actually the alias is probably unrelated. On the second node, I added the HG, ran the import [the HG didn't exist] and it is duplicating there also somehow....
Re: Query on fix [TPS#13871] applied in 5.8.8
Posted: Mon Jan 07, 2019 12:22 pm
by lmiltchev
"i copied the files over and ran the import the same way, although it did have the HAProxy check already in there too....."
So, you have the HAProxy service configured already, and now you are "re-importing it" (trying to overwrite the existing one?) by dropping the "new" config in the imports directory, and running the "reconfigure_nagios.sh" script, correct? The issue is that the service gets duplicated, instead of being overwritten (updated). Please, correct me if I am wrong.
Can you show the actual config of the HAProxy service, and the config that you are trying to import, along with all relevant templates, commands, etc.?
Re: Query on fix [TPS#13871] applied in 5.8.8
Posted: Mon Jan 07, 2019 1:03 pm
by bomahony
Apologies for being a bit unclear. The import is exactly the same as what is already there. I am trying to get a generic ansible setup working across multiple environments for ~50K checks. Some of these will be rebuilt on a regular basis. Also, the monitoring environment is still a WIP and I am constantly getting requests for new crud. This means that the service check files [which are roughly split into hostgroups] are constantly evolving.
The original issue was that every check got duplicated when I ran the import. Now it is a case of one check getting duplicated [the HAProxy check] for some unknown reason. This even happened when I deleted everything named HAProxy, imported the config, and then ran the import again, with no changes whatsoever in the file.
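One way to rule out the import files themselves as the source of the duplicate is to check that no service is defined twice across them. This is a rough sketch only: the helper names are hypothetical, and treating service_description plus host_name plus hostgroup_name as the identity of a service is an assumption for illustration, not a statement of how the Nagios XI importer matches objects.

```python
from collections import Counter


def parse_services(config_text):
    """Yield one dict of directives per "define service { ... }" block."""
    current = None
    for raw in config_text.splitlines():
        # Drop trailing ";" comments and surrounding whitespace.
        line = raw.split(";", 1)[0].strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("define"):
            tokens = line.replace("{", " ").split()
            # Only collect service blocks; other object types are skipped.
            current = {} if tokens[1:2] == ["service"] else None
        elif line.startswith("}"):
            if current is not None:
                yield current
            current = None
        elif current is not None:
            parts = line.split(None, 1)
            current[parts[0]] = parts[1].strip() if len(parts) == 2 else ""


def duplicate_services(config_text):
    """Return the assumed identity keys that appear more than once."""
    keys = Counter(
        (
            s.get("service_description", ""),
            s.get("host_name", ""),
            s.get("hostgroup_name", ""),
        )
        for s in parse_services(config_text)
    )
    return sorted(key for key, count in keys.items() if count > 1)
```

Feeding it the concatenated contents of every file in the import directory and getting an empty list back would suggest the duplication happens during the import rather than in the files.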
[For some reason the SSH one is now duplicating as well, but tbh that can wait].
This is not critically urgent. Even when duplicated, the checks don't actually run, as they just stay in CCM, where I delete them every few runs. I would like to get to the bottom of it, but it is not preventing me from continuing with the environment.
I can provide you with a System Profile and a tarball of the actual config files to import if you like [albeit I am leaving the office shortly, so it will probably be Wed before I get back to this].
And thanks for all the support. The issue is about 98% resolved for me anyways.
Re: Query on fix [TPS#13871] applied in 5.8.8
Posted: Mon Jan 07, 2019 2:31 pm
by lmiltchev
Sounds good! You can PM me the profile whenever you have a chance. We will try to recreate the issue in-house and find a solution to it. Thanks!