5.8.1 service check oddness

Posted: Fri Apr 16, 2021 10:50 am
by gwakem
Hi all,
I'm running XI 5.8.1 on RHEL 7.7. The situation I am seeing is as follows (this one is new to me):

Host and service checks on server A and server B
Host checks on Server X (a VIP that floats between server A and B)
I moved the service checks from server A to server X, applied.

Here's the weirdness:
Inside CCM Services: Server A no longer shows the service checks that were moved, cool. Server X now shows them in the CCM Services area. Great!
Used the "View Config" button on Server X's services: everything looks good, four moved service checks.
Used the "View Config" button on Server A's services: Exactly as I would expect, none of the previously moved service checks.
In the web interface, the service checks still show up under server A. They ALSO show up under server X. .....what.

I have checked the host definitions to ensure all hosts (A, B, and X) are looking at the correct IP addresses and are labelled correctly in both the hosts and services areas.
The web interface simply disagrees with this. I am wondering if something got "stuck" in the database somewhere.

What information can I provide to help troubleshoot this issue?

Re: 5.8.1 service check oddness

Posted: Fri Apr 16, 2021 3:50 pm
by dchurch
If you PM me a system profile I can diagnose further. Get one by going to Admin (top menu) => System Profile (in the left menu), then clicking the blue button.

If you're unable to generate the profile through the web interface, please try generating it from the command line by running these commands as root:

Code:

rm -rf /usr/local/nagiosxi/var/components/profile*
/usr/local/nagiosxi/scripts/components/getprofile.sh SUPPORT

Then send me the resulting /usr/local/nagiosxi/var/components/profile.zip file.
If the profile script fails, please include the ENTIRE output.

Re: 5.8.1 service check oddness

Posted: Mon Apr 19, 2021 7:31 am
by gwakem
I PM'd over the web interface zip. Let me know if I can provide anything else to assist.

Re: 5.8.1 service check oddness

Posted: Mon Apr 19, 2021 11:06 am
by dchurch
Profile received!

...That's a lot of hosts. In your previous message, what were the actual names of "Server A," "Server B," and "Server X"? You can PM them to me if the server names are sensitive information.

Re: 5.8.1 service check oddness

Posted: Tue Apr 20, 2021 6:42 am
by gwakem
Sent! Thanks!

Re: 5.8.1 service check oddness

Posted: Tue Apr 20, 2021 12:08 pm
by dchurch
In the database dump (included in the profile you sent), I didn't find what you described:
gwakem wrote:I moved the service checks from server A to server X, applied.
What I found was only ONE service applied to Server X, and 34 applied to Server A. This makes me think the system is glitching out when applying your configuration. The database doesn't look like it's having any errors, so I can't blame that.

Instead of using Apply Config, go to Config (top menu) => Core Config Manager => Config File Management (left menu), then run Delete, Write, and Verify, in that order. This will often reveal more problems, with better error messages, which is tremendously helpful when diagnosing config problems. (This is actually the process that the "Apply Config" button performs under the hood.)

I've also used this method on my own servers to push through a stubborn config change that wouldn't take effect when I used Apply Configuration.
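
The Verify step can also be approximated from a shell, which sometimes makes the full error output easier to read or save. This is a minimal sketch, assuming the default install paths /usr/local/nagios/bin/nagios and /usr/local/nagios/etc/nagios.cfg (neither path is given in this thread, and verify_config is just an illustrative helper name):

```shell
#!/bin/sh
# Assumed default paths for a Nagios XI install; override via the
# environment if your install differs.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

verify_config() {
    # -v asks Nagios Core to check the written config files without
    # restarting the daemon; it exits non-zero if errors are found.
    "$NAGIOS_BIN" -v "$NAGIOS_CFG"
}
```

Running `verify_config` after the Write step prints any errors directly to the terminal. It only checks the files on disk; it does not write them.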

Re: 5.8.1 service check oddness

Posted: Mon Apr 26, 2021 6:30 am
by gwakem
It has been nearly 10 years since I used that feature, and I had forgotten it existed since everything has run so smoothly. Thank you for pointing me at that. For anyone who finds this with the same issue in the future: the root cause was service groups that were hanging on to the definitions that got moved. Once I corrected those (as pointed to by the delete && write && verify steps), everything cleared up.
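
For readers chasing the same symptom, one way to spot a service group that still references the old host is to search the written service group definitions for "members" lines naming that host. This is a rough sketch, assuming the definitions live in /usr/local/nagios/etc/servicegroups.cfg and use the standard "members host,service,host,service" layout; the helper name find_servicegroup_refs is made up for illustration:

```shell
#!/bin/sh
# Print (with line numbers) every servicegroup "members" line that names
# the given host. The default file path is an assumption; pass another
# path as the second argument if your config is written elsewhere.
find_servicegroup_refs() {
    host="$1"
    cfg="${2:-/usr/local/nagios/etc/servicegroups.cfg}"
    # The host is matched at the start of the member list or right after
    # a comma, and must be followed by a comma (host,service pairs).
    # Note: dots in the host name are treated as regex wildcards.
    grep -n -E "^[[:space:]]*members[[:space:]]+(.*,[[:space:]]*)?${host}," "$cfg"
}
```

For example, `find_servicegroup_refs serverA` lists the member lines to review in CCM under Service Groups. It does not change anything itself.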

As always, I appreciate all of your help!

This can be locked.

Re: 5.8.1 service check oddness

Posted: Mon Apr 26, 2021 7:14 am
by scottwilkerson
Locking thread