Possible Bug or Room for Improvement

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
J.A.K
Posts: 103
Joined: Wed Aug 05, 2020 11:39 am

Possible Bug or Room for Improvement

Post by J.A.K »

We do all our host/service adds via the API. Because we often add hosts without a contact group at first, we use the force parameter to add them and then apply config afterward. One issue: if you force-add a host and a setting is wrong (for example, a typo in a selected host group), applying config fails but gives very little to explain why. The "Show Errors" output is blank:
[Attachment: Capture.PNG]
Verifying the cfg also comes back successful, and the host appears alongside the other hosts with a status of applied. Normally we just sort by ID and delete the last few hosts, but some explanation of the issue during apply config would be very helpful.
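
One way to catch a host group typo before the forced add is a local pre-flight check. The following is only a sketch under assumptions: `check_hostgroups` is a hypothetical helper, and it assumes the valid host group names have already been exported to a text file, one per line. It does not call the XI API.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of Nagios XI): confirm that every
# host group requested for a new host appears in a known-good list.
# $1: file with one valid host group name per line (assumed to exist)
# $2: comma-separated host groups requested for the new host
check_hostgroups() {
    local known_groups_file=$1
    local requested=$2
    local group missing=0
    local IFS=','
    for group in $requested; do
        # -F fixed string, -x whole-line match, -q quiet
        if ! grep -Fxq -- "$group" "$known_groups_file"; then
            echo "unknown host group: $group" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_hostgroups groups.txt "linux-servers,web"` before the API request returns non-zero and names the offending group if any entry is missing from the list.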
dchurch
Posts: 858
Joined: Wed Oct 07, 2020 12:46 pm
Location: Yo mama

Re: Possible Bug or Room for Improvement

Post by dchurch »

When you do an Apply Config from the list, it's really performing four steps:

1. Delete Config
2. Write Config
3. Verify Config
4. Restart Nagios Core (sometimes called the Monitoring Engine)

If you go to Configure (top menu) => Core Config Manager => Tools (left menu) => Config File Management, you can trigger these steps individually and perhaps gain more insight into what went wrong with the configuration.
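
The Verify step can also be approximated from a shell. This is a minimal sketch, not what XI itself runs internally: it assumes the Nagios Core binary and main config file live at the default paths shown (override `NAGIOS_BIN` and `NAGIOS_CFG` if yours differ), and `verify_config` is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# Assumed default locations; adjust for your install.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Run Nagios Core's config verification (-v). On failure, print only the
# Error:/Warning: lines, falling back to the full output if none match.
verify_config() {
    local output status
    output=$("$NAGIOS_BIN" -v "$NAGIOS_CFG" 2>&1)
    status=$?
    if [ "$status" -ne 0 ]; then
        printf '%s\n' "$output" | grep -E '^(Error|Warning):' >&2 \
            || printf '%s\n' "$output" >&2
    fi
    return "$status"
}
```

Running `verify_config` after a failed apply prints whatever the verification pass reported, independent of what the web interface shows.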

If you PM me a system profile, I can investigate further as to why the output was blank. It may very well be blank due to a bug we can fix.
If you didn't get an 8% raise over the course of the pandemic, you took a pay cut.

Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
J.A.K
Posts: 103
Joined: Wed Aug 05, 2020 11:39 am

Re: Possible Bug or Room for Improvement

Post by J.A.K »

The Config File Management was actually where I noted I was going to verify the cfg file to see if it showed errors. Comes through clean no issues.

Would you like a system profile when it's in a broken state where apply doesn't work? I can create a host with a bogus hostgroup real quick if you like; it's pretty easy to replicate.
dchurch
Posts: 858
Joined: Wed Oct 07, 2020 12:46 pm
Location: Yo mama

Re: Possible Bug or Room for Improvement

Post by dchurch »

J.A.K wrote: Would you like a system profile when it's in a broken state where apply doesn't work?
Yes. A profile would have in it the XI database entries it uses to create the cfg files, as well as logs of what went wrong (among other useful things).

Get one by going to Admin (top menu) => System Profile (in the left menu), then clicking the blue button. If you're unable to generate the profile through the web interface, please try generating it from the command line by running these commands as root:

Code:

rm -rf /usr/local/nagiosxi/var/components/profile*
/usr/local/nagiosxi/scripts/components/getprofile.sh SUPPORT
Then send me the resulting /usr/local/nagiosxi/var/components/profile.zip file.
If the profile script fails, please include the ENTIRE output.
J.A.K wrote: The Config File Management was actually where I noted I was going to verify the cfg file to see if it showed errors. Comes through clean no issues.
Weird. It may very well be a bug then, but I won't know until I lab it up using your system profile.
J.A.K
Posts: 103
Joined: Wed Aug 05, 2020 11:39 am

Re: Possible Bug or Room for Improvement

Post by J.A.K »

Sent the system profile. Running the command does show the verify error: the host has an incorrect template on it. But the "Show Errors" section in Apply Config just comes back blank when you click it after a failed apply, so "Show Errors" appears to be reporting nothing. (I'm not sure how it's supposed to look, but I assume it should contain the output from the failed verify?)
dchurch
Posts: 858
Joined: Wed Oct 07, 2020 12:46 pm
Location: Yo mama

Re: Possible Bug or Room for Improvement

Post by dchurch »

Looks like it's having problems writing to the checkpoint directory. Every time you apply config, it snapshots the current config before replacing it. My guess is that because the permissions prevent that snapshot, the apply fails.

What's the output from the following command?

Code:

ls -la /usr/local/nagiosxi/nom/checkpoints/nagioscore{,/errors}
Everything under those directories is supposed to be owned by nagios:nagios, so you can go ahead and fix that if it's wrong.
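
To list only the entries with the wrong owner rather than reading through `ls` output, something like the following could work. It is a sketch: `list_wrong_owner` is a hypothetical helper, and the `nagios` user and group defaults come from the ownership described above. `find` will report an error if the named user or group does not exist on the machine.

```shell
#!/usr/bin/env bash
# Print every path under $1 whose owner or group differs from the expected
# values. Read-only: nothing is changed.
# $2: expected user (name or numeric ID), default nagios
# $3: expected group (name or numeric ID), default nagios
list_wrong_owner() {
    local dir=$1 user=${2:-nagios} group=${3:-nagios}
    find "$dir" \( ! -user "$user" -o ! -group "$group" \) -print
}
```

For example, `list_wrong_owner /usr/local/nagiosxi/nom/checkpoints/nagioscore` prints nothing when everything under that directory is owned by nagios:nagios.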
J.A.K
Posts: 103
Joined: Wed Aug 05, 2020 11:39 am

Re: Possible Bug or Room for Improvement

Post by J.A.K »

For some reason it looks like errors alone is set to root:root. Let me change that to nagios:nagios and try again.

Is this folder one of the ones that "File Permissions Check" in the GUI should check? Or is there some way to list which folders should be set to what? Checkpoints itself is set to root:root for me, but I don't want to do a recursive chown since I know some folders should be owned by Apache instead of nagios.
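
A change limited to one directory, with no recursion, could be sketched like this. `fix_owner` is a hypothetical helper, and it defaults to a dry run that only prints the command so nothing changes until `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Change ownership of a single path only (no -R), so sibling directories that
# belong to other accounts are left alone.
# DRY_RUN=1 (the default here) prints the command instead of running it.
# $1: path to change
# $2: owner:group, default nagios:nagios
fix_owner() {
    local target=$1 owner=${2:-nagios:nagios}
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf 'chown %s %s\n' "$owner" "$target"
    else
        chown "$owner" "$target"
    fi
}
```

Running `fix_owner /usr/local/nagiosxi/nom/checkpoints/nagioscore/errors` shows the command first; rerunning as root with `DRY_RUN=0` applies it to that one directory.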
J.A.K
Posts: 103
Joined: Wed Aug 05, 2020 11:39 am

Re: Possible Bug or Room for Improvement

Post by J.A.K »

And setting errors from root to nagios fixed it.
[Attachment: MicrosoftTeams-image.png]
Recommendation to pass back to the XI team: go to black text on red instead of white, lol. I tweaked the CSS above just to show it's much more legible.
dchurch
Posts: 858
Joined: Wed Oct 07, 2020 12:46 pm
Location: Yo mama

Re: Possible Bug or Room for Improvement

Post by dchurch »

J.A.K wrote: For some reason it looks like errors alone is set to root:root. Let me change that to nagios:nagios and try again.
Does applying the configuration work without error now?
J.A.K wrote: Is this folder one of the ones that "File Permissions Check" in the GUI should check?
It may be advantageous to check these permissions periodically and flag problems to the user, so in response to your question: yes, it probably should.
Or at the very least, it should give a better indication of what went wrong when applying the config.

I can submit a feature request on your behalf if you'd like. Please keep in mind that the decision to implement the enhancement is at the discretion of our development team. I know that the config application step ducks from PHP to shell to C and back, so it might not be easy from a development standpoint to get the error propagated back up to the web interface.
J.A.K
Posts: 103
Joined: Wed Aug 05, 2020 11:39 am

Re: Possible Bug or Room for Improvement

Post by J.A.K »

Yep, I mentioned it in the previous post: setting the permissions you pointed out as incorrect back to nagios fixed it, so I believe you've got me settled.

If you could submit that as feedback, absolutely. I understand it might just go on the back burner, but I think expanding that permission check to a few more areas would really help prevent situations like this for other users in the future.