We are unable to acknowledge alarms or schedule downtime for them in Nagios XI 5.4.8.
The acknowledgement form lets our technicians enter a reason and submit it, but no comment or acknowledgement appears in the interface under the element, and the status remains an unhandled alarm. Scheduling downtime produces the same result. Neither action affects the element (host/service), and the changes are not reflected on the respective Incident Management pages.
Even though the audit log shows these commands are issued, no action is taken. Oddly enough, we're experiencing this on more than one XI server.
Please advise,
Unable to acknowledge or schedule downtime
Nagios XI 2024R2.2.1 (8 Servers)
Nagios Fusion 2024R1.0.2
-
dwhitfield
- Former Nagios Staff
- Posts: 4583
- Joined: Wed Sep 21, 2016 10:29 am
- Location: NoLo, Minneapolis, MN
- Contact:
Re: Unable to acknowledge or schedule downtime
This could be a permissions issue, or a db issue, or a performance issue...so it could be different things on different XI servers. The forum won't allow but a couple of attachments at a time. Since there are multiple XI servers involved, this might be a good candidate for a ticket: https://support.nagios.com/tickets/
Happy to try to work on this through the forum though. The first thing to try is to work through https://assets.nagios.com/downloads/nag ... tabase.pdf
If you have postgres for the nagiosxi database on any of those servers, then you'll want to run a vacuum on postgres.
Can you PM me the profiles from the non-working systems? You can download a profile by going to Admin > System Config > System Profile and clicking the ***Download Profile*** button towards the top. If for whatever reason you *cannot* download the profile, please put the output of View System Info (5.3.4+, Show Profile if older) in the thread (that will at least get us some info). The profile gives us access to many of the logs we would otherwise ask for individually. If security is a concern, you can unzip the profile, take out what you like, and then zip it up again. We may end up needing something you remove, but we can ask for that specifically.
You can also generate a profile manually using the script at /usr/local/nagiosxi/html/includes/components/profile/getprofile.sh
That should generate a profile in /usr/local/nagiosxi/var/components/ which you can get off the server with an application such as FileZilla.
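Once the script has run, something like the following could confirm that an archive actually landed in that directory before you try to copy it off. This is only a minimal sketch: the directory is the one named above, but the `*.zip` pattern and the helper name `newest_profile` are assumptions for illustration, not part of XI.

```python
from pathlib import Path
from typing import Optional

# Output directory named in the post above.
PROFILE_DIR = Path("/usr/local/nagiosxi/var/components")


def newest_profile(directory: Path = PROFILE_DIR, pattern: str = "*.zip") -> Optional[Path]:
    """Return the most recently modified file matching pattern, or None.

    The "*.zip" pattern is an assumption; adjust it if your profile
    archive is named differently.
    """
    if not directory.is_dir():
        return None
    candidates = [p for p in directory.glob(pattern) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    found = newest_profile()
    print(found if found else f"No profile archive found in {PROFILE_DIR}")
```

If it prints a path, that is the file to pull down with FileZilla or a similar tool.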
After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.
If you get an error that PROFILE BUILD FAILED, please see https://support.nagios.com/kb/article.p ... ategory=44
Re: Unable to acknowledge or schedule downtime
I've PMed you a system profile from one XI host for your review. Hopefully something stands out and we can address the other hosts one by one. Most likely a DB issue.
Nagios XI 2024R2.2.1 (8 Servers)
Nagios Fusion 2024R1.0.2
Re: Unable to acknowledge or schedule downtime
@dwhitfield is out of the office today and we don't have access to his personal email box so we cannot access the System Profile.
Can you post it instead?
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Unable to acknowledge or schedule downtime
tgriep wrote: @dwhitfield is out of the office today and we don't have access to his personal email box so we cannot access the System Profile. Can you post it instead?
I'd rather not post it publicly. I'll PM it to you momentarily. Also, FYI, your ticketing system wouldn't allow attachments with Firefox when I tried to submit that way yesterday.
Nagios XI 2024R2.2.1 (8 Servers)
Nagios Fusion 2024R1.0.2
Re: Unable to acknowledge or schedule downtime
I received the profile and put it in the share so we can access it.
Re: Unable to acknowledge or schedule downtime
tgriep wrote: I received the profile and put it in the share so we can access it.
Where are we at with this support request?
Nagios XI 2024R2.2.1 (8 Servers)
Nagios Fusion 2024R1.0.2
-
dwhitfield
Re: Unable to acknowledge or schedule downtime
I notice you have postgres. Did you run the vacuum as I suggested? Although the article is about a different issue, you can find very complete instructions for running a vacuum at https://support.nagios.com/kb/article.php?id=25 if you scroll down a bit.
If the vacuum doesn't resolve the issue, please let us know.
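If you would rather script the vacuum than type it at a prompt, here is a rough sketch. It assumes the database is named `nagiosxi` and that a role of the same name can connect through `psql` without a password prompt (both are assumptions about your install), and it is not a substitute for the fuller steps in the KB article. The function names are made up for this example.

```python
import subprocess
from typing import List


def build_vacuum_command(dbname: str = "nagiosxi",
                         user: str = "nagiosxi",
                         analyze: bool = False) -> List[str]:
    """Build a psql invocation that runs VACUUM (optionally with ANALYZE).

    The default dbname/user are assumptions; change them to match your server.
    """
    sql = "VACUUM ANALYZE;" if analyze else "VACUUM;"
    return ["psql", "-U", user, "-d", dbname, "-c", sql]


def run_vacuum(dry_run: bool = True, **kwargs) -> int:
    """Print the command; execute it only when dry_run is False."""
    cmd = build_vacuum_command(**kwargs)
    print(" ".join(cmd))
    if dry_run:
        return 0
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Dry run by default so nothing touches the database until you opt in.
    run_vacuum(dry_run=True)
```

Run it with the dry run first to see the exact command, then call `run_vacuum(dry_run=False)` on an affected server.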
Re: Unable to acknowledge or schedule downtime
dwhitfield wrote: I notice you have postgres. Did you run the vacuum as I suggested? Although the article is about a different issue, you can find very complete instructions for running a vacuum at https://support.nagios.com/kb/article.php?id=25 if you scroll down a bit. If the vacuum doesn't resolve the issue, please let us know.
No, I did not; I've been waiting for a review of the profile. To my understanding, Nagios XI has three databases associated with its installation, nagiosxi (Postgres) being one of them. As we utilize an off-loaded MySQL database configuration for the other two, I didn't want to head down the wrong path.
In preparation to run the vacuum, I again tested the acknowledge and schedule downtime on the affected systems and these functions are working again. This raises some questions.
1. Is there a maintenance task which could have automatically run and addressed the issue?
2. What causes the Postgres DB to behave in such a manner and why does a vacuum (reclaim storage) fix it?
3. Should we still run a vacuum on the Postgres Databases or continue to monitor the potential non-issue?
4. Can you briefly explain what is stored in the various databases (nagiosxi, ndoutils, nagiosql)?
Thank you,
Nagios XI 2024R2.2.1 (8 Servers)
Nagios Fusion 2024R1.0.2
-
dwhitfield
Re: Unable to acknowledge or schedule downtime
1. Possibly. There is a dbmaint task that runs...but it runs every 5 minutes. It could be that there was a day where there were a lot of db entries, and in the intervening time that got cleared. The maintenance is not very smart. It just clears out old stuff and sometimes that fixes things "magically", like presumably you are seeing. Aside from the space issue and performance issues, that's one reason it's set to run.
2. Well, sometimes a straight vacuum doesn't fix it, which is why those instructions are so long. This is not postgres specific. In fact, mysql is much worse in this regard, or at least the MyISAM storage engine.
3. I'd just say if it gets stuck again keep it as a trouble-shooting tool. Seems like a waste of time to me if you aren't having issues.
4.
A. nagiosxi is all of the stuff specific to XI (by which I mean, Core doesn't see it). Things like the XI users.
B. nagios (the ndoutils database) is used for display.
C. nagiosql is the CCM db.