Something seems to have happened to one of the 4 XI installs I am working on at the moment. On one of them, I can no longer ACK any alerts since the 5th [ish; that was the last one].
It shows something about connecting to "go.nagios.com"?
I rebooted the system and it is the same.
Can no longer ACK alerts
Re: Can no longer ACK alerts
I have another XI node that is almost identical in another DC, that works fine. I updated both to 5.5.7 today.
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Can no longer ACK alerts
@bomahony, Please run through the following commands and let me know if that resolves the issue:
service nagios stop
service ndo2db stop
rm /usr/local/nagios/var/retention.dat
mv /usr/local/nagios/var/ndo2db.lock /usr/local/nagios/var/ndo2db.lock.bak
mv /usr/local/nagios/var/ndo.sock /usr/local/nagios/var/ndo.sock.bak
service ndo2db start
service nagios start
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Can no longer ACK alerts
Will do. I'm on leave until Tuesday and will do it then. FYI, I did reboot the VM, so I don't know whether that would have done most of that?
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Can no longer ACK alerts
@bomahony, The reboot should do most of it, except if you had a crashed ndo2db process that left the ndo.sock and ndo2db.lock files behind. So if you reboot on Tuesday and the problem is still there, go ahead and run these commands anyway.
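For reference, a minimal sketch of how to check for leftover ndo2db files before moving them. The function name stale_ndo_files is hypothetical (it is not part of Nagios XI), and the directory is the default one used in the commands above:

```shell
# Hypothetical helper: print any ndo2db lock/socket files found in a directory.
# These files only count as stale when no ndo2db process is running.
stale_ndo_files() {
    dir="$1"
    for f in "$dir/ndo2db.lock" "$dir/ndo.sock"; do
        # -e is true for regular files and sockets alike
        if [ -e "$f" ]; then
            echo "$f"
        fi
    done
}
```

With the services stopped, running `stale_ndo_files /usr/local/nagios/var` should print nothing after a clean shutdown; any path it prints is a candidate for the mv commands above.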
Re: Can no longer ACK alerts
Same issue.
The lock & sock files didn't exist, as the service shut down cleanly.
The removal of retention.dat has cleared all the previous acks. It also didn't recreate the file when I started the services.
After a reboot it seems to have rebuilt the retention.dat at 35M with all the previous data? So I restarted the services and deleted retention.dat again.
I tried different browsers, both as nagiosadmin and as my own [admin] user.
However, Mass ACK seems to work? [Which then recreated the retention.dat file again]
Re: Can no longer ACK alerts
It seems that when I click on "Network Outages" I also get the error "Unable to parse XML output".
I am going to compare permissions on the two instances.
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Can no longer ACK alerts
@bomahony, Could you send me a system profile from the problematic XI instance?
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and send it to me in a personal message.
After you send me the profile, please post something in this thread to bring it back up in the support queue.
Also, please open the /etc/init.d/nagios script and make sure that the following lines point to the files in the correct locations:
prefix=/usr/local/nagios
NagiosStatusFile=${prefix}/var/status.dat
NagiosRetentionFile=${prefix}/var/retention.dat
NagiosCommandFile=${prefix}/var/rw/nagios.cmd
NagiosRunFile=${prefix}/var/nagios.lock
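As a rough sketch, the four locations above can be checked in one pass. The function name check_nagios_paths is hypothetical, and the relative paths are copied from the lines above; note that retention.dat may legitimately be missing right after it has been deleted.

```shell
# Hypothetical helper: report whether each file named in the init script
# exists under the given prefix (for example /usr/local/nagios).
check_nagios_paths() {
    prefix="$1"
    for rel in var/status.dat var/retention.dat var/rw/nagios.cmd var/nagios.lock; do
        # -e also matches the named pipe used for nagios.cmd
        if [ -e "$prefix/$rel" ]; then
            echo "ok $prefix/$rel"
        else
            echo "missing $prefix/$rel"
        fi
    done
}
```

Running `check_nagios_paths /usr/local/nagios` while Nagios is up prints one line per file, so a wrong prefix or a moved directory shows up as "missing".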
Re: Can no longer ACK alerts
Ok. I think this may have been an old issue. I moved XI to its own FS at the start of the month. Of course I was a dope and screwed up the root directory ownership. I had root:root instead of apache:nagios.
So, where previously I had a few seconds' wait while it was trying to do stuff when I ACK'd, now it is immediate.
But it still doesn't ACK for some reason?
Will send the stuff on tomorrow. Just bailing out the door now!
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Can no longer ACK alerts
@bomahony, That's good to know. Yeah, this is likely a permissions problem then. You can run this script as root:
/usr/local/nagiosxi/scripts/reset_config_perms.sh
Also, when you come back, please send the permissions for all the files that I listed earlier and all their parent directories.
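One possible way to collect those permissions is sketched below. It assumes GNU stat (the -c format option), and show_path_perms is a hypothetical helper name, not something shipped with Nagios XI:

```shell
# Hypothetical helper: print owner:group, octal mode and name for a path
# and for each of its parent directories, up to the filesystem root.
show_path_perms() {
    p="$1"
    while :; do
        # Assumes GNU stat; a missing path is reported instead of aborting
        stat -c '%U:%G %a %n' "$p" 2>/dev/null || echo "missing: $p"
        # Stop at the root (or at "." if a relative path was given)
        if [ "$p" = "/" ] || [ "$p" = "." ]; then
            break
        fi
        p=$(dirname "$p")
    done
}
```

For example, `show_path_perms /usr/local/nagios/var/rw/nagios.cmd` lists the command file first and then each directory above it, which makes a root:root directory in the chain easy to spot. On systems with util-linux, `namei -l` on the same path gives a similar listing.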