Can no longer ACK alerts

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
bomahony
Posts: 133
Joined: Wed Jul 04, 2018 10:46 am

Can no longer ACK alerts

Post by bomahony »

Something seems to have happened to one of the 4 XI installs I am working on at the moment. On that one, I can no longer ACK any alerts. The last one that worked was around the 5th.

It shows something about connecting to "go.nagios.com"?

I rebooted the system and it is the same.
bomahony
Posts: 133
Joined: Wed Jul 04, 2018 10:46 am

Re: Can no longer ACK alerts

Post by bomahony »

I have another, almost identical XI node in another DC, and that one works fine. I updated both to 5.5.7 today.
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Can no longer ACK alerts

Post by npolovenko »

@bomahony, please run through the following commands and let me know if that resolves the issue:
service nagios stop
service ndo2db stop
rm /usr/local/nagios/var/retention.dat
mv /usr/local/nagios/var/ndo2db.lock /usr/local/nagios/var/ndo2db.lock.bak
mv /usr/local/nagios/var/ndo.sock /usr/local/nagios/var/ndo.sock.bak
service ndo2db start
service nagios start
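A guarded variant of the two mv steps, as a minimal sketch: `backup_if_present` is a hypothetical helper (not something shipped with Nagios XI) that renames a leftover file to `.bak` only when it exists and reports what it found. The paths in the usage comments are the ones from the commands above.

```shell
#!/bin/sh
# Hypothetical helper, not part of Nagios XI: rename a leftover file to
# <name>.bak only when it exists, and report what was found.
backup_if_present() {
    if [ -e "$1" ]; then
        mv -- "$1" "$1.bak" && echo "moved: $1"
    else
        echo "not present: $1"
    fi
}

# Intended use, as root, after "service nagios stop" and "service ndo2db stop":
#   backup_if_present /usr/local/nagios/var/ndo2db.lock
#   backup_if_present /usr/local/nagios/var/ndo.sock
```

A "not present" line for both files would suggest ndo2db shut down cleanly and left nothing behind.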
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
bomahony
Posts: 133
Joined: Wed Jul 04, 2018 10:46 am

Re: Can no longer ACK alerts

Post by bomahony »

Will do. I'm on leave until Tuesday and will do it then. FYI, I did reboot the VM, so I don't know if that would have done most of that?
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Can no longer ACK alerts

Post by npolovenko »

@bomahony, The reboot should do most of it, unless a crashed ndo2db process left the ndo.sock and ndo2db.lock files behind. So if you reboot on Tuesday and the problem is still there, go ahead and run these commands anyway.
bomahony
Posts: 133
Joined: Wed Jul 04, 2018 10:46 am

Re: Can no longer ACK alerts

Post by bomahony »

Same issue.
The lock & sock files didn't exist as the service shut down cleanly.

The removal of retention.dat has cleared all the previous acks. It also didn't recreate the file when I started the services.

After a reboot it seems to have rebuilt retention.dat at 35M with all the previous data? So I restarted services and deleted retention.dat again.
I tried different browsers, both as nagiosadmin and as my own [admin] user.

However, Mass ACK seems to work? [Which then recreated the retention.dat file again]
bomahony
Posts: 133
Joined: Wed Jul 04, 2018 10:46 am

Re: Can no longer ACK alerts

Post by bomahony »

It also seems that when I click on "Network Outages" I get an error: "Unable to parse XML output".

I am going to compare permissions on the two instances.
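One way such a comparison could be done, as a rough sketch: `perm_listing` is a hypothetical helper that prints mode, owner:group and path for a directory and its immediate children, sorted by path so the output from two instances can be diffed. It assumes GNU find (for `-printf`); the directory in the usage comment is only an example taken from the paths mentioned earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: print "mode owner:group path" for a directory and its
# immediate children, sorted by path so two listings can be diffed.
# Assumes GNU find (-printf).
perm_listing() {
    find "$1" -maxdepth 1 -printf '%m %u:%g %p\n' | sort -k3
}

# Intended use on each instance, then diff the two output files:
#   perm_listing /usr/local/nagios/var > /tmp/perms.$(hostname).txt
```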
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Can no longer ACK alerts

Post by npolovenko »

@bomahony, Could you send me a system profile from the problematic XI instance?
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and send it to me in a personal message.
After you send me the profile, please post something in this thread to bring it back up in the support queue.

Also, please open the /etc/init.d/nagios script and make sure that the following lines point to the files in the correct locations:
prefix=/usr/local/nagios
NagiosStatusFile=${prefix}/var/status.dat
NagiosRetentionFile=${prefix}/var/retention.dat
NagiosCommandFile=${prefix}/var/rw/nagios.cmd
NagiosRunFile=${prefix}/var/nagios.lock
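A quick way to pull just those assignments out of the script for checking by eye, as a minimal sketch: `show_init_paths` is a hypothetical helper name, and the pattern only covers the five variables listed above.

```shell
#!/bin/sh
# Hypothetical helper: print the path-related assignments from an init script
# (prefix and the four Nagios*File variables) so they can be checked by eye.
show_init_paths() {
    grep -E '^[[:space:]]*(prefix|Nagios(Status|Retention|Command|Run)File)=' "$1"
}

# Intended use:
#   show_init_paths /etc/init.d/nagios
```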
bomahony
Posts: 133
Joined: Wed Jul 04, 2018 10:46 am

Re: Can no longer ACK alerts

Post by bomahony »

Ok. I think this may have been an old issue. I moved XI to its own FS at the start of the month. Of course I was a dope and screwed up the root directory ownership. I had root:root instead of apache:nagios.

So where previously there was a few seconds' wait while it was trying to do stuff when I ACK'd, now it is immediate.
But it still doesn't ACK for some reason?

Will send the stuff on tomorrow. Just bailing out the door now!
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Can no longer ACK alerts

Post by npolovenko »

@bomahony, That's good to know. Yeah, this is likely a permissions problem then. You can run this script as root:
/usr/local/nagiosxi/scripts/reset_config_perms.sh
Also, when you come back, please send the permissions for all the files that I listed earlier and for all of their parent directories.
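One possible way to collect that, as a minimal sketch: `perm_chain` is a hypothetical helper that prints mode, owner:group and path for a file and then for each parent directory up to /. It assumes GNU stat (for `-c`); the paths in the usage comment are the ones from the init script lines listed earlier.

```shell
#!/bin/sh
# Hypothetical helper: print "mode owner:group path" for a file and every
# parent directory up to /. Assumes GNU stat (-c).
perm_chain() {
    p=$1
    while :; do
        stat -c '%a %U:%G %n' "$p"
        # Stop at the filesystem root (or "." if a relative path was given).
        case $p in
            /|.) break ;;
        esac
        p=$(dirname "$p")
    done
}

# Intended use:
#   for f in status.dat retention.dat rw/nagios.cmd nagios.lock; do
#       perm_chain "/usr/local/nagios/var/$f"
#   done
```

Where util-linux is installed, `namei -l <path>` gives a similar per-component listing.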