Cut over to CentOS 7 this morning, Nagios specific errors

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Cut over to CentOS 7 this morning, Nagios specific errors

Post by rferebee »

Good morning, I cut over my Nagios XI servers from CentOS 6 to CentOS 7 this morning.

Everything went really smoothly, but I'm seeing one issue I can't figure out. Every check-in cycle these services in the picture below go critical, but as soon as I force them to check-in again they report as OK.

They keep flipping back and forth between critical and OK, and I'm not sure why.

Any ideas I can try?
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cut over to CentOS 7 this morning, Nagios specific error

Post by rferebee »

Also, my server just started sending out hundreds of "Flapping Stopped" notifications. Is there any way to clear out whatever queue those are in, so they don't get sent out?
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Cut over to CentOS 7 this morning, Nagios specific error

Post by benjaminsmith »

Hello,

You can run the following command to clear the events queue.

echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -uroot -pnagiosxi nagiosxi
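As a cautious variation on the command above, here is a minimal sketch that dumps the three tables before truncating them. The credentials and table names are the ones from the command; the backup path and the function names are hypothetical, and this is an illustration rather than an official Nagios procedure.

```shell
#!/bin/sh
# Tables named in the command above.
TABLES="xi_events xi_meta xi_eventqueue"

# Build one "truncate table <name>;" statement per table.
build_truncate_sql() {
    sql=""
    for t in $TABLES; do
        sql="${sql}truncate table ${t}; "
    done
    # Drop the trailing space left by the loop.
    printf '%s\n' "${sql% }"
}

# Dump the tables first, then truncate them.
# The default backup path is a hypothetical example.
clear_event_queue() {
    backup="${1:-/tmp/xi_event_tables.sql}"
    mysqldump -uroot -pnagiosxi nagiosxi $TABLES > "$backup" || return 1
    build_truncate_sql | mysql -uroot -pnagiosxi nagiosxi
}
```

Nothing touches the database until `clear_event_queue` is called, optionally with a backup path as its first argument.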
Regarding the other issue, I believe the script is having trouble parsing the output from systemctl. Can you send me the profile, so I can try to verify this in the logs? Thanks.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cut over to CentOS 7 this morning, Nagios specific error

Post by rferebee »

PM sent with profile, thank you.

I truncated the tables, but it's still sending out notifications. Could the ones I'm getting be delayed from earlier in the morning? The ones I'm getting right now show in the console as being from a little over a minute ago.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Cut over to CentOS 7 this morning, Nagios specific error

Post by benjaminsmith »

Hello,

Have you turned off notifications yet? You'll want to turn off notifications and then clear the event queue.
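One way to turn notifications off from the shell is to send the `DISABLE_NOTIFICATIONS` external command to Nagios Core. This is a sketch only: the command file path is the usual source-install default and is an assumption here, the function names are hypothetical, and the same setting can also be changed from the web interface.

```shell
#!/bin/sh
# Assumed default location of the Nagios external command file.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Format an external command line: "[<epoch seconds>] <COMMAND>".
format_external_command() {
    printf '[%s] %s\n' "$2" "$1"
}

# Send DISABLE_NOTIFICATIONS, stamped with the current time.
disable_notifications() {
    format_external_command DISABLE_NOTIFICATIONS "$(date +%s)" > "$CMD_FILE"
}
```

After calling `disable_notifications`, clear the event queue as described earlier, then re-enable notifications with the matching `ENABLE_NOTIFICATIONS` command once things are stable.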

Regarding the 'could not parse XML' error, this will be patched in the next release. To correct it, back up your existing manage_services.sh script and replace it with the one attached. It's in the /usr/local/nagiosxi/scripts directory.

Once uploaded, make it executable with chmod +x and set the owner and group with chown root:nagios.
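The replacement steps above can be sketched as a small shell function. The function name, the `.bak` suffix, and the location of the downloaded file are hypothetical; the target directory, `chmod +x`, and `root:nagios` come from the post.

```shell
#!/bin/sh
# install_script SRC DEST [OWNER]
# Back up DEST if it exists, copy SRC over it, then fix mode and ownership.
install_script() {
    src="$1"
    dest="$2"
    owner="${3-root:nagios}"   # ownership from the post; pass "" to skip chown

    # Keep a copy of the existing file before overwriting it.
    if [ -f "$dest" ]; then
        cp -p "$dest" "${dest}.bak" || return 1
    fi

    cp "$src" "$dest" || return 1
    chmod +x "$dest" || return 1
    if [ -n "$owner" ]; then
        chown "$owner" "$dest" || return 1
    fi
}
```

Run as root, a call might look like `install_script ./manage_services.sh /usr/local/nagiosxi/scripts/manage_services.sh`, assuming the attachment was saved to the current directory.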
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cut over to CentOS 7 this morning, Nagios specific error

Post by rferebee »

Do I need to restart any services after I make these changes?

I found this solution in a previous support thread:

service nagios stop
service ndo2db stop
service crond stop
service postgresql restart
pkill -9 -u nagios
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | psql nagiosxi nagiosxi
service crond start
service ndo2db start
service nagios start
service npcd restart
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Cut over to CentOS 7 this morning, Nagios specific error

Post by benjaminsmith »

Hello @rferebee,

In this case it shouldn't be necessary, but if you're trying to clear any alerts, it doesn't hurt to kill off all the processes and restart the services.
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cut over to CentOS 7 this morning, Nagios specific error

Post by rferebee »

Ok, I'll keep that in mind.

Everything appears to be stable at the moment, but if we could please keep this open for a day or so, I'd appreciate it.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Cut over to CentOS 7 this morning, Nagios specific error

Post by benjaminsmith »

Hello @rferebee,
rferebee wrote: "Everything appears to be stable at the moment, but if we could please keep this open for a day or so, I'd appreciate it."

No problem. We'll keep this open.
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cut over to CentOS 7 this morning, Nagios specific error

Post by rferebee »

Everything appears to be running smoothly this morning. No issues last night.

I do have one question though; hopefully you can assist. Prior to the cutover, we would receive email notifications whenever our backup XI server performed a failover restore. This occurs daily for us at 9 AM. The server is still doing the failover restore, but it's not sending out the notifications telling us that it's doing it. I must have missed a configuration file somewhere, but I'm not sure where to look.

If I PM'd you an example of the notifications we used to get, do you think you'd be able to identify what mechanism would trigger it?