Improper alerts/notifications

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Improper alerts/notifications

Post by rkennedy »

Your host hyper-vhost1 has the IP 10.x.x.11 assigned. Is that the new IP or the old IP? Can you also post a screenshot of the notification emails you're referring to? Now that we have your objects.cache, it will be easier to line things up.

Looking at 'carbackup1', I see this notification firing:

May 31 08:57:17 nagiosxi nagios: HOST NOTIFICATION: nagiosadmin;carbackup1.x.x;DOWN;xi_host_notification_handler;CRITICAL - 10.x.x.66: Host unreachable @ 10.x.x.10. rta nan, lost 100%

But I don't see a corresponding host in your cache at all. Is this the one you were referring to? I did find 'carbackup14', but it doesn't appear to be the same one.
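The comparison described above (a notification in the log versus the hosts in objects.cache) can be sketched in Python. This is only an illustration: the function names are hypothetical, the cache excerpt is made up for the example, and it assumes objects.cache uses the usual `define host {` blocks with a `host_name` line.

```python
import re

# Matches log entries of the form quoted above:
# HOST NOTIFICATION: contact;host;state;command;plugin output
NOTIFICATION_RE = re.compile(
    r"HOST NOTIFICATION: ([^;]*);([^;]*);([^;]*);([^;]*);(.*)"
)

def notified_host(log_line):
    """Return the host name from a HOST NOTIFICATION log line, or None."""
    match = NOTIFICATION_RE.search(log_line)
    return match.group(2) if match else None

def cached_hosts(cache_text):
    """Collect host_name values from 'define host {' blocks (assumed format)."""
    hosts = set()
    in_host_block = False
    for raw in cache_text.splitlines():
        line = raw.strip()
        if line.startswith("define "):
            # Only "define host {" opens a host block, not "define hostgroup {".
            in_host_block = line.split()[1].rstrip("{") == "host"
        elif line == "}":
            in_host_block = False
        elif in_host_block:
            parts = line.split(None, 1)
            if len(parts) == 2 and parts[0] == "host_name":
                hosts.add(parts[1].strip())
    return hosts

# Log line taken from the post above.
sample_log = (
    "May 31 08:57:17 nagiosxi nagios: HOST NOTIFICATION: "
    "nagiosadmin;carbackup1.x.x;DOWN;xi_host_notification_handler;"
    "CRITICAL - 10.x.x.66: Host unreachable @ 10.x.x.10. rta nan, lost 100%"
)

# Hypothetical cache excerpt, for illustration only.
sample_cache = """define host {
    host_name    carbackup14
    }
"""

host = notified_host(sample_log)
if host not in cached_hosts(sample_cache):
    print(f"{host} is sending notifications but is not in the object cache")
```

A host that notifies without appearing in the cache suggests the notification comes from something other than the currently loaded configuration.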
Former Nagios Employee
CarlWedu
Posts: 51
Joined: Fri Aug 21, 2015 8:17 am


Post by CarlWedu »

That is the new IP.

Re: carbackup1, that is correct. It has been completely removed from CCM and the configuration applied, yet it still generates an alert.

Post by rkennedy »

I wonder if you have multiple nagios processes running. What is the full output of ps -ef on the system?

Post by CarlWedu »

Output of ps -ef is attached.

Post by rkennedy »

It looks like you have multiple nagios processes running, so you may want to kill off one of them and start it up again manually. Both of these have a PPID of 1, so they are separate daemons rather than parent and child:

nagios   64556     1  0 May26 ?        00:51:26 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   56637     1  0 Apr22 ?        06:10:42 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

This would explain the 'haunting' by the old alerts. Once only the active process is running, things should work as expected.
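As a rough sketch of the check above, the following Python parses `ps -ef` text and lists the nagios daemons whose parent is PID 1. The function names are hypothetical, and the parser assumes the standard eight-column `ps -ef` layout (UID PID PPID C STIME TTY TIME CMD) with a single-token STIME, as in the listing above.

```python
def parse_ps_ef(text):
    """Parse `ps -ef` lines into (pid, ppid, command) tuples; skips the header."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) == 8 and fields[1].isdigit() and fields[2].isdigit():
            rows.append((int(fields[1]), int(fields[2]), fields[7]))
    return rows

def top_level_daemons(rows, binary="/usr/local/nagios/bin/nagios"):
    """PIDs of processes running `binary` whose parent is PID 1."""
    return [pid for pid, ppid, cmd in rows
            if ppid == 1 and cmd.split()[0] == binary]

# The two lines quoted in the post above, plus a header row.
ps_output = """UID        PID  PPID  C STIME TTY          TIME CMD
nagios   64556     1  0 May26 ?        00:51:26 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   56637     1  0 Apr22 ?        06:10:42 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
"""

daemons = top_level_daemons(parse_ps_ef(ps_output))
if len(daemons) > 1:
    print("More than one top-level nagios daemon:", daemons)
```

On a live system the text would come from running `ps -ef` itself; more than one PID in the result matches the situation described in this post.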

Post by CarlWedu »

Killed all but one of those processes, then made a CCM change and applied it. More nagios processes started during that, so I killed the other old one as well. Now I have just the new, self-started ones:

[root@nagiosxi ~]# ps -ef | grep /usr/local/nagios/etc/nagios.cfg
root 11864 12259 0 15:55 pts/0 00:00:00 grep /usr/local/nagios/etc/nagios.cfg
nagios 53829 1 0 15:46 ? 00:00:03 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 53880 53829 0 15:46 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg


Still seeing alerts on the operations screen for devices that have been removed from CCM, like the backup1. I did notice that the notifications for it are going to a contact that doesn't exist.

***

Will check again in the morning to see if the alerts have resolved themselves. Thanks!

Post by rkennedy »

There will be child processes (which are OK). I'd only be worried if another nagios process shows up whose PPID isn't the PID of the single main Nagios process.

If you're still seeing issues in the morning, please post the output of ps -ef once again.
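The PPID rule described in this post can be written down as a small Python check. This is a sketch only: `unexpected_processes` is a hypothetical helper, and it assumes the (PID, PPID) pairs have already been filtered down to the nagios binary, so the `grep` line from the listing is excluded.

```python
def unexpected_processes(pairs, main_pid):
    """Return PIDs that are neither the main process nor its direct children."""
    return [pid for pid, ppid in pairs
            if pid != main_pid and ppid != main_pid]

# (PID, PPID) pairs from the listing after the cleanup: 53880 is a child of 53829.
after_cleanup = [(53829, 1), (53880, 53829)]
print(unexpected_processes(after_cleanup, main_pid=53829))

# (PID, PPID) pairs from the earlier listing: two daemons, both with PPID 1.
before_cleanup = [(64556, 1), (56637, 1)]
print(unexpected_processes(before_cleanup, main_pid=64556))
```

An empty list for the first case matches the "child processes are OK" situation; the second case flags the extra daemon.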

Post by CarlWedu »

The last notification for "backup1" was at 2016-06-01 15:18:12, and the operations screen looks correct so far this morning.

What would cause additional instances of the nagios process to be started like that?

Post by rkennedy »

To be honest, it's hard to say. Any number of things could have caused it, but most of the time it relates to the nagios service being stopped or started.

I've seen it happen in the past when the old process couldn't be killed for whatever reason and a new one spawned anyway.

Post by CarlWedu »

thank you!!

/resolved