Notification retention

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
tviset
Posts: 7
Joined: Tue Dec 03, 2019 9:30 am

Notification retention

Post by tviset »

Hi all,

I think I made a mistake in my notification settings. I have a fully passive Nagios installation, meaning that all checks are run through NCPA on servers other than the Nagios host. The results of these checks are sent to the Nagios host every 30 seconds.

If a service is critical or in warning, I want Nagios to send an email notification. I think I have set some sort of interval somewhere, because Nagios is sending an email exactly every 12 minutes, but browsing through the notification database reveals that the notifications I receive today were already created on December 14, 2019. So it seems that notifications are generated faster than Nagios can send them, and that Nagios is holding them in some kind of backlog.

- What is a good balance between checking on an NCPA controlled server, and sending notifications?
- Can I empty the notifications message queue? At this moment, I do not need the remaining messages as we are still testing Nagios, it's not a live situation now.

Thanks, greetz, Theo
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Notification retention

Post by scottwilkerson »

Are these services still in a non-OK state?

Nagios can/will re-notify based on the notification interval specified for the service (or in an underlying template).

Can you share a service definition for one of these that is sending notifications every 12 minutes?
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com
tviset
Posts: 7
Joined: Tue Dec 03, 2019 9:30 am

Re: Notification retention

Post by tviset »

Hi Scott,

Thank you for your reply.

Yes, the servers are in non-OK state, or to be more specific, the servers are currently switched off.
Since the servers are passive only, this means that Nagios does not receive any info from these at the moment.

The logging shows this:
[1578998130] Warning: The results of service 'Disk Usage' on host 'helfidsp01' are stale by 0d 0h 1m 0s (threshold=0d 0h 10m 0s). I'm forcing an immediate check of the service.
...
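
The bracketed number at the start of a Nagios log line is a Unix epoch timestamp, which helps when checking whether an event is current or old. A minimal sketch for converting it, assuming GNU `date` (the `-d @...` form is not available on BSD/macOS); `epoch_to_utc` is a hypothetical helper name:

```shell
# Convert the epoch timestamp from a Nagios log line to a readable UTC date.
# Assumes GNU date; epoch_to_utc is a made-up helper name for illustration.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

# Timestamp taken from the log line above.
epoch_to_utc 1578998130
```

For the line above this gives 14 January 2020, so the staleness warning itself is a current event.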

I defined Freshness in the services in Nagios.

Hence, Nagios should currently not send the Critical/OK/Warning emails, since all servers are switched off, so the NCPA agents are not sending any information to the Nagios server. Based on the logging, Nagios isn't receiving any info, since the logging says that the services are stale.
However, I am still receiving these emails, which is why I thought there would be some sort of retention going on. I have checked the emails, the content of the emails and the notifications in Nagios that have these same details in them.

I have included GIFs for:
- The emails in 12 minutes intervals in Outlook and the details in these emails
Notifications-email-interval-12Mins-details-ProcCount.GIF
- The corresponding notifications in Nagios (that's how I know that the notifications are old)
Notifications-Nagios-GUI-14122020.GIF
- Current service status in Nagios - No data
Services-P01-StatusNoInfo.GIF
Since the servers are passive only, Nagios cannot force a check. I would think that with freshness configured, the service would not flag as ok, as basically it is not ok as there is no info from the NCPA agent. But that configuration is probably something for a separate post.

Greetz, Theo
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Notification retention

Post by scottwilkerson »

tviset wrote: Since the servers are passive only, Nagios cannot force a check. I would think that with freshness configured, the service would not flag as ok, as basically it is not ok as there is no info from the NCPA agent. But that configuration is probably something for a separate post.
This depends on what you have set as the check command: with freshness enabled, Nagios will actively run whatever is in the check command once the freshness threshold is reached (even if active checks are not enabled).
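
One common pattern for passive-only services is to point the check command at a small script that always reports a problem, so that a stale result turns the service CRITICAL instead of leaving the last state in place. A minimal sketch, assuming such a script is configured as the service's check command; the function name and message text are illustrative and not taken from this installation:

```shell
# Freshness fallback: always report CRITICAL with an explanatory message.
# Nagios plugin convention: return/exit code 2 means CRITICAL.
report_stale() {
    echo "CRITICAL: no passive result received within the freshness threshold"
    return 2
}

# In a standalone plugin script the last line would pass the status on:
#   report_stale; exit $?
```

Whether this is appropriate depends on what the existing check command does when it is run on the Nagios host.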
tviset
Posts: 7
Joined: Tue Dec 03, 2019 9:30 am

Re: Notification retention

Post by tviset »

Hello Scott,

I think the freshness configuration and forced check are for a separate post.
My issue in this post is the notifications that are still going on, although the servers have been switched off now for 10 days.
I still receive the notifications of mid-December last year.

- How can I configure the notifications and email messaging interval to a good balance, and
- How can I cancel the sending of the current, old notifications?

I currently still receive 120 emails per day regarding problems that occurred a month ago, so I really want to cancel these notifications. I rebooted the Nagios host, but that didn't help.
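
As a quick sanity check, 120 emails per day is exactly what one email every 12 minutes adds up to, so the two figures reported in this thread describe the same cadence. The arithmetic, as a sketch:

```shell
# Emails per day at one email every 12 minutes.
interval_minutes=12
emails_per_day=$(( 24 * 60 / interval_minutes ))
echo "$emails_per_day"   # prints 120
```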

Best regards, Theo
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Notification retention

Post by scottwilkerson »

tviset wrote: I currently still receive 120 emails per day regarding problems that occurred a month ago, so I really want to cancel these notifications.
What is the state of these services?

If it is a non-OK state, you can acknowledge the problem to stop receiving notifications.
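
In Nagios XI the acknowledgement is normally made from the service's status page. The same action can be sketched from the command line using the Nagios Core external command `ACKNOWLEDGE_SVC_PROBLEM`. The command file path is an assumption (the usual default for installs under `/usr/local/nagios`), the host and service names are taken from the log line earlier in the thread, and the author and comment are placeholders:

```shell
# Build an ACKNOWLEDGE_SVC_PROBLEM external command line.
# Fields: host;service;sticky;notify;persistent;author;comment
# sticky=2 keeps the acknowledgement until the service recovers;
# notify=0 avoids sending an extra acknowledgement email.
build_ack() {
    printf '[%s] ACKNOWLEDGE_SVC_PROBLEM;%s;%s;2;0;0;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}

# Assumed default command file location; adjust for your install.
cmdfile=/usr/local/nagios/var/rw/nagios.cmd

# Print the command; to submit it, redirect the output to "$cmdfile".
build_ack 'helfidsp01' 'Disk Usage' 'theo' 'Host is switched off during testing'
```

An acknowledgement only applies while the service is in a non-OK state.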
tviset
Posts: 7
Joined: Tue Dec 03, 2019 9:30 am

Re: Notification retention

Post by tviset »

Hi Scott,

I have checked the services, and currently, they have an OK status and I can't acknowledge anything.
I have switched all notifications off for all services but I still receive the emails every 12 minutes.
Notification-Services-Off.GIF
How can I force Nagios to stop sending old notifications?

Greetz, Theo
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Notification retention

Post by scottwilkerson »

The emails you showed in this post have current dates:
https://support.nagios.com/forum/viewto ... 73#p301478

Are the settings you showed above for the same service?

What is the output of the following command from the CLI?


ps -ef|grep nagios.cfg
tviset
Posts: 7
Joined: Tue Dec 03, 2019 9:30 am

Re: Notification retention

Post by tviset »

The output of the ps command is:
nagios 28833 1 0 Jan16 ? 00:00:17 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 28850 28833 0 Jan16 ? 00:00:05 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

And indeed the dates are current, which is part of the problem. I am sure that the problem mentioned in the email is about a month old, as I can see it in the notifications list in the Nagios GUI. However, the email is constructed now, and apparently Nagios displays the date on which the email was created, not the date on which the notification was created. To be sure of that, I changed the email template a couple of days ago, and the notification emails that followed used the changed template. The emails are created now, but the notification is from a month ago. So I can only assume that the emails will keep coming, regardless of the fact that the servers are switched off at the moment.

I have rebooted the Nagios server, I have set notifications for all services to off, but it didn't help.
However, I think that the settings were different in December, which is why all the notifications are already created and sent from a queue, as it seems to me.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Notification retention

Post by scottwilkerson »

If you have them all off and are still getting messages, it could be something stuck in the DB.

You can run the following to clear all the events (notifications) from the queue:
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | psql nagiosxi nagiosxi
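
Since truncating cannot be undone, it may be worth looking at how many rows are queued first. A minimal sketch that only generates row-count queries for the three tables named above, assuming the same `nagiosxi` database and user as in the command; `count_queries` is a hypothetical helper name:

```shell
# Emit one COUNT query per table named in the truncate command above.
count_queries() {
    for t in xi_events xi_meta xi_eventqueue; do
        printf 'select count(*) from %s;\n' "$t"
    done
}

# Print the queries; to run them, pipe the output into psql with the
# same database and user as above:
#   count_queries | psql nagiosxi nagiosxi
count_queries
```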