Missing alerts

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
tonyleatwork
Posts: 91
Joined: Mon Jul 07, 2014 8:55 am

Missing alerts

Post by tonyleatwork »

Hi -

We have a check_wmi check that scans the system for all available HDDs and checks them against WARNING/CRITICAL thresholds.

One of the drives in that service group was in a WARNING status, but since it was a low-priority drive we ACKNOWLEDGED the WARNING. A different drive in that service then went to CRITICAL, but we never received an email alert. This is odd because Nagios Core did register the CRITICAL.

Is this expected behavior or did we bump into a bug or a system issue?

As you can see from the Nagios Core snapshots, the CRITICALs are registered with the system - but we never got the email.
alerthistogram.JPG
alerthistory.JPG
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Missing alerts

Post by tgriep »

Did the notifications for that service get disabled by mistake?
Could you post how the service check is configured, and also search the objects.cache file and post the settings it holds for that service?

/usr/local/nagios/var/objects.cache
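
A minimal sketch of pulling one service's settings out of that file, assuming the `define service { ... }` layout with one directive per line that the later post in this thread shows. The function name `find_service_blocks` is hypothetical and not part of Nagios.

```python
import re


def find_service_blocks(cache_text, host_name, service_description):
    """Return each matching 'define service { ... }' block as a dict of directive -> value."""
    matches = []
    # Assumption: every object is written as "define <type> {" ... "}" with one directive per line.
    for body in re.findall(r"define service \{(.*?)\n\s*\}", cache_text, flags=re.S):
        settings = {}
        for line in body.splitlines():
            parts = line.strip().split(None, 1)  # directive name, then the rest of the line
            if len(parts) == 2:
                settings[parts[0]] = parts[1].strip()
        if (settings.get("host_name") == host_name
                and settings.get("service_description") == service_description):
            matches.append(settings)
    return matches
```

To use it, read the file at the path above into a string and pass it in with the host name and service description. In the returned dict, `notifications_enabled` and `notification_options` are the two directives most relevant to a missing alert.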
Be sure to check out our Knowledgebase for helpful articles and solutions!
tonyleatwork
Posts: 91
Joined: Mon Jul 07, 2014 8:55 am

Re: Missing alerts

Post by tonyleatwork »

Hi -

As far as I can see, the object cache contains all the settings. At a high level:

The custom command breaks down to:

$USER1$/check_wmi_plus.pl -H $HOSTADDRESS$ -u $USER10$ -p $USER11$ -m $ARG3$ $ARG4$ $ARG5$ $ARG6$ -t 110

We put the login and password in resource.cfg since they are sensitive (WMI requires Windows admin privileges to work).

Then it alerts to WSG_WARNINGS and WSG_ALERTS

WSG_WARNINGS only alerts against WARNINGs
WSG_ALERTS only alerts against CRITICALs
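
A rough sketch of that split, assuming it is implemented through each contact's `service_notification_options` (`w` for WARNING, `u` for UNKNOWN, `c` for CRITICAL, `r` for recovery). The contact names are hypothetical stand-ins for members of the two groups.

```python
# Letter codes used by service_notification_options on a contact.
STATE_CODES = {"WARNING": "w", "UNKNOWN": "u", "CRITICAL": "c", "OK": "r"}


def contacts_to_notify(state, contact_options):
    """Return the contacts whose options include the given service state."""
    code = STATE_CODES[state]
    return [name for name, options in contact_options.items()
            if code in options.split(",")]


# Hypothetical contacts, one per group described above.
contact_options = {
    "warnings_member": "w",  # stands in for a WSG_WARNINGS contact
    "alerts_member": "c",    # stands in for a WSG_ALERTS contact
}
```

Under these assumptions a CRITICAL reaches only `alerts_member` and a WARNING reaches only `warnings_member`. This filter runs only if the service itself decides to notify, so it cannot explain a notification that was never generated.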

During this time, those two email contact groups did receive other alerts, so I don't think it was email related or configuration related.

define service {
        host_name       nwd2clst11.ad.analog.com
        service_description     All Disk Usage
        check_command   check_xi_service_wmiplus_secure!!!!checkdrivesize!-a '[c-z]' -w '90' -c '95' -y 2 -t 25!!!
        contact_groups  WSG_WARNINGS,WSG_ALERTS
        notification_period     24x7
        initial_state   o
        importance      0
        check_interval  5.000000
        retry_interval  1.000000
        max_check_attempts      3
        is_volatile     0
        parallelize_check       1
        active_checks_enabled   1
        passive_checks_enabled  1
        obsess  1
        event_handler_enabled   1
        low_flap_threshold      0.000000
        high_flap_threshold     0.000000
        flap_detection_enabled  1
        flap_detection_options  a
        freshness_threshold     0
        check_freshness 0
        notification_options    a
        notifications_enabled   1
        notification_interval   60.000000
        first_notification_delay        0.000000
        stalking_options        n
        process_perf_data       1
        retain_status_information       1
        retain_nonstatus_information    1
        }
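
For reading the `check_command` line above, here is a small sketch of how the `!`-separated value maps onto `$ARG1$`, `$ARG2$`, and so on. It is a simplification that ignores escaped `\!` characters, and `split_check_command` is a hypothetical helper name.

```python
def split_check_command(check_command):
    """Split a check_command value into the command name and a {macro: value} map."""
    name, *args = check_command.split("!")
    return name, {f"$ARG{i}$": value for i, value in enumerate(args, start=1)}


name, args = split_check_command(
    "check_xi_service_wmiplus_secure!!!!checkdrivesize!"
    "-a '[c-z]' -w '90' -c '95' -y 2 -t 25!!!"
)
```

With this definition the sketch leaves `$ARG1$` through `$ARG3$` empty, puts `checkdrivesize` in `$ARG4$`, and puts the thresholds (`-w '90' -c '95'`) in `$ARG5$`.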

jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: Missing alerts

Post by jdalrymple »

This all depends on whether you selected the sticky acknowledgement or not. Unfortunately it's poorly documented, and sticky is the default option:
Acknowledge command from Core interface wrote:
This command is used to acknowledge a service problem. When a service problem is acknowledged, future notifications about problems are temporarily disabled until the service changes from its current state. If you want acknowledgement to disable notifications until the service recovers, check the 'Sticky Acknowledgement' checkbox. Contacts for this service will receive a notification about the acknowledgement, so they are aware that someone is working on the problem. Additionally, a comment will also be added to the service. Make sure to enter your name and fill in a brief description of what you are doing in the comment field. If you would like the service comment to remain once the acknowledgement is removed, check the 'Persistent Comment' checkbox. If you do not want an acknowledgement notification sent out to the appropriate contacts, uncheck the 'Send Notification' checkbox.
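
The difference can be sketched as a toy model. This is only an illustration of the behaviour the help text describes, not Nagios code: it ignores soft/hard states, notification intervals, and notification options, and the class name `ServiceAckModel` is made up.

```python
OK = "OK"


class ServiceAckModel:
    """Toy model of how an acknowledgement suppresses problem notifications."""

    def __init__(self, state=OK):
        self.state = state
        self.acknowledged = False
        self.sticky = False

    def acknowledge(self, sticky):
        # An acknowledgement only applies while the service is in a problem state.
        if self.state != OK:
            self.acknowledged = True
            self.sticky = sticky

    def change_state(self, new_state):
        """Apply a state change and return True if a notification would go out."""
        if new_state == self.state:
            return False
        self.state = new_state
        if new_state == OK:
            # Recovery clears any acknowledgement, sticky or not.
            self.acknowledged = False
        elif self.acknowledged and not self.sticky:
            # A non-sticky acknowledgement ends on any state change.
            self.acknowledged = False
        return not self.acknowledged
```

In this model, a service that is acknowledged as sticky while in WARNING and then moves to CRITICAL returns `False` from `change_state("CRITICAL")`, which matches the missing email. The same sequence with `sticky=False` returns `True`.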
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Missing alerts

Post by tgriep »

Can you post the host information from the objects.cache file?
That service doesn't have a template applied to it, so the service will inherit its settings from the host.
If the host's Notification Options aren't set the way you want, that would explain why the notification wasn't sent.
Can you check the Notifications log and verify that the notification didn't happen?
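
One way to verify this outside the web interface is to scan the Nagios Core log for `SERVICE NOTIFICATION` entries. The sketch below assumes the usual `[timestamp] SERVICE NOTIFICATION: contact;host;service;state;command;output` line layout. The function name is hypothetical, and the log location on your system is something to confirm rather than assume.

```python
def service_notifications(log_lines, host_name, service_description):
    """Yield (timestamp, contact, state) for notifications sent for one service."""
    marker = "SERVICE NOTIFICATION: "
    for line in log_lines:
        if marker not in line:
            continue
        stamp, _, rest = line.partition("] " + marker)
        fields = rest.rstrip("\n").split(";")
        # Assumed field order: contact;host;service;state;command;output
        if (len(fields) >= 4 and fields[1] == host_name
                and fields[2] == service_description):
            yield int(stamp.lstrip("[")), fields[0], fields[3]
```

Passing an open log file as `log_lines` works because the function only iterates over lines. If it yields nothing for the time of the CRITICAL, Nagios never generated the notification, which rules out a mail delivery problem.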
tonyleatwork
Posts: 91
Joined: Mon Jul 07, 2014 8:55 am

Re: Missing alerts

Post by tonyleatwork »

Where can I find the maillog? /var/log/mail just shows the TO: field.

And it looks like there was a template:

xiwizard_windowswmi_service

Is that not sufficient?

Another key note is that this system DID alert back on 07/30. While it's possible something could have changed between then and now, I just want to understand what's going on in case this is affecting other systems.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Missing alerts

Post by tgriep »

On the Home screen in XI, under Incident Management, click Notifications and see if that service sent a notification at that time.

Also, in Core Config Manager, edit that service, click the Alert Settings tab, set the notification options the way you want, save it, and see if that resolves it for you.
tonyleatwork
Posts: 91
Joined: Mon Jul 07, 2014 8:55 am

Re: Missing alerts

Post by tonyleatwork »

You're right, it looks like this did not send out an alert. So it must've been the acknowledgement then. What I'll do is remove the host and re-add it and perform some user training.
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Missing alerts

Post by hsmith »

tonyleatwork wrote:
You're right, it looks like this did not send out an alert. So it must've been the acknowledgement then. What I'll do is remove the host and re-add it and perform some user training.
Is there anything else you need help with, or am I all right to close this topic?
Former Nagios Employee.