Hi -
We have a check_wmi check that scans the system for all available HDDs and checks them against a WARNING/CRITICAL threshold.
One of the drives in that service group is in a WARNING status, but since it was a low priority drive we ACKNOWLEDGED the WARNING. A different drive in that service went to CRITICAL but we never received an email alert. This is odd because Nagios core did register a CRITICAL.
Is this expected behavior or did we bump into a bug or a system issue?
As you can see from the Nagios Core snapshots, the CRITICALs are registered with the system - but we never got the email.
Missing alerts
-
tonyleatwork
- Posts: 91
- Joined: Mon Jul 07, 2014 8:55 am
Missing alerts
Re: Missing alerts
Did the notifications for that service get disabled by mistake?
Could you post how the service check is configured and search the objects.cache file and post the settings for that service also?
Code: Select all
/usr/local/nagios/var/objects.cache
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
tonyleatwork
- Posts: 91
- Joined: Mon Jul 07, 2014 8:55 am
Re: Missing alerts
Hi -
As far as I can see, the object cache contains all the settings, but at a high level:
The custom command breaks down to:
$USER1$/check_wmi_plus.pl -H $HOSTADDRESS$ -u $USER10$ -p $USER11$ -m $ARG3$ $ARG4$ $ARG5$ $ARG6$ -t 110
We put the login + pw inside resource.cfg since they are sensitive (WMI requires Windows admin privileges to work).
Then it alerts to WSG_WARNINGS and WSG_ALERTS
WSG_WARNINGS only alerts against WARNINGs
WSG_ALERTS only alerts against CRITICALs
During this time, those two email contact groups did receive other alerts, so I don't think it was email related or configuration related.
Code: Select all
define service {
    host_name nwd2clst11.ad.analog.com
    service_description All Disk Usage
    check_command check_xi_service_wmiplus_secure!!!!checkdrivesize!-a '[c-z]' -w '90' -c '95' -y 2 -t 25!!!
    contact_groups WSG_WARNINGS,WSG_ALERTS
    notification_period 24x7
    initial_state o
    importance 0
    check_interval 5.000000
    retry_interval 1.000000
    max_check_attempts 3
    is_volatile 0
    parallelize_check 1
    active_checks_enabled 1
    passive_checks_enabled 1
    obsess 1
    event_handler_enabled 1
    low_flap_threshold 0.000000
    high_flap_threshold 0.000000
    flap_detection_enabled 1
    flap_detection_options a
    freshness_threshold 0
    check_freshness 0
    notification_options a
    notifications_enabled 1
    notification_interval 60.000000
    first_notification_delay 0.000000
    stalking_options n
    process_perf_data 1
    retain_status_information 1
    retain_nonstatus_information 1
}
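In case it helps anyone else pull a single service out of objects.cache, here is a rough Python sketch. It is not anything shipped with Nagios. It assumes the file uses the same "define service { ... }" layout as the excerpt above, and parse_service_blocks, find_service and the SAMPLE text are names I made up for illustration.

```python
import re

# Illustrative sample only; a real objects.cache has many more directives.
SAMPLE = """define host {
\thost_name\tnwd2clst11.ad.analog.com
\t}
define service {
\thost_name\tnwd2clst11.ad.analog.com
\tservice_description\tAll Disk Usage
\tcontact_groups\tWSG_WARNINGS,WSG_ALERTS
\tnotification_options\ta
\tnotifications_enabled\t1
\t}
"""

SERVICE_START = re.compile(r"define\s+service\s*\{")


def parse_service_blocks(text):
    """Return one dict per 'define service { ... }' block found in text."""
    services = []
    current = None  # dict being filled while inside a service block
    for raw in text.splitlines():
        line = raw.strip()
        if SERVICE_START.fullmatch(line):
            current = {}
        elif line == "}":
            if current is not None:
                services.append(current)
            current = None
        elif current is not None and line:
            # Directive name first, then whitespace, then the value.
            parts = line.split(None, 1)
            current[parts[0]] = parts[1] if len(parts) > 1 else ""
    return services


def find_service(services, host_name, description):
    """Return the first matching service dict, or None if there is none."""
    for svc in services:
        if (svc.get("host_name") == host_name
                and svc.get("service_description") == description):
            return svc
    return None


if __name__ == "__main__":
    svc = find_service(parse_service_blocks(SAMPLE),
                       "nwd2clst11.ad.analog.com", "All Disk Usage")
    for key in ("notifications_enabled", "notification_options", "contact_groups"):
        print(key, "=", svc.get(key))
```

To run it against the real file, read /usr/local/nagios/var/objects.cache into a string and pass that to parse_service_blocks instead of SAMPLE.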
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: Missing alerts
This is all based upon whether you selected the sticky ack or not. Unfortunately it's poorly documented (and the default option):
Acknowledge command from Core interface wrote:This command is used to acknowledge a service problem. When a service problem is acknowledged, future notifications about problems are temporarily disabled until the service changes from its current state. If you want acknowledgement to disable notifications until the service recovers, check the 'Sticky Acknowledgement' checkbox. Contacts for this service will receive a notification about the acknowledgement, so they are aware that someone is working on the problem. Additionally, a comment will also be added to the service. Make sure to enter your name and fill in a brief description of what you are doing in the comment field. If you would like the service comment to remain once the acknowledgement is removed, check the 'Persistent Comment' checkbox. If you do not want an acknowledgement notification sent out to the appropriate contacts, uncheck the 'Send Notification' checkbox.
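To make the difference concrete, here is a small Python model of the behavior described in that help text. It is a simplified sketch and not Nagios source code: the ServiceAck class and its method names are made up, and it ignores notification intervals, soft states, downtime and recovery notifications.

```python
OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


class ServiceAck:
    """Toy model of how an acknowledgement suppresses problem notifications."""

    def __init__(self):
        self.state = OK
        self.acknowledged = False
        self.sticky = False

    def acknowledge(self, sticky):
        # Only a service that is currently in a problem state can be acknowledged.
        if self.state != OK:
            self.acknowledged = True
            self.sticky = sticky

    def update(self, new_state):
        """Apply a new state; return True if a problem notification would go out."""
        changed = new_state != self.state
        self.state = new_state
        if new_state == OK:
            # Recovery clears any acknowledgement, sticky or not.
            self.acknowledged = False
            return False
        if changed and self.acknowledged and not self.sticky:
            # A non-sticky ack is dropped as soon as the state changes.
            self.acknowledged = False
        return not self.acknowledged


if __name__ == "__main__":
    for sticky in (True, False):
        svc = ServiceAck()
        svc.update(WARNING)          # first drive crosses the warning threshold
        svc.acknowledge(sticky)      # someone acknowledges the WARNING
        notified = svc.update(CRITICAL)  # another drive pushes the service to CRITICAL
        print("sticky" if sticky else "non-sticky", "-> CRITICAL notified:", notified)
```

With a sticky acknowledgement, the WARNING to CRITICAL change stays silent in this model, which matches what happened here since one service covers all the drives.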
Re: Missing alerts
Can you post the host information from the objects.cache file?
That service doesn't have a template applied to it, so the service will inherit the settings from the host.
If the host doesn't have the Notification Options set to how you like, then that is why the notification didn't get sent.
Can you check the Notifications log and verify that the notification didn't happen?
-
tonyleatwork
- Posts: 91
- Joined: Mon Jul 07, 2014 8:55 am
Re: Missing alerts
Where can I find the mail log? /var/log/mail just shows the TO: field.
And it looks like there was a template:
xiwizard_windowswmi_service
Is that not sufficient?
Another key note is that this system DID alert back on 07/30 - and while it's possible something could've changed between then and now, I just want to understand what's going on in case this is affecting other systems.
Re: Missing alerts
On the Home screen in XI, under Incident Management, click Notifications and see if that service sent a notification at that time.
Also, in Core Config Manager, edit that service, click on the Alert Settings tab and set up the notification options how you want them, then save it and see if that resolves it for you.
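If you would rather check from the command line, the same information should be in the Nagios log as "SERVICE NOTIFICATION:" entries. Below is a rough Python sketch for filtering them. It assumes the usual semicolon-separated entry layout (contact;host;service;state;command;output), and the sample lines, the contact name jdoe and the command name are invented for illustration, not taken from your system.

```python
import re

# Made-up sample entries; real ones come from the Nagios log file.
SAMPLE_LOG = """[1438264800] SERVICE NOTIFICATION: jdoe;nwd2clst11.ad.analog.com;All Disk Usage;WARNING;notify-service-by-email;sample plugin output
[1438265100] SERVICE NOTIFICATION: jdoe;nwd2clst11.ad.analog.com;CPU Usage;CRITICAL;notify-service-by-email;sample plugin output
[1438265400] SERVICE ALERT: nwd2clst11.ad.analog.com;All Disk Usage;CRITICAL;HARD;3;sample plugin output
"""

NOTIFICATION_RE = re.compile(r"^\[(\d+)\] SERVICE NOTIFICATION: (.*)$")


def service_notifications(lines, host, service):
    """Return the notification entries logged for one host/service pair."""
    hits = []
    for line in lines:
        match = NOTIFICATION_RE.match(line.strip())
        if not match:
            continue  # alerts, check results and other entry types are skipped
        fields = match.group(2).split(";", 5)
        if len(fields) < 6:
            continue  # not the layout this sketch assumes
        contact, entry_host, entry_service, state, command, output = fields
        if entry_host == host and entry_service == service:
            hits.append({
                "timestamp": int(match.group(1)),
                "contact": contact,
                "state": state,
                "command": command,
                "output": output,
            })
    return hits


if __name__ == "__main__":
    hits = service_notifications(SAMPLE_LOG.splitlines(),
                                 "nwd2clst11.ad.analog.com", "All Disk Usage")
    for hit in hits:
        print(hit["timestamp"], hit["contact"], hit["state"])
```

In the sample, the service has a CRITICAL alert entry but no CRITICAL notification entry, which is the pattern to look for. On a default install the log is usually /usr/local/nagios/var/nagios.log, but treat that path as an assumption and check your own setup.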
-
tonyleatwork
- Posts: 91
- Joined: Mon Jul 07, 2014 8:55 am
Re: Missing alerts
You're right, this did not send out an alert it looks like. So it must've been the acknowledgement then. What I'll do is remove the host and re-add it and perform some user training.
Re: Missing alerts
tonyleatwork wrote:You're right, this did not send out an alert it looks like. So it must've been the acknowledgement then. What I'll do is remove the host and re-add it and perform some user training.
Is there anything else you need help with, or am I all right to close this topic?
Former Nagios Employee.