Troubleshooting Alerts

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
toleolu
Posts: 294
Joined: Fri Jul 19, 2013 7:02 pm
Location: Honolulu Hawaii

Troubleshooting Alerts

Post by toleolu »

Now that we are live on Nagios XI, I have been comparing the State History report with the Notifications report looking for state changes that don't have a corresponding alert notification going out.

I've come across a couple, so I have been going into CCM and looking at the Check Settings and the Alert Settings for the affected hosts and services. The fields vary (some filled in, some blank), but on the service checks I usually find that the Event Handler is blank, so I select xi_service_event_handler and then turn the event handler on.

Then I check the Alert Settings. I check Manage Contacts and Manage Contact Groups to make sure those are correct, select xi_timeperiod_24x7 under Notification Period, and check w, c, and r for Notification Options. I set the notification interval to 0 (I don't want additional emails going out) and leave the first notification delay blank or zero. Finally, I turn Notification Enabled on.

I haven't had a chance to see whether these changes fix the problem, because the few I changed have not generated any alerts since. I just wanted to check with the pros to make sure there is nothing wrong with what I'm doing here, and to find out whether I might be missing something.

Mahalo
Charles Masteller
Information Systems Specialist
Hawaii Health Systems Corp.
"No one will ever need more than 640K RAM". Bill Gates
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Troubleshooting Alerts

Post by tmcdonald »

One thing I would caution against is overwriting changes set by templates. If a required field is blank and you are still able to save, chances are that field is set in a template somewhere. Also, remember that some templates inherit from others.
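To illustrate the point about blank fields and templates, here is a minimal Python sketch of how a blank field can still end up with a value through inheritance. It is not how XI or CCM is implemented. The object names, the template chain, and the `effective_value` helper are all made up for illustration, and it only models a single chain of templates.

```python
from typing import Optional

# Hypothetical definitions keyed by name. "use" names the parent template.
# None stands for a field left blank in CCM.
DEFINITIONS = {
    "generic-service": {
        "use": None,
        "notification_interval": 60,
        "notification_options": "w,c,r",
        "event_handler": None,
    },
    "example-drive-template": {
        "use": "generic-service",
        "event_handler": "xi_service_event_handler",
    },
    "Drive C: Disk Usage": {
        "use": "example-drive-template",
        "notification_interval": 0,  # set locally, so it overrides the template
    },
}


def effective_value(name: str, field: str, definitions: dict) -> Optional[object]:
    """Walk up the 'use' chain and return the first non-blank value for field."""
    seen = set()  # guards against a template loop
    current = name
    while current is not None and current not in seen:
        seen.add(current)
        obj = definitions[current]
        value = obj.get(field)
        if value is not None:
            return value
        current = obj.get("use")
    return None


for field in ("event_handler", "notification_interval", "notification_options"):
    print(field, "=", effective_value("Drive C: Disk Usage", field, DEFINITIONS))
```

In this made-up example the service leaves the event handler blank but still gets one from its template, which is why filling in a blank field by hand can override something that was already set higher up.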
Former Nagios employee
toleolu

Re: Troubleshooting Alerts

Post by toleolu »

Ran the reports this morning and I still have alerts showing up with nothing in the notifications. Specifically, the State History report for the last 24 hours shows a warning-then-critical change on a server drive at 21:09, but the Notifications report for the same period has no record of any notification being sent at that time for that server. There are a couple of other instances like that on other servers, but overall everything matches up pretty well: if a state change is detected, there's a record of a notification being sent.
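The manual comparison described above could be sketched in Python roughly as follows. This assumes both reports have already been exported and loaded as lists of dictionaries. The field names, the host and service names, the dates, and the two-minute matching window are all assumptions for illustration, not the actual report format.

```python
from datetime import datetime, timedelta


def unmatched_state_changes(state_changes, notifications, window=timedelta(minutes=2)):
    """Return HARD state changes that have no notification for the same
    host and service within `window` after the change."""
    missing = []
    for change in state_changes:
        if change["state_type"] != "HARD":
            continue  # SOFT changes are skipped here
        matched = any(
            n["host"] == change["host"]
            and n["service"] == change["service"]
            and timedelta(0) <= n["time"] - change["time"] <= window
            for n in notifications
        )
        if not matched:
            missing.append(change)
    return missing


# Made-up sample rows, only to show the shape of the data.
state_changes = [
    {"host": "server-a", "service": "Drive C", "state": "CRITICAL",
     "state_type": "HARD", "time": datetime(2014, 1, 1, 21, 9)},
    {"host": "server-b", "service": "Drive D", "state": "WARNING",
     "state_type": "SOFT", "time": datetime(2014, 1, 1, 22, 13)},
    {"host": "server-b", "service": "Drive D", "state": "WARNING",
     "state_type": "HARD", "time": datetime(2014, 1, 1, 22, 15)},
]
notifications = [
    {"host": "server-b", "service": "Drive D",
     "time": datetime(2014, 1, 1, 22, 15, 30)},
]

missing = unmatched_state_changes(state_changes, notifications)
for change in missing:
    print(change["host"], change["service"], change["state"], change["time"])
```

With this sample data only the server-a change is reported, since the HARD change on server-b has a notification thirty seconds later.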

So if my thinking is correct, and a state change in the alert history should always have a matching entry in the notifications, what should I be looking at for this particular server? The check settings are every 5 minutes with 3 one-minute retries. Alerts are set to go out immediately, and only one email is sent. We're only doing alerts and notifications on Warning, Critical, Down, and Recovery, so it's pretty simple.

Of note, this particular server is still running the old agent, but I have other servers still running the old agents and they seem to be working OK. Additionally, I have a couple of servers that have the new agents and were set up via the Monitoring Wizard, doing the same thing. I have an instance of the same thing happening about an hour later last night on one of the servers with the new agent, drive state changed from warning to critical and no notifications went out.

All the contact settings for individual contacts and groups are correct, and the things I tried didn't seem to work, so any suggestions on where to look would be appreciated.

Mahalo
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Troubleshooting Alerts

Post by scottwilkerson »

In the State History report, are you looking at just HARD changes, or SOFT as well?

Notifications will not go out for SOFT changes. Additionally, if a host is down, service notifications will not go out. Finally, if the host has a parent host and the parent is down, the notification would be an UNREACHABLE, which you may not be sending.
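The three checks above can be written out as a small Python sketch. It is a simplified model for reasoning about a missing notification, not the actual Nagios logic, which applies further filters such as time periods, downtime, and contact-level settings. The function names and reason strings are made up. The option letters follow the usual notification options (w, c, u, r for services and d, u, r for hosts).

```python
# Map each state to the notification option letter that enables it.
SERVICE_OPTION = {"WARNING": "w", "CRITICAL": "c", "UNKNOWN": "u", "OK": "r"}
HOST_OPTION = {"DOWN": "d", "UNREACHABLE": "u", "UP": "r"}


def service_blocker(state, state_type, host_state, options, enabled=True):
    """Return why a service notification would be suppressed, or None."""
    if not enabled:
        return "notifications disabled"
    if state_type != "HARD":
        return "SOFT state change"
    if host_state != "UP":
        return f"host is {host_state}"
    letter = SERVICE_OPTION[state]
    if letter not in options:
        return f"'{letter}' not in notification options"
    return None


def host_blocker(state, state_type, options, enabled=True):
    """Return why a host notification would be suppressed, or None."""
    if not enabled:
        return "notifications disabled"
    if state_type != "HARD":
        return "SOFT state change"
    letter = HOST_OPTION[state]
    if letter not in options:
        return f"'{letter}' not in notification options"
    return None


# A HARD CRITICAL on a service whose host is UP, with w, c, r selected.
print(service_blocker("CRITICAL", "HARD", "UP", {"w", "c", "r"}))
# The same change while the host is DOWN.
print(service_blocker("CRITICAL", "HARD", "DOWN", {"w", "c", "r"}))
# A host that is UNREACHABLE because its parent is down, with only d and r selected.
print(host_blocker("UNREACHABLE", "HARD", {"d", "r"}))
```

A result of None means none of these three checks would block the notification, so the cause would have to be somewhere else.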
Former Nagios employee
toleolu

Re: Troubleshooting Alerts

Post by toleolu »

Yes, Scott, they are HARD changes.

We had one of our main DHCP servers go down over the weekend. Nagios XI did not generate any alerts on that server, but fortunately I had brought the old Nagios system back up on Friday, and our Help Desk got the alert from the old system.

How do I go about setting up a support call for someone to remote in, take a look at what's going on, and point me in the right direction on how to fix this?

Mahalo
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Troubleshooting Alerts

Post by lmiltchev »

You can send an email to [email protected]. This will open a new email support ticket in our system. We will then set up a time for the remote session and send you a remote session link.
Be sure to check out our Knowledgebase for helpful articles and solutions!
toleolu

Re: Troubleshooting Alerts

Post by toleolu »

Thanks, just submitted it.

I have access to all of this from home, so given the time difference between MN and HI, I can get up early in the morning and connect with you from home if that works best for you.

Mahalo
lmiltchev

Re: Troubleshooting Alerts

Post by lmiltchev »

You opened a new support ticket in our system, so we will continue communicating via email. I am locking this topic.