Detective Work
Posted: Wed Oct 06, 2021 1:24 pm
by omarrrthepirate
Hello. Early Monday morning, we had an outage of about 20 switches, all of which are monitored in our Nagios XI. We didn't get a single email notification that they went down; we only knew because our old monitoring solution (WhatsUp Gold) is still running and it did email us. Nagios XI notifications had been working great up until that point. How do I go about investigating what exactly happened? Can I check whether Nagios XI saw those devices go down and whether it attempted to send emails? I should also have received SMS notifications, as I am currently on call, but did not. Thank you in advance for all your help.
Re: Detective Work
Posted: Thu Oct 07, 2021 12:45 pm
by pbroste
Hello @omarrrthepirate
Do you see anything in the eventman log?
Code: Select all
/usr/local/nagiosxi/var/eventman.log
You may also want to reproduce the problem by triggering a test, to see whether a failing service check sends an alert.
Thanks,
Perry
Re: Detective Work
Posted: Fri Oct 08, 2021 8:46 am
by omarrrthepirate
I get a "permission denied" when I run that command as root.
Re: Detective Work
Posted: Fri Oct 08, 2021 3:27 pm
by pbroste
That path is the log file itself, not a command, which is most likely why running it returned "permission denied". To page through the eventman log, run this:
Code: Select all
less -SR /usr/local/nagiosxi/var/eventman.log
Thanks,
Perry
Re: Detective Work
Posted: Tue Oct 12, 2021 12:26 pm
by omarrrthepirate
This file is miles and miles long, and mostly has info about hosts. I need to investigate why Nagios XI is no longer sending email notifications when an endpoint goes down. The dashboard does show the host as down; it's just no longer emailing us. Are there any better logs to look at, or do I need to make my way through the entire eventman.log?
Re: Detective Work
Posted: Wed Oct 13, 2021 10:17 am
by pbroste
@omarrrthepirate
Thanks for following up. To help narrow things down, search the eventman.log for the particular alert keyword that you are interested in.
In my Nagios XI test environment, the emails being sent out show up with this line:
[referer] => includes/components/xicore/xicore.inc.php > Event Handler Notification Email
To search the eventman.log for keywords from this line, use:
grep -i -A 4 -B 4 'Event Handler Notification Email' /usr/local/nagiosxi/var/eventman.log
or grep -i -F -A 4 -B 4 'xicore.inc.php' /usr/local/nagiosxi/var/eventman.log
- Or grep for the specific alert keyword
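If you end up repeating these searches with different keywords, a small wrapper can save some typing. The following is only a sketch: search_log is a hypothetical helper name, not something shipped with Nagios XI, and it assumes a grep that supports the -A and -B context options (GNU grep does). It prints how many lines match, then each match with its line number and surrounding lines.

```shell
#!/bin/sh
# Hypothetical helper, not part of Nagios XI.
# Usage: search_log KEYWORD [LOGFILE] [CONTEXT_LINES]
search_log() {
    keyword=$1
    # Default to the eventman log path mentioned in this thread.
    logfile=${2:-/usr/local/nagiosxi/var/eventman.log}
    context=${3:-4}

    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi

    # -c counts matching lines, -i ignores case, -F treats the keyword
    # as a literal string (so the dots in xicore.inc.php match only dots).
    count=$(grep -c -i -F -- "$keyword" "$logfile" || true)
    echo "matches: $count"

    # Show each match with line numbers and surrounding context lines.
    if [ "$count" -gt 0 ]; then
        grep -n -i -F -A "$context" -B "$context" -- "$keyword" "$logfile"
    fi
    return 0
}
```

For example, search_log 'Event Handler Notification Email' searches the default log with four lines of context. A result of "matches: 0" would only mean that this exact keyword does not appear in the file; the wording of the log line in your environment may differ from my test system.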
Please PM your updated system profile for us to review so we can also dig into what is going on.
To send us your system profile:
- Log in to the Nagios XI GUI using a web browser.
- Click the "Admin" > "System Profile" Menu
- Click the "Download Profile" button
- Save the profile.zip file and send via Private Message
Thanks,
Perry