Detective Work

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
omarrrthepirate
Posts: 28
Joined: Fri Jul 09, 2021 1:13 pm
Location: Spokane, WA

Detective Work

Post by omarrrthepirate »

Hello. Early Monday morning, we had an outage of about 20 switches, all of which are in our Nagios XI. We didn't get a single email notification that they went down; we only knew because our old monitoring solution (WhatsUp Gold) is still running and it did email us. Nagios XI notifications had been working great up until that point. How do I go about investigating what exactly happened? Can I check whether it saw those devices go down and whether it attempted to send emails? I should also have gotten SMS notifications, as I am currently on call, but I did not. Thank you in advance for all your help.
Omarrr The Pirate!
Arrrrr
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Detective Work

Post by pbroste »

Hello @omarrrthepirate

Do you see anything in the eventman log?

Code:

/usr/local/nagiosxi/var/eventman.log
You may want to duplicate the issue and then trigger a test to see whether a service check sends an alert.

Thanks,
Perry
omarrrthepirate

Re: Detective Work

Post by omarrrthepirate »

I get a "permission denied" when I run that command as root.
Omarrr The Pirate!
Arrrrr
pbroste

Re: Detective Work

Post by pbroste »

The path above is the log file itself, not a command, so it cannot be executed. To page through the eventman log, run:

Code:

less -SR /usr/local/nagiosxi/var/eventman.log
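
If the log turns out to be large, it may be easier to page through only the newest entries first. Here is a minimal sketch of that idea; the function name `recent_log` and the default of 500 lines are assumptions for illustration and are not part of Nagios XI.

```shell
# Hypothetical helper: print only the last N lines of a log file.
# Usage: recent_log FILE [LINE_COUNT]   (LINE_COUNT defaults to 500)
recent_log() {
    tail -n "${2:-500}" "$1"
}

# Example: page through the newest entries instead of the full history.
# recent_log /usr/local/nagiosxi/var/eventman.log 500 | less -SR
```

Raising the line count reaches further back in time, which helps when the event you are after is a few days old.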
Thanks,
Perry
omarrrthepirate

Re: Detective Work

Post by omarrrthepirate »

This file is miles and miles long, and mostly has info about hosts. I need to investigate why Nagios XI is no longer sending email notifications when an endpoint goes down. On the dashboard, Nagios does show the host as being down; it's just no longer emailing us. Are there any better logs to look at, or do I need to make my way through the entire eventman.log?
Omarrr The Pirate!
Arrrrr
pbroste

Re: Detective Work

Post by pbroste »

@omarrrthepirate

Thanks for following up. You can narrow things down by searching eventman.log for the particular alert (keyword) that you are interested in.

My Nagios XI test environment is sending out emails that refer to:
  • [referer] => includes/components/xicore/xicore.inc.php > Event Handler Notification Email
To search eventman.log for keywords from this line, with four lines of context before and after each match, use:
  • grep -i -A 4 -B 4 'Event Handler Notification Email' /usr/local/nagiosxi/var/eventman.log
    or grep -i -A 4 -B 4 'xicore\.inc\.php' /usr/local/nagiosxi/var/eventman.log
  • Or grep for the specific alert keyword
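
The searches above can be wrapped in a small reusable function so that any keyword (a host name, for example) can be tried quickly. This is only a sketch: the name `search_log`, its argument order, and the `switch-01` host name in the usage comment are hypothetical and not part of Nagios XI.

```shell
# Hypothetical helper: case-insensitive, literal-string search with context.
# Usage: search_log KEYWORD FILE [CONTEXT_LINES]   (context defaults to 4)
search_log() {
    # -F treats the keyword as a fixed string, so the dots in a name like
    # xicore.inc.php are not interpreted as regex wildcards.
    # -C N prints N lines of context before and after each match.
    grep -F -i -C "${3:-4}" -- "$1" "$2"
}

# Examples (switch-01 is a placeholder host name):
# search_log 'Event Handler Notification Email' /usr/local/nagiosxi/var/eventman.log | less -SR
# search_log 'switch-01' /usr/local/nagiosxi/var/eventman.log 2 | less -SR
```

Using a fixed-string match keeps the search predictable when the keyword contains punctuation; drop `-F` only if a regular expression is actually wanted.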
Please PM your updated system profile for us to review so we can also dig into what is going on.

To send us your system profile:
  • Log in to the Nagios XI GUI using a web browser.
  • Click the "Admin" > "System Profile" menu
  • Click the "Download Profile" button
  • Save the profile.zip file and send via Private Message
Thanks,
Perry