Our NOC is using the Operations Screen to monitor alerts from Nagios. It is my understanding that when an alert fires, it is displayed on the screen and recorded in the event log.
We had a UPS device switch to battery power due to a power failure. At the bottom of the attached document, an alert is fired to the NOC, but there is no corresponding alert in the event log. Should there be? Also, no events are recorded between the last notification that was sent out and when it was down (35 minutes). And when power was restored, not all alerts appear to have been recorded. Am I misunderstanding what governs what the event log records?
Thanks,
Greg
Event Log
Re: Event Log
Well, I believe it is logged. The critical service alert is 3 entries above the entry you highlighted, and the notification is right above it.
I am not sure what the problem is.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Event Log
I think the problem is what I was expecting to find in the event log versus how events are logged.
Nagios issues three warnings, all of which are recorded in the event log. This corresponds to the check settings for check interval, retry interval and max attempts: 10 - 2 - 3.
But the Critical alert appears in the event log only once, showing the battery capacity at 31%. The alert captured below from the Operations Screen shows Critical with the battery capacity at 11%.
So I am wondering why that was not recorded in the event log the way the warnings were.
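To make the 10 - 2 - 3 settings concrete, here is a rough sketch of the timeline I would expect for a newly detected problem. It assumes the intervals are in minutes (the default Nagios interval_length of 60 seconds) and that the first two attempts count as SOFT states and the third as HARD. The function name attempt_timeline is made up for illustration and is not part of Nagios.

```shell
#!/bin/sh
# Hypothetical sketch: expected attempt timeline for a new problem.
# Values are taken from the post; the unit (minutes) is an assumption.
check_interval=10   # minutes between regular checks
retry_interval=2    # minutes between rechecks while the problem is unconfirmed
max_attempts=3      # attempts before the state is treated as confirmed (HARD)

attempt_timeline() {
    attempt=1
    elapsed=0
    while [ "$attempt" -le "$max_attempts" ]; do
        # Attempts before the last one are SOFT; the last one is HARD.
        if [ "$attempt" -lt "$max_attempts" ]; then
            state_type=SOFT
        else
            state_type=HARD
        fi
        echo "t+${elapsed}min attempt ${attempt}/${max_attempts} ${state_type}"
        attempt=$((attempt + 1))
        elapsed=$((elapsed + retry_interval))
    done
    # After the HARD attempt, checks fall back to the regular interval.
    echo "next regular check at t+$((elapsed - retry_interval + check_interval))min"
}

attempt_timeline
```

With these values the sketch lists three attempts at t+0, t+2 and t+4 minutes, which would line up with the three warning entries in the log.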
Re: Event Log
The discrepancy in the critical battery capacity value can probably be explained by the fact that the last check shown in the Operations Screen is at 2015-05-20 13:45:04 (20 minutes after the last state change). I am not sure why this does not appear in the log. Have you tried grepping the log? You can try something like this:
grep "APC UPS CHECK" /usr/local/nagios/var/nagios.log | grep "CRIT BATTERY CAPACITY" | perl -pe 's/(\d+)/localtime($1)/e'
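If the chained grep comes back empty, a slightly broader filter may help show everything that was logged for the service around the outage. This is only a sketch: it assumes the usual nagios.log line layout of "[epoch] SERVICE ALERT: ..." and "[epoch] SERVICE NOTIFICATION: ...", and the helper name service_events is hypothetical.

```shell
#!/bin/sh
# Sketch: list alert and notification entries for one service, with the
# leading epoch timestamp converted to local time.
# Example call (path as used in the command above):
#   service_events "APC UPS CHECK" /usr/local/nagios/var/nagios.log
service_events() {
    service="$1"
    logfile="$2"
    # Match the service name literally, then keep only alert and
    # notification lines, then rewrite "[epoch]" as "[local time]".
    grep -F "$service" "$logfile" \
        | grep -E 'SERVICE (ALERT|NOTIFICATION):' \
        | perl -pe 's/^\[(\d+)\]/"[" . localtime($1) . "]"/e'
}
```

Comparing the alert lines with the notification lines in that output should make it easier to see which checks produced a log entry and which only updated the Operations Screen.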