Help with critical subsystem alerts

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
mkojder
Posts: 5
Joined: Wed Apr 20, 2016 6:30 am

Help with critical subsystem alerts

Post by mkojder »

Hey all,

So we run a pair of VMs, and our ESX team was messing with the SAN cabling, which caused issues with the underlying storage.

We didn't notice until I tried to access Nagios Log Server today and saw unallocated shards and the Elasticsearch service saying it was waiting to start. I rebooted the VMs and allocated the shards, but I lost three days of logs....

There were no critical alerts during the outage...

SIS Nagios Critical Alert Mon, 11 Mar 2019 07:42:26 -0400 critical CRITICAL: 12 matching entries found |logs=12;5;10 5m 15m
SIS Nagios Critical Alert Mon, 11 Mar 2019 07:37:12 -0400 critical CRITICAL: 12 matching entries found |logs=12;5;10 5m 15m
SIS Nagios Critical Alert Mon, 11 Mar 2019 07:32:09 -0400 warning WARNING: 6 matching entries found |logs=6;5;10 5m 15m
SIS Nagios Critical Alert Fri, 08 Mar 2019 16:16:26 -0500 ok OK: 0 matching entries found |logs=0;5;10 5m 15m
SIS Nagios Critical Alert Fri, 08 Mar 2019 16:16:26 -0500 ok OK: 0 matching entries found |logs=0;5;10 5m 15m
SIS Nagios Critical Alert Fri, 08 Mar 2019 16:11:21 -0500 ok OK: 0 matching entries found |logs=0;5;10 5m 15m
SIS Nagios Critical Alert Fri, 08 Mar 2019 16:06:11 -0500 ok OK: 0 matching entries found |logs=0;5;10 5m 15m
SIS Nagios Critical Alert Fri, 08 Mar 2019 16:01:11 -0500 ok OK: 0 matching entries found |logs=0;5;10 5m 15m 5 10
SIS Nagios Critical Alert Fri, 08 Mar 2019 15:56:06 -0500 ok OK: 0 matching entries found |logs=0;5;10

Is there any way to set up an alert so that, say, logs received dropping to ZERO counts as a critical alert?

Thanks,

Matt
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Help with critical subsystem alerts

Post by scottwilkerson »

When you create an alert, hover over the ? next to the thresholds; you will see a note saying to use 1: in the warning and critical values to alert if nothing is found.
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart

Re: Help with critical subsystem alerts

Post by mkojder »

So if I -

Put :1 in both thresholds,

and select 'Only alert when Warning or Critical threshold is met' while putting the check and interval periods to 5 minutes,

I should get an alert saying no logs received for 5 minutes?

Matt

Re: Help with critical subsystem alerts

Post by scottwilkerson »

mkojder wrote: So if I -

Put :1 in both thresholds,

and select 'Only alert when Warning or Critical threshold is met' while putting the check and interval periods to 5 minutes,

I should get an alert saying no logs received for 5 minutes?

Matt

Yes, but it is 1: (colon after the number), NOT :1
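
For anyone finding this thread later, here is a rough sketch of why the position of the colon matters. It assumes the alert thresholds follow the standard Nagios plugin range notation (a bare N alerts outside 0..N, N: alerts below N, ~:N alerts above N, a leading @ inverts the range). The function names are made up for illustration, and treating an empty start as 0 is an assumption; Nagios Log Server's own parser may handle :1 differently.

```python
import math


def parse_range(spec):
    """Parse a Nagios-style range string into (start, end, inside).

    Hypothetical helper for illustration; not Nagios Log Server code.
    """
    inside = spec.startswith("@")  # '@' means alert when INSIDE the range
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_s, end_s = spec.split(":", 1)
    else:
        start_s, end_s = "0", spec  # bare "N" means the range 0..N
    if start_s == "~":
        start = -math.inf  # '~' stands for negative infinity
    elif start_s == "":
        start = 0.0  # assumption: an omitted start is treated as 0
    else:
        start = float(start_s)
    end = math.inf if end_s == "" else float(end_s)  # omitted end = infinity
    return start, end, inside


def alerts(spec, value):
    """Return True if `value` should raise an alert for threshold `spec`."""
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end
    return in_range if inside else not in_range


# "1:" is the range 1..infinity, so zero matching entries falls outside it.
print(alerts("1:", 0), alerts("1:", 12))
# ":1" is read here as 0..1, so zero entries is inside and does not alert.
print(alerts(":1", 0), alerts(":1", 12))
```

Under those assumptions, 1: fires when the count drops to zero, while :1 stays quiet at zero and fires on anything above 1, which is the opposite of what is wanted for a "no logs received" alert.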

Re: Help with critical subsystem alerts

Post by mkojder »

Thanks!

Re: Help with critical subsystem alerts

Post by scottwilkerson »

mkojder wrote:Thanks!
Glad to help