Help with critical subsystem alerts
Posted: Mon Mar 11, 2019 9:19 am
Hey all,
So we run a pair of VMs, and our ESX team was messing with the SAN cabling, which caused issues with the underlying storage.
We didn't notice until I tried to access Nagios Log Server today and saw unallocated shards and the Elasticsearch service saying it was waiting to start. I rebooted the VMs and allocated the shards, but I lost three days of logs.
There were no critical alerts during the outage. Here is the alert history:
SIS Nagios Critical Alert Mon, 11 Mar 2019 07:42:26 -0400 critical CRITICAL: 12 matching entries found |logs=12;5;10 5m 15m
SIS Nagios Critical Alert Mon, 11 Mar 2019 07:37:12 -0400 critical CRITICAL: 12 matching entries found |logs=12;5;10 5m 15m
SIS Nagios Critical Alert Mon, 11 Mar 2019 07:32:09 -0400 warning WARNING: 6 matching entries found |logs=6;5;10 5m 15m
SIS Nagios Critical Alert Fri, 08 Mar 2019 16:16:26 -0500 ok OK: 0 matching entries found |logs=0;5;10 5m 15m
SIS Nagios Critical Alert Fri, 08 Mar 2019 16:16:26 -0500 ok OK: 0 matching entries found |logs=0;5;10 5m 15m
SIS Nagios Critical Alert Fri, 08 Mar 2019 16:11:21 -0500 ok OK: 0 matching entries found |logs=0;5;10 5m 15m
SIS Nagios Critical Alert Fri, 08 Mar 2019 16:06:11 -0500 ok OK: 0 matching entries found |logs=0;5;10 5m 15m
SIS Nagios Critical Alert Fri, 08 Mar 2019 16:01:11 -0500 ok OK: 0 matching entries found |logs=0;5;10 5m 15m 5 10
SIS Nagios Critical Alert Fri, 08 Mar 2019 15:56:06 -0500 ok OK: 0 matching entries found |logs=0;5;10
Is there any way to set up an alert so that, say, the number of logs received dropping to ZERO is treated as critical?
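To make the question concrete, here is a rough Python sketch of the behaviour I'm after. It is only an illustration, not how Nagios Log Server evaluates alerts internally: the function names and the `min_expected` parameter are made up, and I'm assuming the counted value would be all logs received in the window rather than just the critical ones. The first check mirrors the history above, where the state goes up as the count rises past 5 and 10; the second is the inverted check I'd like, where a count of zero is the problem.

```python
def parse_perfdata(perfdata):
    """Split a perfdata string like 'logs=12;5;10' into (label, value, warn, crit)."""
    label, _, rest = perfdata.partition("=")
    value, warn, crit = (int(field) for field in rest.split(";")[:3])
    return label, value, warn, crit


def upper_bound_state(value, warn, crit):
    """Current behaviour (as seen in the history above): alert when the count is too HIGH."""
    if value > crit:
        return "CRITICAL"
    if value > warn:
        return "WARNING"
    return "OK"


def lower_bound_state(value, min_expected=1):
    """Desired behaviour (hypothetical): alert when the count is too LOW."""
    # min_expected is an assumed parameter: the fewest log entries
    # a healthy system should produce in the lookback window.
    if value < min_expected:
        return "CRITICAL"
    return "OK"


if __name__ == "__main__":
    for sample in ("logs=12;5;10", "logs=6;5;10", "logs=0;5;10"):
        label, value, warn, crit = parse_perfdata(sample)
        print(sample, upper_bound_state(value, warn, crit), lower_bound_state(value))
```

With the upper-bound check, a count of 0 is always OK, which is why nothing fired while storage was down. The lower-bound check is what would have caught it.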
Thanks,
Matt