False alerts on nothing being found
Posted: Wed Feb 17, 2021 7:25 am
We have an alert configured that triggers when a certain text has not appeared in a log file for a certain period.
This was working fine before, but suddenly it keeps reporting that the text is no longer being found, while it's clearly there.
When we look at the alert history, all is fine:
Run Time Status Alert Output Interval Lookback Warning Critical
Wed, 17 Feb 2021 13:00:19 +0100 OK OK: 4641 matching entries found |logs=4641;1:;1: 30m 30m 1: 1:
Wed, 17 Feb 2021 12:48:42 +0100 OK OK: 11507 matching entries found |logs=11507;1:;1: 30m 30m 1: 1:
Wed, 17 Feb 2021 12:18:26 +0100 OK OK: 11591 matching entries found |logs=11591;1:;1: 30m 30m 1: 1:
Wed, 17 Feb 2021 11:48:12 +0100 OK OK: 43092 matching entries found |logs=43092;1:;1: 30m 30m 1: 1:
Wed, 17 Feb 2021 11:18:10 +0100 OK OK: 3984 matching entries found |logs=3984;1:;1: 30m 30m 1: 1:
Wed, 17 Feb 2021 11:08:03 +0100 OK OK: 11802 matching entries found |logs=11802;1:;1: 30m 30m 1: 1:
Wed, 17 Feb 2021 10:38:02 +0100 OK OK: 507 matching entries found |logs=507;1:;1: 30m 30m 1: 1:
Wed, 17 Feb 2021 10:36:35 +0100 OK OK: 62 matching entries found |logs=62;1:;1: 30m 30m 1: 1:
Wed, 17 Feb 2021 10:36:26 +0100 OK OK: 42079 matching entries found |logs=42079;1:;1: 30m 30m 1: 1:
But the alert we get via e-mail states the opposite.
Here is the full alert output:
CRITICAL: 0 matching entries found |logs=0;1:;1:
The last log from the alert query:
No matching logs found.
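For reference, this is how I read the thresholds. It is a minimal sketch that assumes the alert uses the standard Nagios plugin range syntax, where `1:` means "alert when the value is below 1". The function names are my own and are not part of Nagios Log Server. It only shows that the status follows from the reported count, so the question is why the e-mailed run counts 0 entries.

```python
import re


def parse_perfdata(output):
    """Return (label, value, warn_range, crit_range) for the first metric
    after the '|' in a Nagios-style plugin output line."""
    _, _, perf = output.partition("|")
    m = re.match(r"\s*(\w+)=(\d+(?:\.\d+)?);([^;\s]*);([^;\s]*)", perf)
    if m is None:
        raise ValueError("no performance data found in: %r" % output)
    label, value, warn, crit = m.groups()
    return label, float(value), warn, crit


def violates(value, spec):
    """Return True if `value` should raise an alert for range `spec`.

    Assumed syntax (standard Nagios ranges): 'N' = 0..N, 'N:' = N..infinity,
    '~:N' = -infinity..N, 'N:M' = N..M; a leading '@' inverts the check.
    """
    if not spec:
        return False
    inverted = spec.startswith("@")
    if inverted:
        spec = spec[1:]
    if ":" in spec:
        low_s, high_s = spec.split(":", 1)
    else:
        low_s, high_s = "0", spec
    low = float("-inf") if low_s == "~" else float(low_s or 0)
    high = float(high_s) if high_s else float("inf")
    in_range = low <= value <= high
    return in_range if inverted else not in_range


def status(output):
    """Derive OK / WARNING / CRITICAL from the count and its thresholds."""
    _, value, warn, crit = parse_perfdata(output)
    if violates(value, crit):
        return "CRITICAL"
    if violates(value, warn):
        return "WARNING"
    return "OK"


print(status("OK: 4641 matching entries found |logs=4641;1:;1:"))
print(status("CRITICAL: 0 matching entries found |logs=0;1:;1:"))
```

So the thresholds themselves look consistent in both outputs; only the number of matching entries differs between the history and the e-mail.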
We've rebooted Nagios Log Server in case there was a hung process somewhere, but this did not solve the issue.
Any clue on how to get to the root cause of this?