False alerts on nothing being found
We have an alert configured that triggers when a certain text no longer appears in a log file for a certain period.
This was working fine before, but suddenly it keeps reporting that the text is no longer found, while it's clearly there.
When we look at the history, all is fine.
Run Time                         Status  Alert Output                                        Interval  Lookback  Warning  Critical
Wed, 17 Feb 2021 13:00:19 +0100  OK      OK: 4641 matching entries found |logs=4641;1:;1:    30m       30m       1:       1:
Wed, 17 Feb 2021 12:48:42 +0100  OK      OK: 11507 matching entries found |logs=11507;1:;1:  30m       30m       1:       1:
Wed, 17 Feb 2021 12:18:26 +0100  OK      OK: 11591 matching entries found |logs=11591;1:;1:  30m       30m       1:       1:
Wed, 17 Feb 2021 11:48:12 +0100  OK      OK: 43092 matching entries found |logs=43092;1:;1:  30m       30m       1:       1:
Wed, 17 Feb 2021 11:18:10 +0100  OK      OK: 3984 matching entries found |logs=3984;1:;1:    30m       30m       1:       1:
Wed, 17 Feb 2021 11:08:03 +0100  OK      OK: 11802 matching entries found |logs=11802;1:;1:  30m       30m       1:       1:
Wed, 17 Feb 2021 10:38:02 +0100  OK      OK: 507 matching entries found |logs=507;1:;1:      30m       30m       1:       1:
Wed, 17 Feb 2021 10:36:35 +0100  OK      OK: 62 matching entries found |logs=62;1:;1:        30m       30m       1:       1:
Wed, 17 Feb 2021 10:36:26 +0100  OK      OK: 42079 matching entries found |logs=42079;1:;1:  30m       30m       1:       1:
But the alert we get via e-mail states the opposite.
Here is the full alert output:
CRITICAL: 0 matching entries found |logs=0;1:;1:
The last log from the alert query:
No matching logs found.
We've rebooted Nagios Log Server, as there might be a hung process somewhere, but this did not solve the issue.
Any clue on how to get to the root cause of this?
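For readers unfamiliar with the "1:" thresholds in the table, here is a minimal sketch of how a Nagios-style range is usually read: "1:" means "1 or more is fine, anything below alerts". The function names check_range and perfdata_status are made up for illustration and are not part of Nagios Log Server; only simple integer ranges are handled.

```shell
#!/usr/bin/env bash
# Sketch: evaluate a Nagios-style threshold range such as "1:" against a value.
# Supported forms (integers only): "N" (0..N), "N:" (N..inf), "N:M", "~:M" (-inf..M).
# The inverted "@" form is not handled here.

# Prints "alert" if the value lies OUTSIDE the range, otherwise "ok".
check_range() {
    local value=$1 range=$2 min max
    if [[ $range == *:* ]]; then
        min=${range%%:*}
        max=${range#*:}
    else
        min=0
        max=$range
    fi
    if [[ $min != "~" ]] && (( value < min )); then
        echo "alert"; return
    fi
    if [[ -n $max ]] && (( value > max )); then
        echo "alert"; return
    fi
    echo "ok"
}

# Takes perfdata like "logs=0;1:;1:" (value;warning;critical) and prints a state.
perfdata_status() {
    local perf=$1 value warn crit
    IFS=';' read -r value warn crit _ <<< "${perf#*=}"
    if [[ $(check_range "$value" "$crit") == alert ]]; then
        echo "CRITICAL"
    elif [[ $(check_range "$value" "$warn") == alert ]]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

perfdata_status "logs=0;1:;1:"      # the e-mailed result, prints CRITICAL
perfdata_status "logs=4641;1:;1:"   # a row from the history table, prints OK
```

So the e-mailed CRITICAL is the expected state for a query that returned 0 hits; the open question is why that run found nothing while the history shows thousands of matches.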
Re: False alerts on nothing being found
Ok, now this is getting weirder.
I've deleted the alert and created a new one.
Keeps giving false alarms.
I've now deactivated the alert and it still keeps mailing false alarms every 30 minutes...
Re: False alerts on nothing being found
The e-mail seems to reference an alert that no longer exists: AWSD132lSptOOhacSd9u.
I can no longer open it though.
http://nagiosls.mydomain.lan/nagioslogs ... T19:07:27Z
Re: False alerts on nothing being found
Is this alert in the configuration that was sent for the other issue? If so, what is the name?
I'd also like to get a profile from the NLS system. It can be gathered under Admin > System > System Status > Download System Profile or from the command line with:
/usr/local/nagioslogserver/scripts/profile.sh
This will create /tmp/system-profile.tar.gz.
Note that this file can be very large and may be too big to upload through a private message. You can split the file into smaller files with the split command on the NLS (or another Linux machine) command line:
Code: Select all
split -b 5000000 /tmp/system-profile.tar.gz system-profile- -d
The above command will split system-profile.tar.gz into 5MB segments and save them to files with the naming convention system-profile-nn.
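Before uploading the parts, it can be worth confirming that they reassemble into the original file. The sketch below is one way to do that, assuming GNU coreutils (split, sha256sum); the function name split_and_verify is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: split a file into 5,000,000-byte numbered parts, then confirm that
# concatenating the parts reproduces the original byte for byte.

split_and_verify() {
    local src=$1 prefix=$2 original rebuilt
    # -b 5000000: part size in bytes; -d: numeric suffixes (00, 01, ...)
    split -b 5000000 -d "$src" "$prefix" || return 1
    original=$(sha256sum < "$src" | cut -d' ' -f1)
    # The glob expands in sorted order, which matches split's numbering.
    rebuilt=$(cat "$prefix"* | sha256sum | cut -d' ' -f1)
    [[ $original == "$rebuilt" ]]
}
```

For example, `split_and_verify /tmp/system-profile.tar.gz system-profile- && echo "parts are consistent"`. The receiving side can rebuild the archive with `cat system-profile-* > system-profile.tar.gz`.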
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: False alerts on nothing being found
Yes, this alert is in the same Nagios Log Server installation, but it's a different alert.
I'll send the System Profile shortly.
Re: False alerts on nothing being found
FYI
I tried to delete the alert again by constructing the URL with the alert id but this also didn't help.
http://nagiosls.mydomain.lan/nagioslogs ... tOOhacSd9u
Hope you can find something in the logs.
Re: False alerts on nothing being found
Run the following to check the backend to see if the alert is still there:
Code: Select all
curl -XGET 'localhost:9200/nagioslogserver/alert/_search?q=_id:AWSD132lSptOOhacSd9u&pretty'
Delete it if it is found:
Code: Select all
curl -XDELETE 'localhost:9200/nagioslogserver/alert/AWSD132lSptOOhacSd9u?pretty'
Also delete the alert history:
Code: Select all
curl -XDELETE 'localhost:9200/nagioslogserver_history'
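If the check needs to be repeated, the "is it still there?" step can be scripted. This is only a sketch: hits_total and alert_exists are hypothetical helper names, and the parsing assumes the pretty-printed response layout that `&pretty` produces, with "hits" and its "total" on separate lines.

```shell
#!/usr/bin/env bash
# Sketch: decide from a pretty-printed Elasticsearch search response whether
# any document matched. A compact one-line response is NOT handled.

# Reads the response on stdin and prints the number under "hits" -> "total".
hits_total() {
    awk '
        /"hits"[ \t]*:[ \t]*[{]/ { in_hits = 1 }
        in_hits && /"total"/ {
            gsub(/[^0-9]/, "")   # keep only the digits on this line
            print
            exit
        }
    '
}

# Hypothetical wrapper: succeeds if the backend still holds the given alert ID.
alert_exists() {
    local id=$1 total
    total=$(curl -s -XGET "localhost:9200/nagioslogserver/alert/_search?q=_id:${id}&pretty" | hits_total)
    [[ ${total:-0} -gt 0 ]]
}
```

Usage would look like `alert_exists AWSD132lSptOOhacSd9u && echo "still present" || echo "not found"`.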
Re: False alerts on nothing being found
Thanks for the commands.
The alert seems to be gone already.
Code: Select all
# curl -XGET 'localhost:9200/nagioslogserver/alert/_search?q=_id:AWSD132lSptOOhacSd9u&pretty'
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
I deleted the history too.
The alerts are still being mailed.
These truly are new alerts, not old e-mails stuck in the Exchange server.
What we do notice is that the e-mail says the alert returned with a CRITICAL state at Fri, 19 Feb 2021 10:21:26 -0600,
which is strange because we are configured for +0100, not -0600.
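To line up the e-mail timestamp with the +0100 times shown in the alert history, the -0600 value can be converted. A minimal sketch, assuming GNU date; the function name to_plus0100 is made up for illustration, and this only translates the notation, it does not explain where the -0600 comes from.

```shell
#!/usr/bin/env bash
# Sketch: re-express an RFC 2822 style timestamp at a fixed UTC+1 offset.
# Assumes GNU date (for -d). 'CET-1' is a POSIX TZ string meaning
# "a zone labelled CET, one hour east of UTC", with no daylight saving rule.

to_plus0100() {
    LC_ALL=C TZ='CET-1' date -d "$1" '+%a, %d %b %Y %H:%M:%S %z'
}

to_plus0100 'Fri, 19 Feb 2021 10:21:26 -0600'
# prints: Fri, 19 Feb 2021 17:21:26 +0100
```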
Re: False alerts on nothing being found
And the link in the e-mails still points to AWSD132lSptOOhacSd9u?
It is very odd. I'd like to get a fresh copy of the nagioslogserver index as well as nagioslogserver_history:
Code: Select all
curl -XPOST 'http://localhost:9200/nagioslogserver/_export?path=/tmp/nagioslogserver.tar.gz'
curl -XPOST 'http://localhost:9200/nagioslogserver_history/_export?path=/tmp/nagioslogserver_history.tar.gz'
Re: False alerts on nothing being found
We received a few more alerts after executing the commands.
So it did not stop right after the commands.
But now it seems to have stopped!
Last alert was Fri, 19 Feb 2021 16:52:21 -0600
So let's wait and see over the weekend.