Query to filter the logs and create alerts

Posted: Wed Jun 28, 2017 5:17 am
by rkpotdar
Hello Team,
I am working on creating a query on the dashboard and using it to generate email alerts.

The query is something like 1.1.1.1 && apache_error* && 404. It is added under the Load query tab, and its purpose is to check for Apache 404 errors from a specific IP.
It works fine: it filters the matching entries out of a long list of log traces on the dashboard, and the query is saved.

But when this query is used to generate an email alert, it fails, and the alert keeps sending OK messages all the time.
I have set the Log interval and loopback interval (say 2m for both) when creating the alert and set the Warning and Critical counts to 1 each.

I can see in the Advanced Manager query (user email alerts - edit query) that the filter is copied as is, and I am not sure whether this exact duplicate of the query string will give us the expected result.

Why can't we create and edit the query on the NLS dashboard and use the same in generating alerts?
While errors are being induced, they can be filtered on the dashboard (using the Load query option), but the alert misses them and keeps sending OK messages.

Is there a link I can refer to for creating queries and using them to generate email alerts?

Re: Query to filter the logs and create alerts

Posted: Wed Jun 28, 2017 8:13 am
by mcapra
rkpotdar wrote: Why can't we create and edit the query on the NLS dashboard and use the same in generating alerts?
At least as of Nagios Log Server 1.4.4, the queries used by alerting are pass-by-value rather than pass-by-reference. When you create an alert from an existing query, that query is "copied" over to the alert. So, if you go back and change the query later on, you would need to re-create the alert.
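To illustrate the copy behaviour described above, here is a minimal sketch in plain Python. It is not Nagios Log Server code: the `Alert` class, the `saved_queries` dictionary, and the names in it are all hypothetical, and the only thing it models is that the alert stores its own copy of the query text at creation time.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical stand-in for an alert; not an NLS data structure."""
    name: str
    query: str  # the alert's own copy of the query text


# Hypothetical saved dashboard queries, keyed by an illustrative name.
saved_queries = {"apache-404": "1.1.1.1 && apache_error* && 404"}

# Creating the alert copies the query text as it exists right now.
alert = Alert(name="apache-404-email", query=saved_queries["apache-404"])

# Editing the saved dashboard query later does not touch the alert's copy.
saved_queries["apache-404"] = "2.2.2.2 && apache_error* && 404"

print("dashboard query:", saved_queries["apache-404"])
print("alert query:    ", alert.query)
```

In this sketch the alert still holds the original 1.1.1.1 query after the dashboard query is edited, which is why the alert would need to be re-created to pick up the change.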

I remember having some discussions around this design choice a while back. Not sure what ever came of it. Perhaps one of the techs has a task ID to reference?

Re: Query to filter the logs and create alerts

Posted: Wed Jun 28, 2017 4:46 pm
by cdienger
The OK message means the thresholds on the alert are not being met. Double check the settings there and, if more help is needed, provide screenshots of the dashboard highlighting the query, the OK message including the alert output field, the Alert settings including the Advanced section showing the query, and the mail settings under Administration > General > Mail Settings. You may find some useful debug information in /var/log/maillog.
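As a rough illustration of why an alert can keep reporting OK, the sketch below maps a count of matching log events to a status. The function name `alert_status` is hypothetical, and the comparison rule (a threshold is met once the count reaches it) is an assumption about Nagios-style thresholds, not something taken from the NLS source.

```python
def alert_status(match_count: int, warning: int, critical: int) -> str:
    """Map a count of matching events to a Nagios-style status.

    Assumption: a threshold is met when the count reaches it (>=).
    """
    if match_count >= critical:
        return "CRITICAL"
    if match_count >= warning:
        return "WARNING"
    return "OK"


# Thresholds from the original post: Warning = 1 and Critical = 1.
print(alert_status(0, warning=1, critical=1))  # no matches in the window
print(alert_status(3, warning=1, critical=1))  # three matches in the window
```

Under these assumptions, thresholds of 1 and 1 only produce OK when the alert's query matches zero events in the time window it checks, so a constant OK suggests comparing what the alert's copy of the query actually returns for that window against what the dashboard shows.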

Regarding the alert queries being pass-by-value, that is not likely to change at the moment, since pass-by-reference would be a problem in cases where the query was deleted or the alert name was changed. This will be mentioned in revised documentation when the next NLS version is released (www.nagios.com/roadmaps).