_grokparsefailure help
Guys/Gals,
I'm creating a nested ruleset for matching node.js log entries for an application we have. It appears that I'm capturing all the relevant data, yet still getting my entries tagged with _grokparsefailure.
Is there any more detailed troubleshooting that I can do to find the source of my problem? I tested my individual rules at http://grokdebug.herokuapp.com to make sure they're working.
I'll attach the filter and a sample of log entries if anyone wants to take a look or offer some advice.
Re: _grokparsefailure help
I will test this on my end. In the meantime, what kind of input is handling your logs? A _grokparsefailure can occur if you send logs like these through a 'syslog' input in particular.
http://kartar.net/2014/09/when-logstash ... -go-wrong/
Re: _grokparsefailure help
They're coming across from logstash-forwarder in lumberjack format (so I can encrypt with SSL).
Re: _grokparsefailure help
What's the actual input you are using, though? Is it syslog, lumberjack, or something else?
Former Nagios employee
Re: _grokparsefailure help
So the node.js application is outputting errors to a log in the format the developer specified. I am using logstash-forwarder to package the logs, encrypt them, and ship them to the server in lumberjack format, so the incoming format is lumberjack and the input is specified as lumberjack. I have this working for other Apache and Nginx logs, just not for the node.js ones.
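For reference, the input side of my setup looks roughly like this (the port and certificate paths here are placeholders, not my actual values):

```
input {
  lumberjack {
    port => 5043                                                      # placeholder port
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"    # placeholder path
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"          # placeholder path
    type => "node-logs"
  }
}
```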
Re: _grokparsefailure help
I cut your filter down dramatically (removed many mutates and if [file] checks):
This filter matched every log that I sent at it, except for this one:
I do not see a grok filter built for the above log, which explains why it wouldn't be matched.
By default, grok uses 'break_on_match => true', meaning the patterns are tried in order and grok stops at the first one that matches a given log line; the _grokparsefailure tag is only added when none of the patterns match.
Try using the above and let me know if your results are improved - I do not understand why there are if [file] == checks, or why mutate is being used so extensively - I imagine that I'm missing something. Let me know.
Thanks!
if [type] == 'node-logs' {
  if "BOUNCED" not in [tags] {
    grok {
      match => [
        'message', '(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{YEAR} %{TIME} GMT%{NUMBER:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] \[%{UUID}\] Error response received from FreeSWITCH - %{GREEDYDATA:ErrorMessage}' ]
      match => [
        'message', '(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] \[%{UUID}\]( \[%{INT:ResultCode};%{GREEDYDATA:ResultText};%{INT:NumberOfBadRows};%{INT:BadRowRun};%{WORD:ErrorCorrection};%{INT:PagesTransferred};%{INT:TotalPagesExpected}\])? Fax %{WORD:FaxStatus}: There are currently %{INT:ActiveFaxes} active faxes' ]
      match => [
        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Joined conference: %{INT:ConferenceNumber} conference default' ]
      match => [
        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Inbound fax: From:%{GREEDYDATA:From} To: %{INT:To}\(%{POSINT:KeystoneID}\)' ]
      match => [
        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Inbound call: \{accountcode=%{INT:KeystoneID},orientation=%{WORD:Orientation}\}sofia\/%{WORD:Gateway}\/%{INT:DestinationNumber}\@%{IP:DestinationAddress}:%{POSINT:Port}' ]
      match => [
        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Outbound call: \{accountcode=%{INT:KeystoneID},orientation=%{WORD:Orientation}\}sofia\/gateway\/%{WORD:OutboundGateway}\/%{INT:DialedNumber}' ]
    }
  }
}
Wed Sep 02 2015 08:43:39 GMT-0400 (EDT) [pksendfaxd] Received a fax request from [email protected]
Re: _grokparsefailure help
I think a lot of those choices are because I'm new to writing these filters, so I use the tools and format I'm familiar with. I've yet to come across a good guide to writing filters that makes sense to me, so it's a learn-as-I-go sort of thing.
I did previously try using multiple match statements within grok, but they failed at the first match and never went beyond it, which is why I checked for the file names. Logstash-forwarder sends the filename and any other specified fields along with the data, so I used that to parse the logs.
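The per-file checks I was using looked roughly like this (the filename pattern and the grok pattern here are placeholders for illustration, not the real ones):

```
filter {
  if [file] =~ /node-app\.log$/ {            # placeholder filename pattern
    grok {
      match => [ 'message', '%{GREEDYDATA:ErrorMessage}' ]   # placeholder pattern
    }
  }
}
```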
I will give yours a shot and let you know how it works out. Thanks!
Re: _grokparsefailure help
No problem - let me know how it works for you. Thanks!
Re: _grokparsefailure help
Also, I was using the file checks and matches to set the type field. Is there any way to do that while including your changes?
One last thing I forgot: the mutate is to drop the @message field so I'm not storing double data. I've got all the bits I need parsed into fields, so why store the data again in the database as a redundant message field? I only want to keep it in case of parse failures; I planned on creating an alert for a message that doesn't match my parsing, i.e. the unknown and unexpected error.
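On the output side, something along these lines is what I had in mind for catching the unmatched ones (the output path is a placeholder):

```
output {
  if "_grokparsefailure" in [tags] {
    file {
      path => "/var/log/logstash/unparsed.log"   # placeholder path
    }
  }
}
```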
Re: _grokparsefailure help
Understood - dropping the message field costs CPU, while keeping it costs disk. If you'd like it removed after a successful match, add a mutate with remove_field after the match statements, guarded by a check for _grokparsefailure, as in the updated filter below.
if [type] == 'node-logs' {
  if "BOUNCED" not in [tags] {
    grok {
      match => [
        'message', '(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{YEAR} %{TIME} GMT%{NUMBER:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] \[%{UUID}\] Error response received from FreeSWITCH - %{GREEDYDATA:ErrorMessage}' ]
      match => [
        'message', '(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] \[%{UUID}\]( \[%{INT:ResultCode};%{GREEDYDATA:ResultText};%{INT:NumberOfBadRows};%{INT:BadRowRun};%{WORD:ErrorCorrection};%{INT:PagesTransferred};%{INT:TotalPagesExpected}\])? Fax %{WORD:FaxStatus}: There are currently %{INT:ActiveFaxes} active faxes' ]
      match => [
        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Joined conference: %{INT:ConferenceNumber} conference default' ]
      match => [
        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Inbound fax: From:%{GREEDYDATA:From} To: %{INT:To}\(%{POSINT:KeystoneID}\)' ]
      match => [
        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Inbound call: \{accountcode=%{INT:KeystoneID},orientation=%{WORD:Orientation}\}sofia\/%{WORD:Gateway}\/%{INT:DestinationNumber}\@%{IP:DestinationAddress}:%{POSINT:Port}' ]
      match => [
        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Outbound call: \{accountcode=%{INT:KeystoneID},orientation=%{WORD:Orientation}\}sofia\/gateway\/%{WORD:OutboundGateway}\/%{INT:DialedNumber}' ]
    }
    if "_grokparsefailure" not in [tags] {
      mutate { remove_field => "message" }
    }
  }
}
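As for setting the type field: one option is to set it on the shipper side in the logstash-forwarder config rather than with [file] checks in the filter; every event from that file then arrives with the type already set. A sketch (the log path is a placeholder):

```
{
  "files": [
    {
      "paths": [ "/var/log/node/app.log" ],
      "fields": { "type": "node-logs" }
    }
  ]
}
```

With the type set at the source, the single `if [type] == 'node-logs'` conditional above covers everything from that application.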