
_grokparsefailure help

Posted: Fri Sep 11, 2015 9:09 am
by weveland
Guys/Gals,

I'm creating a nested ruleset for matching node.js log entries for an application we have. It appears that I'm capturing all the relevant data, yet still getting my entries tagged with _grokparsefailure.
Is there any more detailed troubleshooting that I can do to find the source of my problem? I tested my individual rules at http://grokdebug.herokuapp.com to make sure they're working.

I'll attach the filter and a sample of log entries if anyone wants to take a look or offer some advice.
filterandlogs.tar.gz

Re: _grokparsefailure help

Posted: Fri Sep 11, 2015 10:41 am
by jolson
I will test this on my end. In the meantime, what kind of input is handling your logs? A _grokparsefailure can occur if these logs are sent through a 'syslog' input in particular, because that input runs its own grok against every line it receives.

http://kartar.net/2014/09/when-logstash ... -go-wrong/

Re: _grokparsefailure help

Posted: Fri Sep 11, 2015 11:12 am
by weveland
They're coming across from logstash-forwarder in lumberjack format. (So I can SSL encrypt)

Re: _grokparsefailure help

Posted: Fri Sep 11, 2015 2:31 pm
by tmcdonald
What's the actual input you are using, though? Is it syslog, lumberjack, or something else?

Re: _grokparsefailure help

Posted: Fri Sep 11, 2015 2:38 pm
by weveland
So the node.js application writes errors to a log in the format the developer specified. I am using logstash-forwarder to package the logs up, encrypt them, and ship them to the server in lumberjack format. So the incoming format is lumberjack, and the input is configured as lumberjack. I have this working for other Apache and Nginx logs, just not for the node.js ones.

Re: _grokparsefailure help

Posted: Mon Sep 14, 2015 10:09 am
by jolson
I cut your filter down dramatically (removed many mutates and if [file] checks):

Code:

if [type] == 'node-logs' {
        if( "BOUNCED" not in [tags]){
                grok {
                    match => [ 
                            'message', '(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{YEAR} %{TIME} GMT%{NUMBER:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] \[%{UUID}\] Error response received from FreeSWITCH - %{GREEDYDATA:ErrorMessage}' ]
                    match => [ 
                                        'message', '(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] \[%{UUID}\]( \[%{INT:ResultCode};%{GREEDYDATA:ResultText};%{INT:NumberOfBadRows};%{INT:BadRowRun};%{WORD:ErrorCorrection};%{INT:PagesTransferred};%{INT:TotalPagesExpected}\])? Fax %{WORD:FaxStatus}: There are currently %{INT:ActiveFaxes} active faxes' ]
                    match => [ 
                                        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Joined conference: %{INT:ConferenceNumber} conference default' ]
                    match => [ 
                                        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Inbound fax: From:%{GREEDYDATA:From} To: %{INT:To}\(%{POSINT:KeystoneID}\)' ]
                    match => [ 
                                        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Inbound call: \{accountcode=%{INT:KeystoneID},orientation=%{WORD:Orientation}\}sofia\/%{WORD:Gateway}\/%{INT:DestinationNumber}\@%{IP:DestinationAddress}:%{POSINT:Port}' ]
                    match => [ 
                            'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Outbound call: \{accountcode=%{INT:KeystoneID},orientation=%{WORD:Orientation}\}sofia\/gateway\/%{WORD:OutboundGateway}\/%{INT:DialedNumber}' ]
                }
        }
}
This filter matched every log line I sent through it, except for this one:

Code:

Wed Sep 02 2015 08:43:39 GMT-0400 (EDT) [pksendfaxd] Received a fax request from [email protected]
I do not see a grok pattern written for the above log line, which explains why it isn't matched.
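
For reference, a pattern along these lines might cover that entry. This is a sketch only: the field name Requester is an assumption, and GREEDYDATA is used because the sender address is not visible in the sample above. The timestamp portion is copied from the existing patterns.

```
grok {
    # Hypothetical extra pattern for the "Received a fax request" entry.
    # "Requester" is an illustrative field name, not one from the original filter.
    match => [
        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Received a fax request from %{GREEDYDATA:Requester}' ]
}
```

It would be worth running this against a real entry in the grok debugger before adding it to the filter.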

By default, grok uses 'break_on_match => true', meaning that once a log line matches one of your patterns, grok stops and skips the remaining patterns. If a pattern does not match, grok tries the next one, and so on; the _grokparsefailure tag is added only when none of the patterns match.
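
As a minimal sketch of that ordering, assuming your Logstash version accepts the hash form of match, the patterns can be listed as an array under one field. Only two of the patterns from the filter above are shown, and the two options at the end are written out only to make the defaults visible.

```
grok {
    # Patterns are tried top to bottom against the 'message' field.
    match => {
        'message' => [
            '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Joined conference: %{INT:ConferenceNumber} conference default',
            '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Inbound fax: From:%{GREEDYDATA:From} To: %{INT:To}\(%{POSINT:KeystoneID}\)'
        ]
    }
    # Stop at the first pattern that matches (this is the default).
    break_on_match => true
    # Tag applied only when no pattern matches (this is the default value).
    tag_on_failure => [ '_grokparsefailure' ]
}
```

Since evaluation stops at the first hit, putting the most frequent log formats first should reduce the work done per event.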

Try using the above and let me know if your results are improved - I do not understand why there are if [file] == checks, or why mutate is being used so extensively - I imagine that I'm missing something. Let me know.

Thanks!

Re: _grokparsefailure help

Posted: Mon Sep 14, 2015 4:23 pm
by weveland
I think a lot of those things are because I'm new to writing these filters. As such I use the tools I'm familiar with and the format I'm familiar with. I've yet to come across a good guide for writing the filters that makes sense to me. So it's a learn as I go sort of thing.

I did previously try using multiple match statements within grok; however, they failed at the first match and never went beyond it. That's why I checked for the file names. Logstash-forwarder sends the file name and any other specified fields along with the data, so I used that to parse the logs.

I will give yours a shot and let you know how it works out. Thanks!

Re: _grokparsefailure help

Posted: Mon Sep 14, 2015 4:29 pm
by jolson
No problem - let me know how it works for you. Thanks!

Re: _grokparsefailure help

Posted: Mon Sep 14, 2015 4:36 pm
by weveland
Also, I was using the file filters and matches to set the type field. Is there any way to do that while including your changes?

Last thing I forgot: the mutate is there to drop the @message field so I'm not storing double data. I've got all the bits I need parsed into fields, so why do I need to store the data again in the database as the redundant message field? I only want to keep it in case of parse failures. I planned on creating an alert for a message that doesn't match my parsing, i.e. the unknown and unexpected error.

Re: _grokparsefailure help

Posted: Tue Sep 15, 2015 9:44 am
by jolson
Understood - dropping the message field costs CPU, while keeping it costs disk. If you'd like it removed after a successful match, add a mutate with remove_field after the grok block, guarded so that it only runs when the event is not tagged _grokparsefailure:

Code:

    if [type] == 'node-logs' {
            if( "BOUNCED" not in [tags]){
                    grok {
                        match => [
                                'message', '(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{YEAR} %{TIME} GMT%{NUMBER:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] \[%{UUID}\] Error response received from FreeSWITCH - %{GREEDYDATA:ErrorMessage}' ]
                        match => [
                                            'message', '(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] \[%{UUID}\]( \[%{INT:ResultCode};%{GREEDYDATA:ResultText};%{INT:NumberOfBadRows};%{INT:BadRowRun};%{WORD:ErrorCorrection};%{INT:PagesTransferred};%{INT:TotalPagesExpected}\])? Fax %{WORD:FaxStatus}: There are currently %{INT:ActiveFaxes} active faxes' ]
                        match => [
                                            'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Joined conference: %{INT:ConferenceNumber} conference default' ]
                        match => [
                                            'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Inbound fax: From:%{GREEDYDATA:From} To: %{INT:To}\(%{POSINT:KeystoneID}\)' ]
                        match => [
                                            'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Inbound call: \{accountcode=%{INT:KeystoneID},orientation=%{WORD:Orientation}\}sofia\/%{WORD:Gateway}\/%{INT:DestinationNumber}\@%{IP:DestinationAddress}:%{POSINT:Port}' ]
                        match => [
                                'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Outbound call: \{accountcode=%{INT:KeystoneID},orientation=%{WORD:Orientation}\}sofia\/gateway\/%{WORD:OutboundGateway}\/%{INT:DialedNumber}' ]
                    }
                    if( "_grokparsefailure" not in [tags]) {
                            mutate { remove_field => "message" }
                    }
            }
    }
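
An alternative sketch, assuming the remove_field option that grok shares with other filters: it is applied only when the filter succeeds, so the separate conditional and mutate could be folded into the grok block itself. Only one pattern from the filter above is shown here for brevity.

```
grok {
    match => [
        'message', '(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} GMT%{INT:GmtOffset} \(%{TZ}\)) \[%{PROG:program}\] Joined conference: %{INT:ConferenceNumber} conference default' ]
    # Applied only on a successful match, so events tagged
    # _grokparsefailure keep their original message for alerting.
    remove_field => [ 'message' ]
}
```

Either way, unmatched entries keep the full message, which fits the plan of alerting on unknown errors.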