Page 1 of 1

_grokparsefailure on custom httpd log

Posted: Thu Nov 20, 2014 10:33 am
by globalgiving
My log lines look like this:

Code: Select all

[190.41.191.6, 10.1.0.6] - - [20/Nov/2014:10:17:25 -0500] "GET /dy/v2/widget/projects.html?projectIds=7887&ggtid=4F81C017B6215C3B9C5080352E974EFC&callback=cfisjzvnqcqerqtu.p[0] HTTP/1.1" 200 373 0 "http://gruporedes.org/trabajo-infantil-domestico/" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
The config setting in apache looks like this:

Code: Select all

LogFormat "[%{X-Forwarded-For}i] %l %u %t \"%r\" %>s %b %T \"%{Referer}i\" \"%{User-Agent}i\"" proxy
And this is the filter in log server:

Code: Select all

if [program] == 'apache_access' {
    grok {
        match => [ 'message', '\[%{IPORHOST:clientip}(?:, %{NOTSPACE:clientiptoo}){0,1}\] %{USER:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NUMBER:servetime} %{QS:referrer} %{QS:agent}']
    }
    date {
        match => [ 'timestamp', 'dd/MMM/yyyy:HH:mm:ss Z' ]
    }
    mutate {
        replace => [ 'type', 'apache_access' ]
         convert => [ 'bytes', 'integer' ]
         convert => [ 'response', 'integer' ]
    }
}
And this is the "raw" output i see for one of the log lines in the dashboard:

Code: Select all

{
  "_index": "logstash-2014.11.20",
  "_type": "apache_access",
  "_id": "CwDvEJbgTQCOJ_Kv6vNovw",
  "_score": null,
  "_source": {
    "message": "[24.91.195.10, 10.1.0.6] - - [20/Nov/2014:10:28:26 -0500] \"GET /dy/v2/pe/manage/donations.html?project.projId=16721 HTTP/1.1\" 200 15952 0 \"https://www.globalgiving.org/dy/v2/pe/dashboard/overview.html\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.1.25 (KHTML, like Gecko) Version/8.0 Safari/600.1.25\"",
    "@version": "1",
    "@timestamp": "2014-11-20T15:28:31.000Z",
    "type": "apache_access",
    "host": "10.1.0.100",
    "priority": 133,
    "timestamp": "Nov 20 10:28:31",
    "logsource": "tibet",
    "program": "apache_access",
    "severity": 5,
    "facility": 16,
    "facility_label": "local0",
    "severity_label": "Notice",
    "tags": [
      "_grokparsefailure"
    ]
  },
  "sort": [
    1416497311000,
    1416497311000
  ]
}
I have tried out my grok pattern and the above example log line at https://grokdebug.herokuapp.com/ and it shows that it should work. So I am very confused about what is wrong, or where I should look from here to figure it out.

I made the edit to apply my custom grok pattern under "Global Configuration", and edited the existing "Apache (default)" filter. Then clicked "Save & Apply", then clicked "Apply" on the next screen. I have also run the "Verify" and it shows everything is ok with the config. Any suggestions on what I should do?

Re: _grokparsefailure on custom httpd log

Posted: Thu Nov 20, 2014 12:37 pm
by lgroschen
match => [ 'message', '\[%{IPORHOST:clientip}(?:, %{NOTSPACE:clientiptoo}){0,1}\] %{USER:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NUMBER:servetime} %{QS:referrer} %{QS:agent}']
}
The thing that jumps out at me is that you have the match properly inside single quotes, but you aren't escaping your double quotes. Try that and see if you get the parse failure still. If that doesn't fix it then you can try this cool trick by creating a log file for any grokparsefailures:

Code: Select all

output {
  if [type] == "syslog" and "_grokparsefailure" in [tags] {
    file { path => "/var/log/failed_syslog_events-%{+YYYY-MM-dd}" }
  }

Re: _grokparsefailure on custom httpd log

Posted: Thu Nov 20, 2014 1:36 pm
by globalgiving
Tried escaping the double quotes, no joy.

Added the output you mentioned (had to modify it to look for apache_access type instead of syslog). Here is one of the lines from that:

Code: Select all

{"message":"[54.251.34.67, 10.1.0.6] - - [20/Nov/2014:13:26:11 -0500] \"GET /dy/v2/buildstatus HTTP/1.0\" 200 300 0 \"-\" \"NewRelicPinger/1.0 (218470)\"","@version":"1","@timestamp":"2014-11-20T18:26:20.000Z","type":"apache_access","host":"10.1.0.100","priority":133,"timestamp":"Nov 20 13:26:20","logsource":"tibet","program":"apache_access","severity":5,"facility":16,"facility_label":"local0","severity_label":"Notice","tags":["_grokparsefailure"]}
I should mention that the logstash.log shows it is getting an error with the timestamp. But I had dismissed that as being due to it not parsing the message correctly. If it was parsing the message, then it would be pulling the right timestamp format from there, and then it would not be having a trouble parsing the timestamp. Here is the error I see in logstash.log (does not correlate to the log line above, i just grabbed a random one):

Code: Select all

{:timestamp=>"2014-11-20T13:25:30.566000-0500", :message=>"Failed parsing date from field", :field=>"timestamp", :value=>"Nov 20 13:25:30", :exception=>java.lang.IllegalArgumentException: Invalid format: "Nov 20 13:25:30", :level=>:warn}
I have attached my logstash configs, just in case you need to view them.

Re: _grokparsefailure on custom httpd log

Posted: Thu Nov 20, 2014 5:28 pm
by globalgiving
Figured out my problem.

It turns out that the X-Forwarded-For header may be putting a tab or some other white space between the IPs.. I assumed it was a space. Changed my pattern to:

Code: Select all

\[%{IPORHOST:clientip}(?:\,\s%{NOTSPACE:clientiptoo}){0,1}\] %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NUMBER:servetime} %{QS:referrer} %{QS:agent}
And it works just fine now. Thanks for the tip on creating the log output. It let me see that when there was only 1 IP in the header, that it was parsing correctly and only failing when more than one.

Re: _grokparsefailure on custom httpd log

Posted: Thu Nov 20, 2014 5:45 pm
by lgroschen
Yes I did see the hyphens sitting there all lonesome. Glad you have it working!


Locked.