_grokparsefailure on custom httpd log

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Locked
User avatar
globalgiving
Posts: 25
Joined: Thu Aug 28, 2014 9:57 am
Location: Washington, DC
Contact:

_grokparsefailure on custom httpd log

Post by globalgiving »

My log lines look like this:

Code: Select all

[190.41.191.6, 10.1.0.6] - - [20/Nov/2014:10:17:25 -0500] "GET /dy/v2/widget/projects.html?projectIds=7887&ggtid=4F81C017B6215C3B9C5080352E974EFC&callback=cfisjzvnqcqerqtu.p[0] HTTP/1.1" 200 373 0 "http://gruporedes.org/trabajo-infantil-domestico/" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
The config setting in apache looks like this:

Code: Select all

LogFormat "[%{X-Forwarded-For}i] %l %u %t \"%r\" %>s %b %T \"%{Referer}i\" \"%{User-Agent}i\"" proxy
And this is the filter in log server:

Code: Select all

if [program] == 'apache_access' {
    grok {
        match => [ 'message', '\[%{IPORHOST:clientip}(?:, %{NOTSPACE:clientiptoo}){0,1}\] %{USER:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NUMBER:servetime} %{QS:referrer} %{QS:agent}']
    }
    date {
        match => [ 'timestamp', 'dd/MMM/yyyy:HH:mm:ss Z' ]
    }
    mutate {
        replace => [ 'type', 'apache_access' ]
         convert => [ 'bytes', 'integer' ]
         convert => [ 'response', 'integer' ]
    }
}
And this is the "raw" output i see for one of the log lines in the dashboard:

Code: Select all

{
  "_index": "logstash-2014.11.20",
  "_type": "apache_access",
  "_id": "CwDvEJbgTQCOJ_Kv6vNovw",
  "_score": null,
  "_source": {
    "message": "[24.91.195.10, 10.1.0.6] - - [20/Nov/2014:10:28:26 -0500] \"GET /dy/v2/pe/manage/donations.html?project.projId=16721 HTTP/1.1\" 200 15952 0 \"https://www.globalgiving.org/dy/v2/pe/dashboard/overview.html\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.1.25 (KHTML, like Gecko) Version/8.0 Safari/600.1.25\"",
    "@version": "1",
    "@timestamp": "2014-11-20T15:28:31.000Z",
    "type": "apache_access",
    "host": "10.1.0.100",
    "priority": 133,
    "timestamp": "Nov 20 10:28:31",
    "logsource": "tibet",
    "program": "apache_access",
    "severity": 5,
    "facility": 16,
    "facility_label": "local0",
    "severity_label": "Notice",
    "tags": [
      "_grokparsefailure"
    ]
  },
  "sort": [
    1416497311000,
    1416497311000
  ]
}
I have tried out my grok pattern and the above example log line at https://grokdebug.herokuapp.com/ and it shows that it should work. So I am very confused about what is wrong, or where I should look from here to figure it out.

I made the edit to apply my custom grok pattern under "Global Configuration", and edited the existing "Apache (default)" filter. Then clicked "Save & Apply", then clicked "Apply" on the next screen. I have also run the "Verify" and it shows everything is ok with the config. Any suggestions on what I should do?
Justin Rupp
Senior Systems Ninja
GlobalGiving Foundation
User avatar
lgroschen
Posts: 384
Joined: Wed Nov 27, 2013 1:17 pm

Re: _grokparsefailure on custom httpd log

Post by lgroschen »

match => [ 'message', '\[%{IPORHOST:clientip}(?:, %{NOTSPACE:clientiptoo}){0,1}\] %{USER:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NUMBER:servetime} %{QS:referrer} %{QS:agent}']
}
The thing that jumps out at me is that you have the match properly inside single quotes, but you aren't escaping your double quotes. Try that and see if you get the parse failure still. If that doesn't fix it then you can try this cool trick by creating a log file for any grokparsefailures:

Code: Select all

output {
  if [type] == "syslog" and "_grokparsefailure" in [tags] {
    file { path => "/var/log/failed_syslog_events-%{+YYYY-MM-dd}" }
  }
/Luke
User avatar
globalgiving
Posts: 25
Joined: Thu Aug 28, 2014 9:57 am
Location: Washington, DC
Contact:

Re: _grokparsefailure on custom httpd log

Post by globalgiving »

Tried escaping the double quotes, no joy.

Added the output you mentioned (had to modify it to look for apache_access type instead of syslog). Here is one of the lines from that:

Code: Select all

{"message":"[54.251.34.67, 10.1.0.6] - - [20/Nov/2014:13:26:11 -0500] \"GET /dy/v2/buildstatus HTTP/1.0\" 200 300 0 \"-\" \"NewRelicPinger/1.0 (218470)\"","@version":"1","@timestamp":"2014-11-20T18:26:20.000Z","type":"apache_access","host":"10.1.0.100","priority":133,"timestamp":"Nov 20 13:26:20","logsource":"tibet","program":"apache_access","severity":5,"facility":16,"facility_label":"local0","severity_label":"Notice","tags":["_grokparsefailure"]}
I should mention that the logstash.log shows it is getting an error with the timestamp. But I had dismissed that as being due to it not parsing the message correctly. If it was parsing the message, then it would be pulling the right timestamp format from there, and then it would not be having a trouble parsing the timestamp. Here is the error I see in logstash.log (does not correlate to the log line above, i just grabbed a random one):

Code: Select all

{:timestamp=>"2014-11-20T13:25:30.566000-0500", :message=>"Failed parsing date from field", :field=>"timestamp", :value=>"Nov 20 13:25:30", :exception=>java.lang.IllegalArgumentException: Invalid format: "Nov 20 13:25:30", :level=>:warn}
I have attached my logstash configs, just in case you need to view them.
You do not have the required permissions to view the files attached to this post.
Justin Rupp
Senior Systems Ninja
GlobalGiving Foundation
User avatar
globalgiving
Posts: 25
Joined: Thu Aug 28, 2014 9:57 am
Location: Washington, DC
Contact:

Re: _grokparsefailure on custom httpd log

Post by globalgiving »

Figured out my problem.

It turns out that the X-Forwarded-For header may be putting a tab or some other white space between the IPs.. I assumed it was a space. Changed my pattern to:

Code: Select all

\[%{IPORHOST:clientip}(?:\,\s%{NOTSPACE:clientiptoo}){0,1}\] %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NUMBER:servetime} %{QS:referrer} %{QS:agent}
And it works just fine now. Thanks for the tip on creating the log output. It let me see that when there was only 1 IP in the header, that it was parsing correctly and only failing when more than one.
Justin Rupp
Senior Systems Ninja
GlobalGiving Foundation
User avatar
lgroschen
Posts: 384
Joined: Wed Nov 27, 2013 1:17 pm

Re: _grokparsefailure on custom httpd log

Post by lgroschen »

Yes I did see the hyphens sitting there all lonesome. Glad you have it working!


Locked.
/Luke
Locked