Logstash logs - growing too big, too fast.

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
polarbear1
Posts: 73
Joined: Mon Apr 13, 2015 4:26 pm

Re: Logstash logs - growing too big, too fast.

Post by polarbear1 »

The logs are coming from a Windows box, using nxlog.

Also - since the screenshot of the globalconfig, I removed the line about parsing DATESTAMP - that was something else I was playing around with that I don't need anymore. So now we're just trying to parse SEVERITY out of the message.

Attached are 2 types of messages, so you have an idea of what they look like. The first is a typical message with a valid severity that has been parsed using that grok filter. The second has a "_grokparsefailure" tag attached to it and no valid severity to be parsed out - as is true with a good chunk of the info messages, my production process outputs them with no severity in the body of the message. Don't know if it's related, but throwing it in there for full disclosure.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Logstash logs - growing too big, too fast.

Post by rkennedy »

polarbear1 wrote: The logs are coming from a Windows box, using nxlog.
In the first screenshot, it looks like the GROK isn't working because it doesn't know what to parse into severity_label. Which value are you looking to populate the field with? By the way - these two log files look very different, so the same GROK might not work for both. Usually the fields are populated from a static part of a log, but I'm not sure that's going to be the case with your logs.
Former Nagios Employee
polarbear1
Posts: 73
Joined: Mon Apr 13, 2015 4:26 pm

Re: Logstash logs - growing too big, too fast.

Post by polarbear1 »

As you can see from the previous screenshot, the Grok config in question is:

Code: Select all

if [SourceModuleName] == 'iso' {
    grok {
        match => ['message', '%{LOGLEVEL1:severity_label}']
    }
}
Of course, it would help to know what "LOGLEVEL1" is as that's a custom pattern. And yes, I have it defined on all servers in this cluster.

Code: Select all

[root@schpnag1 ~]# cat /usr/local/nagioslogserver/logstash/patterns/grok-patterns | grep LOGLEVEL1
LOGLEVEL1 (Trace|Debug|Warning|Info|Critical|Error)
As for the log files, yes, they are different. In the most general way - there is a bunch of applications that process data from different sources. Each source has its own processor application. All these applications (since they more or less do the same thing) dump their log files into a directory that I have NXLOG configured to pick up from - whatever is in there. The log output format across the applications is NOT identical; this is known and we are working to standardize it a little better, but it should be close enough. I try to parse out severity where possible, but as in the example provided earlier, not every message has a severity to parse out.

Here's the relevant bit of the nxlog.conf

Code: Select all

<Input iso>
    Module   im_file
    File     'D:\DataServices\DataPrograms\DataSolutions\Logs\*[a-z].log'
    SavePos  TRUE
    Exec     $Message = $raw_event; $Hostname=hostname(); $Program=file_basename(file_name());
</Input>
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Logstash logs - growing too big, too fast.

Post by rkennedy »

The underlying issue is, when you apply the grok filter to logs where SourceModuleName = iso, any log that doesn't have anything to match is going to fail. When NLS does grok filtering, it needs a common way to differentiate the logs.

Could you post a few different raw outputs of the logs so we can see if this is going to be possible?
Former Nagios Employee
polarbear1
Posts: 73
Joined: Mon Apr 13, 2015 4:26 pm

Re: Logstash logs - growing too big, too fast.

Post by polarbear1 »

rkennedy wrote: when you apply the grok filter to logs where SourceModuleName = iso, any log that doesn't have anything to match is going to fail. When NLS does grok filtering, it needs a common way to differentiate the logs.
Can you clarify that? I was under the impression (and this is my objective - if there is a better way to do this, let me know) that...

In nxlog.conf, specifying "<Input X>" for the section will forward X as the SourceModuleName. Then when I tell grok "if [SourceModuleName] == 'X'", it will pick out messages from logs coming from my <Input X>, ignoring all else (and I am assuming you mean these are the ones going to failure), and then it does whatever the grok pattern tells it to do - in this case, match the message to parse out severity. At that point, if my severity exists (Trace|Debug|Warning|Info|Critical|Error), it is parsed and stuck into the severity_label field; otherwise there is no match and severity_label remains blank.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Logstash logs - growing too big, too fast.

Post by rkennedy »

polarbear1 wrote: then it does whatever the grok pattern tells it to do - in this case, match message to parse out severity. At which point if my severity exists (Trace|Debug|Warning|Info|Critical|Error) then it is parsed and stuck into the severity_label field, otherwise there is nothing and the severity_label remains blank.
Yes. It's matching everything you send in as iso. After it matches, it does exactly what you mention above. Since it can't apply the grok filter properly to a message like 20160808 16:24:36 38 jobs queued for priority: 4, it tags the log with _grokparsefailure. Looking at your fields, you could add another if statement to check what Program is equal to; since I'm guessing each log file will be unique, this is one way to separate them.
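
That could look something like the sketch below. Note the Program value 'queueprocessor.log' is a made-up placeholder (your nxlog Exec sets Program from file_basename(file_name()), so it would be whatever your actual file names are):

Code: Select all

if [SourceModuleName] == 'iso' {
    if [Program] == 'queueprocessor.log' {
        grok {
            match => ['message', '%{LOGLEVEL1:severity_label}']
        }
    }
}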
Former Nagios Employee
polarbear1
Posts: 73
Joined: Mon Apr 13, 2015 4:26 pm

Re: Logstash logs - growing too big, too fast.

Post by polarbear1 »

rkennedy wrote: Looking at your fields, you could add another if statement to check what Program is equal to; since I'm guessing each log file will be unique, this is one way to separate them.
Yes, and sadly no. The reason for wildcarding the folder is that it's a pretty dynamic environment. New log files show up and old log files get phased out frequently enough that keeping up with them by hard-coding specific Program names would be a giant pain. On top of that, any particular program's log file may have lines with a severity, and it may have lines without.

For testing purposes going into this weekend I'll pull out that Grok filter entirely. Let's see if that makes logstash less spammy (and crashy).


Or another option (if this is necessary) - can I modify my filter in some way that tells it to do what it's doing, but if there is no match, do something else? Basically, a graceful exit instead of a failure?
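
For what it's worth, the grok filter does have a tag_on_failure option; setting it to an empty array should suppress the _grokparsefailure tag on lines with no severity. A minimal sketch (not tested against this cluster):

Code: Select all

if [SourceModuleName] == 'iso' {
    grok {
        match          => ['message', '%{LOGLEVEL1:severity_label}']
        # empty array = do not add _grokparsefailure when nothing matches
        tag_on_failure => []
    }
}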
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Logstash logs - growing too big, too fast.

Post by rkennedy »

Yes, you should be able to apply a conditional statement matching with regex. One thing I noticed looking back is that the initial error wasn't about the match for your severity_label, but rather the date (it looks to be caused by the two spaces in the date string):

Code: Select all

:message=>"Failed parsing date from field", :field=>"timestamp", :value=>"Aug  1 03:45:05", :exception=>java.lang.IllegalArgumentException: Invalid format: "Aug  1 03:45:05", :level=>:warn}
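If that date parsing is what's filling up the log, one option is to let the date filter accept the syslog-style space-padded day as well - a sketch, assuming the field really is called timestamp:

Code: Select all

date {
    # "MMM  d" (two spaces) matches space-padded single-digit days,
    # e.g. "Aug  1 03:45:05"; "MMM dd" matches two-digit days
    match => ['timestamp', 'MMM  d HH:mm:ss', 'MMM dd HH:mm:ss']
}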
Could you run tail -n50 /var/log/logstash/logstash.log and post the output? I'd like to see which errors are rotating through, so we know what we need to match.
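
As an example of the regex conditional mentioned above, you could gate the grok on the message actually containing one of your severity words - a sketch reusing the same LOGLEVEL1 values:

Code: Select all

if [SourceModuleName] == 'iso' and [message] =~ /(Trace|Debug|Warning|Info|Critical|Error)/ {
    grok {
        match => ['message', '%{LOGLEVEL1:severity_label}']
    }
}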
Former Nagios Employee