Re: Logstash stopping again
Posted: Tue May 26, 2015 2:48 pm
by BanditBBS
eloyd wrote: Our box stops working from time to time, too. I gave up on figuring out why and had Nagios monitor the process, with an event handler that restarts it when it dies. No more dead logstash.
Speaking of which, that idea is part of my talk for the 2015 Nagios World Conference....
Yeah Eric, that's my plan too, but I would really like to know the cause.
I'm not speaking this year. I'm still coming; I just didn't volunteer to speak.
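The monitor-and-restart workaround eloyd describes could be sketched roughly as below. This is only an illustration, not anyone's actual handler from this thread: the script name, the function names, and the `service logstash restart` command are assumptions, and the script expects Nagios to pass the `$SERVICESTATE$` and `$SERVICESTATETYPE$` macros as its two arguments.

```python
import subprocess

# Assumption: the init script is called "logstash"; adjust to match the install.
RESTART_COMMAND = ["service", "logstash", "restart"]


def should_restart(state, state_type):
    """Restart only once Nagios has confirmed the failure (a HARD CRITICAL)."""
    return state == "CRITICAL" and state_type == "HARD"


def handle_event(state, state_type, run=subprocess.call):
    """Run the restart command if warranted.

    Returns True when a restart was attempted, False otherwise.
    `run` is injectable so the logic can be exercised without touching services.
    """
    if not should_restart(state, state_type):
        return False
    run(RESTART_COMMAND)
    return True


def main(argv):
    # Expected arguments: $SERVICESTATE$ $SERVICESTATETYPE$
    if len(argv) != 3:
        print("usage: restart_logstash.py STATE STATETYPE")
        return 2
    handle_event(argv[1], argv[2])
    return 0

# When saved as a standalone script (hypothetical name restart_logstash.py),
# finish the file with:  raise SystemExit(main(sys.argv))  after importing sys.
```

Acting only on HARD states avoids restarting logstash on a single failed check, at the cost of a slightly longer outage before the restart fires.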
Re: Logstash stopping again
Posted: Tue May 26, 2015 2:51 pm
by eloyd
See you in September!
Re: Logstash stopping again
Posted: Tue May 26, 2015 4:55 pm
by abrist
We would definitely like to hunt down the cause of this issue. Do you have a minimal easy path to reproduce it that we can test in house?
Re: Logstash stopping again
Posted: Tue May 26, 2015 6:10 pm
by BanditBBS
Andy,
Benhank, Eric, and I seem to have never found the exact cause. I wiped my two nodes (which I think had actually stopped doing it, but I wanted fresh installs after testing), and now that I have fresh installs I am experiencing it again. The setup: a clean RHEL 6.5 install is handed over to me, I run updates and it becomes 6.6, then I download NLS and install it. The only change is to run as root and enable port 514 (or was that 554? whatever the default syslog port is). Other than that, it's plain vanilla. It is interesting to note that both nodes were showing it as down; I can't recall whether both stopped in the first install or not. I'll be paying attention: I have everything monitored now and no event handler installed yet, so I will know if it goes down again.
Re: Logstash stopping again
Posted: Wed May 27, 2015 9:17 am
by jolson
A question for those of you who have experienced this problem - have you all done manual installs, and have you all enabled privileged ports?
I'm not as concerned with the manual install, but it's possible that enabling privileged ports in Logstash has something to do with it.
Re: Logstash stopping again
Posted: Wed May 27, 2015 2:06 pm
by eloyd
All my installs were manual (new box, download, ./install). No privileged ports for me.
Re: Logstash stopping again
Posted: Wed May 27, 2015 3:50 pm
by jolson
I have set up a few testing environments to monitor for this sort of erratic behavior. Bandit, are you sending any non-standard or otherwise customized logs to NLS? Do you have any custom logstash inputs/filters/outputs defined? If you could get me a copy of your logstash config, that would be helpful.
Code:
cat /usr/local/nagioslogserver/logstash/etc/conf.d/*
I'll test whether logstash randomly crashes when installed manually versus from the image, and when given root permissions.
Re: Logstash stopping again
Posted: Wed May 27, 2015 4:23 pm
by BanditBBS
My boxes are very, very fresh... the only changes are what I've already stated. Yes, I am sending syslog on 514 to it, and that's about all it is receiving at the moment. Here is the config you asked for:
Code:
cat /usr/local/nagioslogserver/logstash/etc/conf.d/*
#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Thu, 14 May 2015 10:44:49 -0500
#
#
# Global inputs
#
input {
    syslog {
        type => 'syslog'
        port => 5544
    }
    tcp {
        type => 'eventlog'
        port => 3515
        codec => json {
            charset => 'CP1252'
        }
    }
    tcp {
        type => 'import_raw'
        tags => 'import_raw'
        port => 2056
    }
    tcp {
        type => 'import_json'
        tags => 'import_json'
        port => 2057
        codec => json
    }
    syslog {
        type => 'syslog'
        port => 514
    }
}
#
# Local inputs
#
#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Thu, 14 May 2015 10:44:49 -0500
#
#
# Global filters
#
filter {
    if [program] == 'apache_access' {
        grok {
            match => [ 'message', '%{COMBINEDAPACHELOG}']
        }
        date {
            match => [ 'timestamp', 'dd/MMM/yyyy:HH:mm:ss Z' ]
        }
        mutate {
            replace => [ 'type', 'apache_access' ]
            convert => [ 'bytes', 'integer' ]
            convert => [ 'response', 'integer' ]
        }
    }
    if [program] == 'apache_error' {
        grok {
            match => [ 'message', '\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:class}\] \[%{WORD:originator} %{IP:clientip}\] %{GREEDYDATA:errmsg}']
        }
        mutate {
            replace => [ 'type', 'apache_error' ]
        }
    }
}
#
# Local filters
#
#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Thu, 14 May 2015 10:44:49 -0500
#
#
# Required output for Nagios Log Server
#
output {
    elasticsearch {
        cluster => '39fd1a19-2038-485c-8782-180c8be9a886'
        host => 'localhost'
        index_type => '%{type}'
        node_name => 'cb57677a-dfa9-4822-91d1-edcbb768e56e'
        protocol => 'transport'
        workers => 4
    }
}
#
# Global outputs
#
#
# Local outputs
#
Re: Logstash stopping again
Posted: Thu May 28, 2015 7:15 am
by GhostRider2110
Logstash died on me last night; it had been a while since it last happened. I'm running off the VM image from Nagios. On a side note, I'm waiting to hear back about approval to go to the conference; if so, I look forward to meeting some of you folks.
Back to our regularly scheduled programming: I also plan on setting up an event handler to restart it if it stops, but I would also like to know the root cause. My logstash.log is mainly filled with this:
Code:
{:timestamp=>"2015-05-28T02:18:26.636000-0400", :message=>"Failed parsing date from field", :field=>"timestamp", :value=>"May 28 02:16:14", :exception=>java.lang.IllegalArgumentException: Invalid format: "May 28 02:16:14", :level=>:warn}
{:timestamp=>"2015-05-28T02:18:26.637000-0400", :message=>"Failed parsing date from field", :field=>"timestamp", :value=>"May 28 02:16:14", :exception=>java.lang.IllegalArgumentException: Invalid format: "May 28 02:16:14", :level=>:warn}
{:timestamp=>"2015-05-28T02:18:26.639000-0400", :message=>"Failed parsing date from field", :field=>"timestamp", :value=>"May 28 02:16:14", :exception=>java.lang.IllegalArgumentException: Invalid format: "May 28 02:16:14", :level=>:warn}
{:timestamp=>"2015-05-28T02:18:26.640000-0400", :message=>"Failed parsing date from field", :field=>"timestamp", :value=>"May 28 02:16:14", :exception=>java.lang.IllegalArgumentException: Invalid format: "May 28 02:16:14", :level=>:warn}
{:timestamp=>"2015-05-28T02:18:26.640000-0400", :message=>"Failed parsing date from field", :field=>"timestamp", :value=>"May 28 02:16:14", :exception=>java.lang.IllegalArgumentException: Invalid format: "May 28 02:16:14", :level=>:warn}
With one different entry over the last 24 hours:
Code:
{:timestamp=>"2015-05-28T07:50:51.228000-0400", :message=>"Using milestone 1 input plugin 'syslog'. This plugin should work, but would benefit from use by folks like you. Please let us know if you find bugs or have suggestions on how to improve this plugin. For more information on plugin milestones, see http://logstash.net/docs/1.4.2/plugin-milestones", :level=>:warn}
{:timestamp=>"2015-05-28T07:50:51.274000-0400", :message=>"Using milestone 2 input plugin 'tcp'. This plugin should be stable, but if you see strange behavior, please let us know! For more information on plugin milestones, see http://logstash.net/docs/1.4.2/plugin-milestones", :level=>:warn}
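The value in the repeated "Failed parsing date from field" warnings, "May 28 02:16:14", is in the traditional syslog layout (month, day, time, no year), which is a different shape from the Apache-style pattern 'dd/MMM/yyyy:HH:mm:ss Z' in the config posted earlier. Whether that particular date filter is the one producing the warnings is not established in this thread, and the warnings may be unrelated to the crashes. The sketch below only illustrates the mismatch using Python's `strptime` as a stand-in for Logstash's Joda-Time parsing; the format strings are approximate translations and the function names are made up.

```python
from datetime import datetime

# Syslog stamps carry no year, so one is supplied by the caller and prepended.
SYSLOG_FORMAT = "%Y %b %d %H:%M:%S"      # e.g. "2015 May 28 02:16:14"
# Rough strptime analogue of the Joda pattern 'dd/MMM/yyyy:HH:mm:ss Z'.
APACHE_FORMAT = "%d/%b/%Y:%H:%M:%S %z"   # e.g. "28/May/2015:02:16:14 -0400"


def parse_syslog_stamp(stamp, year):
    """Parse a syslog-style stamp such as "May 28 02:16:14" for the given year."""
    return datetime.strptime(f"{year} {stamp}", SYSLOG_FORMAT)


def matches_apache_format(stamp):
    """Return True if the stamp parses with the Apache-style format."""
    try:
        datetime.strptime(stamp, APACHE_FORMAT)
        return True
    except ValueError:
        return False
```

Here `matches_apache_format("May 28 02:16:14")` is False, while `parse_syslog_stamp("May 28 02:16:14", 2015)` succeeds. Month-name parsing with `%b` assumes an English locale.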
Not trying to hijack the thread....
See-ya
Mitch
Re: Logstash stopping again
Posted: Thu May 28, 2015 7:32 am
by eloyd
Mitch,
Looking forward to seeing you at the conference if you can make it! You can pester Nagios developers in real time in real life, so tell your financial decision maker that it's well worth it!
