Logstash Lumberjack Input crash: Security Issue
Posted: Mon Feb 01, 2016 9:57 am
by weveland
Dear Support,
It appears that a previous issue we had with logstash stopping lumberjack input processing has recurred. I was able to narrow down the incoming messages that triggered the crash to an exploit attempt against a outside facing mail server.
When the messages were ingested it caused logstash to stop processing input in the lumberjack protocol. Other syslog messages and such were not delayed, just the lumberjack input.
This means an outside person was able to stop our logging system for several hours until we noticed the issue. I am attaching the messages to this email in archive format so you can examine them. This is issue has been upgraded to critical status on our end.
-Wayne
logstash-forwarder.errors.tar.gz
Re: Logstash Lumberjack Input crash: Security Issue
Posted: Mon Feb 01, 2016 10:55 am
by jolson
I was taking a look around the internet, and found these release notes:
https://www.elastic.co/blog/logstash-1-5-2-released
Lumberjack input: This input was not handling backpressure properly from downstream plugins and would not timeout client connections, causing clients to constantly reconnect and eventually cause Logstash to run out of memory. We added a circuit breaker to stop accepting new connections when we detect this situation. Please note that `max_clients` setting introduced in v0.1.9 has been deprecated. This setting temporarily solved the problem by configuring an upper limit to the number of LSF connections (#12).
Related github bug:
https://github.com/elastic/logstash/issues/3277
Regarding your bug, I scoured github and found some similar ones:
https://github.com/logstash-plugins/log ... /issues/15
Notably:
From reading the thread in #1489 the root problem seem lack of congestion control, which we now have with a timeout sizequeue and a circuit breaker. Since logstash 1.5.2
Since we're at Logstash version 1.5.1, I think the above is compelling evidence to move forward with an upgrade:
Code: Select all
# Backup current LS in case of later failure
mkdir ~/logstashbackup
service logstash stop
cd /usr/local/nagioslogserver/logstash/
for i in $(ls | grep -v etc) ; do mv $i ~/logstashbackup ; done
# Upgrade LS to 1.5.2
cd /tmp
wget https://download.elastic.co/logstash/logstash/logstash-1.5.2.tar.gz
tar zxf logstash-1.5.2.tar.gz
cp logstash-1.5.2/* -R -p /usr/local/nagioslogserver/logstash/
chown nagios.nagios -R /usr/local/nagioslogserver/logstash/*
service logstash start
The above worked on my test server with no ill effects - though it's worth keeping in mind that the above is experimental. We can always revert to the previous version of Logstash by following the above instructions and targeting the 1.5.1 target instead of 1.5.2.
I think that giving the above a try is worth a try due to the resources I've pointed out above.
Thanks Wayne, let me know if the above works for you - thanks!
Jesse
Re: Logstash Lumberjack Input crash: Security Issue
Posted: Mon Feb 01, 2016 11:30 am
by weveland
I will try that today, probably right after I get out of my meeting.
Thanks!
Re: Logstash Lumberjack Input crash: Security Issue
Posted: Mon Feb 01, 2016 11:33 am
by jolson
Sounds good, looking forward to your testing.
Re: Logstash Lumberjack Input crash: Security Issue
Posted: Mon Feb 01, 2016 12:02 pm
by weveland
Also as another thought. So you still think this isn't related to the specific messages? Just to congestion?
It could make sense as we did maintenance that disconnected some remote devices from the log server for several hours. However I didn't see a ramp-up after the maintenance which I would expect to have seen leading up to a crash.
Re: Logstash Lumberjack Input crash: Security Issue
Posted: Mon Feb 01, 2016 12:48 pm
by jolson
Also as another thought. So you still think this isn't related to the specific messages? Just to congestion?
Absolutely possible. I am not decided one way or the other, but the vast majority of lumberjack bugs on github are related to congestion and timeouts hanging or similar problems. The circuit breaker in 1.5.2 seems like the end-all solution when it comes to lumberjack congestion.
However I didn't see a ramp-up after the maintenance which I would expect to have seen leading up to a crash.
I would expect to see that too, I'm wondering if all of the logs overwhelm logstash so much that it can't get anything through the output.
Re: Logstash Lumberjack Input crash: Security Issue
Posted: Mon Feb 01, 2016 12:59 pm
by weveland
Upgrade completed. Sorry it took so long, internal protocol dictates I snapshot the VM before upgrading anything. So I had to take it offline for a bit.
After it came back up I see the following.
{:timestamp=>"2016-02-01T12:43:26.996000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2016-02-01T12:43:27.498000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2016-02-01T12:43:28.000000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2016-02-01T12:43:28.510000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2016-02-01T12:43:29.013000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2016-02-01T12:43:29.514000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2016-02-01T12:43:30.016000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2016-02-01T12:43:30.521000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2016-02-01T12:43:31.023000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2016-02-01T12:43:31.524000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
But on the forwarder I see it accepting the messages after showing socket errors from a broken pipe. Looks like it waited 30s then tried again and was able to connect.
I replayed the log entries back into the log to test.
This is what I saw on the server
{:timestamp=>"2016-02-01T12:43:39.376000-0500", :message=>"CircuitBreaker::Close", :name=>"Lumberjack input", :level=>:warn}
{:timestamp=>"2016-02-01T12:57:23.832000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/13 18:37:11 [info] 28786}
{:timestamp=>"2016-02-01T12:57:23.833000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/13 18:37:11 [info] 28786}
{:timestamp=>"2016-02-01T12:57:23.835000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/15 14:52:36 [info] 28785}
{:timestamp=>"2016-02-01T12:57:23.836000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/15 14:52:37 [info] 28785}
{:timestamp=>"2016-02-01T12:57:23.838000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/16 08:54:28 [info] 28784}
{:timestamp=>"2016-02-01T12:57:23.841000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/16 08:54:28 [info] 28784}
{:timestamp=>"2016-02-01T12:57:23.842000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/16 21:01:07 [info] 28784}
{:timestamp=>"2016-02-01T12:57:23.843000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/16 21:01:08 [info] 28784}
{:timestamp=>"2016-02-01T12:57:23.845000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/18 01:01:28 [info] 28786}
{:timestamp=>"2016-02-01T12:57:23.845000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/18 01:01:29 [info] 28786}
{:timestamp=>"2016-02-01T12:57:23.847000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/18 18:27:16 [info] 28787}
{:timestamp=>"2016-02-01T12:57:23.848000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/18 18:27:17 [info] 28784}
{:timestamp=>"2016-02-01T12:57:23.913000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/19 12:44:22 [info] 28785}
{:timestamp=>"2016-02-01T12:57:23.917000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/19 12:44:24 [info] 28785}
{:timestamp=>"2016-02-01T12:57:23.971000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/22 15:48:44 [info] 28787}
{:timestamp=>"2016-02-01T12:57:23.971000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/22 15:48:44 [info] 28787}
{:timestamp=>"2016-02-01T12:57:24.042000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/23 18:25:06 [info] 28785}
{:timestamp=>"2016-02-01T12:57:24.045000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/23 18:25:07 [info] 28785}
{:timestamp=>"2016-02-01T12:57:24.108000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/24 13:44:52 [info] 28784}
{:timestamp=>"2016-02-01T12:57:24.112000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/24 13:44:52 [info] 28784}
{:timestamp=>"2016-02-01T12:57:24.164000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/25 02:56:24 [info] 28785}
{:timestamp=>"2016-02-01T12:57:24.165000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/25 02:56:25 [info] 28785}
{:timestamp=>"2016-02-01T12:57:24.232000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/25 21:34:59 [info] 28784}
{:timestamp=>"2016-02-01T12:57:24.234000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/25 21:34:59 [info] 28784}
{:timestamp=>"2016-02-01T12:57:24.287000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/26 09:02:36 [info] 28786}
{:timestamp=>"2016-02-01T12:57:24.296000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/26 09:02:37 [info] 28786}
{:timestamp=>"2016-02-01T12:57:24.353000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/27 21:07:23 [info] 28786}
{:timestamp=>"2016-02-01T12:57:24.355000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/27 21:07:24 [info] 28786}
{:timestamp=>"2016-02-01T12:57:24.410000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/28 00:20:44 [info] 23950}
{:timestamp=>"2016-02-01T12:57:24.420000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/28 00:20:44 [info] 23950}
{:timestamp=>"2016-02-01T12:57:24.481000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"2016/01/28 17:27:57 [info] 23950}
However log processing continues.
Re: Logstash Lumberjack Input crash: Security Issue
Posted: Mon Feb 01, 2016 1:13 pm
by jolson
However log processing continues.
Good to hear, maybe the circuit breaker stopped the process from spinning out of control.
As for
Received an event that has a different character encoding than you configured, this happens due to the non-UTF8 characters being sent from the remote side. Notably:
The bad characters don't actually show up in my web browser, I have replaced those characters with question marks above - open up logstash-forwarder.errors in atom/sublimetext to see the characters as they are.
Re: Logstash Lumberjack Input crash: Security Issue
Posted: Mon Feb 01, 2016 2:16 pm
by weveland
jolson wrote:
As for
Received an event that has a different character encoding than you configured, this happens due to the non-UTF8 characters being sent from the remote side. Notably:
Do you think there's some issue logstash has processing these url formatted control characters? While this may be a fix to keep things running. Is there a deeper problem here in how logstsash handles them?
Would it be best to try and trim non-printable characters from input? before they get pipelined in logstash?
--
Wayne
Re: Logstash Lumberjack Input crash: Security Issue
Posted: Tue Feb 02, 2016 1:34 pm
by jolson
It's very hard to determine exactly what's going on here - I think a Logstash bug report is likely in order. Would you like me to file one on your behalf? What information is fine/not fine to include in my report of the files you sent me? I imagine that all files you've posted publicly are good to go?