
Log Crashes every few hours

Posted: Fri Jan 26, 2018 10:10 am
by bosecorp
Every few hours my server crashes. This is what I see on the web page:


Nagios Log Server - Waiting For Database Startup

I also see the following error in the log:

==> /var/log/logstash/logstash.log <==
{:timestamp=>"2018-01-26T10:00:20.762000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:21.012000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:21.734000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:21.734000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:22.765000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:23.015000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:23.711000-0500", :message=>"SIGTERM received. Shutting down the agent.", :level=>:warn}
{:timestamp=>"2018-01-26T10:00:23.714000-0500", :message=>"stopping pipeline", :id=>"main"}
{:timestamp=>"2018-01-26T10:00:23.871000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:23.878000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}


To fix the issue I have to restart the services, but sometimes I need to restart the entire server.

Re: Log Crashes every few hours

Posted: Fri Jan 26, 2018 10:18 am
by mcapra
It'd be useful to see a full copy of the most recent Elasticsearch logs. I suspect something bad is happening there.

Re: Log Crashes every few hours

Posted: Fri Jan 26, 2018 10:36 am
by bosecorp
Here you go:

[root@usvanagiosplog1 ~]# tail -n 50 /var/log/elasticsearch/*.log
==> /var/log/elasticsearch/26b23fda-7217-4701-86a6-e7ecb1f1c5c7_index_indexing_slowlog.log <==

==> /var/log/elasticsearch/26b23fda-7217-4701-86a6-e7ecb1f1c5c7_index_search_slowlog.log <==

==> /var/log/elasticsearch/26b23fda-7217-4701-86a6-e7ecb1f1c5c7.log <==
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:497)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:465)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:418)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:148)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.index.mapper.MapperParsingException: failed to parse date field [12/23/2017 1:25:52 PM], tried both date format [dateOptionalTime], and timestamp number with locale []
at org.elasticsearch.index.mapper.core.DateFieldMapper.parseStringValue(DateFieldMapper.java:617)
at org.elasticsearch.index.mapper.core.DateFieldMapper.innerParseCreateField(DateFieldMapper.java:535)
at org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:239)
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:401)
... 13 more
Caused by: java.lang.IllegalArgumentException: Invalid format: "12/23/2017 1:25:52 PM" is malformed at "/23/2017 1:25:52 PM"
at org.elasticsearch.common.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187)
at org.elasticsearch.common.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:780)
at org.elasticsearch.index.mapper.core.DateFieldMapper.parseStringValue(DateFieldMapper.java:612)
... 16 more
[2018-01-26 10:26:56,273][DEBUG][action.bulk ] [adaef780-6e05-44fc-ad48-19087eccb89e] [logstash-2018.01.26][2] failed to execute bulk item (index) index {[logstash-2018.01.26][eventlog][AWEzFC7nX6sYOunaWmmZ], source[{"EventTime":"2018-01-23 13:25:53","Hostname":"MXTJMFGSOP01.bose.com","Keywords":-9223372036854775808,"EventType":"INFO","SeverityValue":2,"Severity":"INFO","EventID":1013,"SourceName":"Microsoft-Windows-Windows Defender","ProviderGuid":"{11CD958A-C507-4EF3-B3F2-5FD9DFBD2C78}","Version":0,"Task":0,"OpcodeValue":0,"RecordNumber":119,"ProcessID":5156,"ThreadID":4688,"Channel":"Microsoft-Windows-Windows Defender/Operational","Domain":"NT AUTHORITY","AccountName":"SYSTEM","UserID":"SYSTEM","AccountType":"User","Opcode":"Info","Product Name":"%%827","Product Version":"6.1.7601.18170","Timestamp":"12/24/2017 1:25:52 PM","User":"SYSTEM","SID":"S-1-5-18","EventReceivedTime":"2018-01-26 07:26:54","SourceModuleName":"eventlog","SourceModuleType":"im_msvistalog","message":"Windows Defender has removed history of spyware and other potentially unwanted software.\r\n \tTime:12/24/2017 1:25:52 PM\r\n \tUser:NT AUTHORITY\\SYSTEM\r\n","@version":"1","@timestamp":"2018-01-26T15:26:56.173Z","host":"10.244.20.94","port":64505,"type":"eventlog"}]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [Timestamp]
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:411)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:706)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:497)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:465)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:418)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:148)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.index.mapper.MapperParsingException: failed to parse date field [12/24/2017 1:25:52 PM], tried both date format [dateOptionalTime], and timestamp number with locale []
at org.elasticsearch.index.mapper.core.DateFieldMapper.parseStringValue(DateFieldMapper.java:617)
at org.elasticsearch.index.mapper.core.DateFieldMapper.innerParseCreateField(DateFieldMapper.java:535)
at org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:239)
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:401)
... 13 more
Caused by: java.lang.IllegalArgumentException: Invalid format: "12/24/2017 1:25:52 PM" is malformed at "/24/2017 1:25:52 PM"
at org.elasticsearch.common.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187)
at org.elasticsearch.common.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:780)
at org.elasticsearch.index.mapper.core.DateFieldMapper.parseStringValue(DateFieldMapper.java:612)

Re: Log Crashes every few hours

Posted: Fri Jan 26, 2018 12:15 pm
by mcapra
All of the below assumes you are on Nagios Log Server 2.0+.

Hm, it looks like Elasticsearch has, at some point, decided that the Timestamp field of your eventlog type is to be recognized as a date data type. The problem is that the value doesn't conform to a format that Joda accepts.

If you're not currently using that field for anything, the lazy way to fix this is to remove it in a Logstash filter (using a mutate filter with a remove_field action), or to adjust your nxlog rules to exclude it.
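As a sketch, that mutate filter might look something like the following. The conditional and field name are based on the failing document in your Elasticsearch log (type "eventlog", field "Timestamp") — adjust them to match your own setup, and apply it wherever your Logstash filters are managed in Log Server:

```
filter {
    # Drop the Windows event log "Timestamp" field before it reaches
    # Elasticsearch, so the unparseable date never hits the mapper.
    if [type] == "eventlog" {
        mutate {
            remove_field => [ "Timestamp" ]
        }
    }
}
```

Note this only prevents future mapping failures; documents already rejected are not re-indexed.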

The correct way to fix this is to adjust the Elasticsearch templates, but that won't have any effect until the next day's index is created. There's probably a ticket buried somewhere where I've done this work/documentation already, but I don't have access to it and it's not exactly a short write-up.
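For reference, the template approach might look roughly like this (a sketch only — it assumes the 1.x-era Elasticsearch that these stack traces come from, and the template name is illustrative). It forces the Timestamp field of the eventlog type to be mapped as a plain string so the date parser never runs on it:

```
curl -XPUT 'http://localhost:9200/_template/eventlog_timestamp_fix' -d '{
  "template": "logstash-*",
  "mappings": {
    "eventlog": {
      "properties": {
        "Timestamp": { "type": "string" }
      }
    }
  }
}'
```

As noted above, templates are applied at index creation, so this only takes effect when the next daily logstash-YYYY.MM.DD index rolls over.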

Re: Log Crashes every few hours

Posted: Fri Jan 26, 2018 12:20 pm
by dwhitfield
mcapra wrote: All of the below assumes you are on Nagios Log Server 2.0+.
OP, can you give us that info? Thanks!

Re: Log Crashes every few hours

Posted: Fri Jan 26, 2018 3:17 pm
by bosecorp
Do you have instructions on how to do that?

Re: Log Crashes every few hours

Posted: Fri Jan 26, 2018 3:30 pm
by dwhitfield
Which piece exactly? mutate is discussed on page 8 of https://assets.nagios.com/downloads/nag ... ilters.pdf

Re: Log Crashes every few hours

Posted: Fri Jan 26, 2018 3:40 pm
by bosecorp
Can you show what the filter is going to look like?

Also, how do I tell if I am using that field?

Re: Log Crashes every few hours

Posted: Fri Jan 26, 2018 4:33 pm
by dwhitfield
Well, ultimately, even if it's in the config, that doesn't mean the organization actually needs it. That's best determined on your end. That said, can you PM me the following:

everything in /var/log/elasticsearch/
everything in /var/log/logstash

profile: Admin -> System Status -> Download System Profile

UPDATE: PM received and files shared with techs

Re: Log Crashes every few hours

Posted: Mon Jan 29, 2018 8:43 am
by bosecorp
Just PMed you the files.