Log Crashes every few hours

bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Log Crashes every few hours

Post by bosecorp »

Every few hours my server crashes. This is what I see on the web page:


Nagios Log Server - Waiting For Database Startup

I also see this error in the log:

==> /var/log/logstash/logstash.log <==
{:timestamp=>"2018-01-26T10:00:20.762000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:21.012000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:21.734000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:21.734000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:22.765000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:23.015000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:23.711000-0500", :message=>"SIGTERM received. Shutting down the agent.", :level=>:warn}
{:timestamp=>"2018-01-26T10:00:23.714000-0500", :message=>"stopping pipeline", :id=>"main"}
{:timestamp=>"2018-01-26T10:00:23.871000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-01-26T10:00:23.878000-0500", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}


To fix the issue I have to restart the services, and sometimes I need to restart the whole server.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Log Crashes every few hours

Post by mcapra »

It'd be useful to see a full copy of the most recent ElasticSearch logs. I suspect something bad is happening there.
Former Nagios employee
https://www.mcapra.com/
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: Log Crashes every few hours

Post by bosecorp »

here you go

[root@usvanagiosplog1 ~]# tail -n 50 /var/log/elasticsearch/*.log
==> /var/log/elasticsearch/26b23fda-7217-4701-86a6-e7ecb1f1c5c7_index_indexing_slowlog.log <==

==> /var/log/elasticsearch/26b23fda-7217-4701-86a6-e7ecb1f1c5c7_index_search_slowlog.log <==

==> /var/log/elasticsearch/26b23fda-7217-4701-86a6-e7ecb1f1c5c7.log <==
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:497)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:465)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:418)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:148)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.index.mapper.MapperParsingException: failed to parse date field [12/23/2017 1:25:52 PM], tried both date format [dateOptionalTime], and timestamp number with locale []
at org.elasticsearch.index.mapper.core.DateFieldMapper.parseStringValue(DateFieldMapper.java:617)
at org.elasticsearch.index.mapper.core.DateFieldMapper.innerParseCreateField(DateFieldMapper.java:535)
at org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:239)
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:401)
... 13 more
Caused by: java.lang.IllegalArgumentException: Invalid format: "12/23/2017 1:25:52 PM" is malformed at "/23/2017 1:25:52 PM"
at org.elasticsearch.common.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187)
at org.elasticsearch.common.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:780)
at org.elasticsearch.index.mapper.core.DateFieldMapper.parseStringValue(DateFieldMapper.java:612)
... 16 more
[2018-01-26 10:26:56,273][DEBUG][action.bulk ] [adaef780-6e05-44fc-ad48-19087eccb89e] [logstash-2018.01.26][2] failed to execute bulk item (index) index {[logstash-2018.01.26][eventlog][AWEzFC7nX6sYOunaWmmZ], source[{"EventTime":"2018-01-23 13:25:53","Hostname":"MXTJMFGSOP01.bose.com","Keywords":-9223372036854775808,"EventType":"INFO","SeverityValue":2,"Severity":"INFO","EventID":1013,"SourceName":"Microsoft-Windows-Windows Defender","ProviderGuid":"{11CD958A-C507-4EF3-B3F2-5FD9DFBD2C78}","Version":0,"Task":0,"OpcodeValue":0,"RecordNumber":119,"ProcessID":5156,"ThreadID":4688,"Channel":"Microsoft-Windows-Windows Defender/Operational","Domain":"NT AUTHORITY","AccountName":"SYSTEM","UserID":"SYSTEM","AccountType":"User","Opcode":"Info","Product Name":"%%827","Product Version":"6.1.7601.18170","Timestamp":"12/24/2017 1:25:52 PM","User":"SYSTEM","SID":"S-1-5-18","EventReceivedTime":"2018-01-26 07:26:54","SourceModuleName":"eventlog","SourceModuleType":"im_msvistalog","message":"Windows Defender has removed history of spyware and other potentially unwanted software.\r\n \tTime:12/24/2017 1:25:52 PM\r\n \tUser:NT AUTHORITY\\SYSTEM\r\n","@version":"1","@timestamp":"2018-01-26T15:26:56.173Z","host":"10.244.20.94","port":64505,"type":"eventlog"}]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [Timestamp]
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:411)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:706)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:497)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:465)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:418)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:148)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.index.mapper.MapperParsingException: failed to parse date field [12/24/2017 1:25:52 PM], tried both date format [dateOptionalTime], and timestamp number with locale []
at org.elasticsearch.index.mapper.core.DateFieldMapper.parseStringValue(DateFieldMapper.java:617)
at org.elasticsearch.index.mapper.core.DateFieldMapper.innerParseCreateField(DateFieldMapper.java:535)
at org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:239)
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:401)
... 13 more
Caused by: java.lang.IllegalArgumentException: Invalid format: "12/24/2017 1:25:52 PM" is malformed at "/24/2017 1:25:52 PM"
at org.elasticsearch.common.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187)
at org.elasticsearch.common.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:780)
at org.elasticsearch.index.mapper.core.DateFieldMapper.parseStringValue(DateFieldMapper.java:612)
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Log Crashes every few hours

Post by mcapra »

All of the below assumes you are on Nagios Log Server 2.0+.

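Hm, looks like ElasticSearch has, at some point, decided that the Timestamp field of your eventlog type is to be recognized as a date data type. Problem is, values like "12/24/2017 1:25:52 PM" don't conform to a format that Joda likes (per the trace, it only tried dateOptionalTime), so those documents fail to index.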

If you're not currently using that field for anything, the lazy way to fix this is to remove it in a Logstash filter (using a mutate filter with the remove_field option). Or adjust your nxlog rules to exclude it.
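
For reference, a minimal sketch of what that lazy fix could look like. It assumes the field is literally named Timestamp and that the affected events carry the type eventlog, both taken from the log excerpt earlier in the thread. When a filter is added through the Nagios Log Server GUI, the outer filter { } wrapper may need to be left out; treat that as an assumption and check it against your version.

```
filter {
  # Only touch events of the type seen in the failing bulk item (assumed: "eventlog")
  if [type] == "eventlog" {
    mutate {
      # Drop the field ElasticSearch cannot parse as a date.
      # Its value is then no longer indexed or searchable.
      remove_field => [ "Timestamp" ]
    }
  }
}
```

The event's EventTime and @timestamp fields are left alone, so the time of the event is still recorded. As far as I know the existing mapping for Timestamp stays in the current index, but with the field gone from incoming events nothing is left for it to choke on.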

The correct way to fix this is to fiddle with the ElasticSearch templates, but that won't have any effect until the next day's index is fired up. There's probably a ticket buried somewhere where I've done this work/documentation already, but I don't have access to it and it's not exactly a short write-up.
Last edited by mcapra on Fri Jan 26, 2018 1:55 pm, edited 1 time in total.
Former Nagios employee
https://www.mcapra.com/
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Log Crashes every few hours

Post by dwhitfield »

mcapra wrote: All of the below assumes you are on Nagios Log Server 2.0+.
OP, can you give us that info? Thanks!
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: Log Crashes every few hours

Post by bosecorp »

Do you have instructions on how to do that?
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Log Crashes every few hours

Post by dwhitfield »

Which piece exactly? mutate is discussed on page 8 of https://assets.nagios.com/downloads/nag ... ilters.pdf
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: Log Crashes every few hours

Post by bosecorp »

Can you show me what the filter is going to look like?

Also, how do I tell if I am using that field?
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Log Crashes every few hours

Post by dwhitfield »

Well, ultimately, even if it's in the config, that doesn't mean the organization actually needs it. That's best determined on your end. That said, can you PM me the following:

everything in /var/log/elasticsearch/
everything in /var/log/logstash

profile: Admin -> System Status -> Download System Profile

UPDATE: PM received and files shared with techs
Last edited by dwhitfield on Mon Jan 29, 2018 10:04 am, edited 1 time in total.
Reason: pm received
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: Log Crashes every few hours

Post by bosecorp »

I just PM'd you the files.