Oof, here's a long post.
Two things I see:
First, there is some junk data trying to make its way into an index, specifically one or more records with the TargetFileName field:
Code:
[2017-10-11 03:24:49,511][DEBUG][action.bulk ] [330efcd2-34fc-4f7f-9cba-df89a1374eee] [logstash-2017.10.11][1] failed to execute bulk item (index) index {[logstash-2017.10.11][mcafee][AV8KUlbhOgRVuukGN1iA], source[{"message":"<?xml version=\"1.0\" encoding=\"UTF-8\"?><EPOEvent><MachineInfo><MachineName>D47661JC4</MachineName><AgentGUID>{de1c5a4c-7c0e-11e6-2d91-3cd92b5c8d2f}</AgentGUID><IPAddress>10.1.6.10</IPAddress><OSName>Windows 7</OSName><UserName>SYSTEM</UserName><TimeZoneBias>240</TimeZoneBias><RawMACAddress>3cd92b5c8d2f</RawMACAddress></MachineInfo><SoftwareInfo ProductName=\"ePO Deep Command\" ProductVersion=\"2.4.1.465\" ProductFamily=\"Secure\"><Event><EventID>34362</EventID><Severity>2</Severity><GMTTime>2017-10-11T07:22:05</GMTTime><CommonFields><Analyzer>AMTMGMT_1000</Analyzer><AnalyzerDetectionMethod>1</AnalyzerDetectionMethod><AnalyzerName>ePO Deep Command</AnalyzerName><AnalyzerVersion>2.4.1.465</AnalyzerVersion><TargetFileName>System is not capable of HBC, ignoring the enforcement and sending an event</TargetFileName><ThreatCategory>ops</ThreatCategory><ThreatName>Configure failure</ThreatName><ThreatType>AMT</ThreatType></CommonFields></Event></SoftwareInfo></EPOEvent>\r","@version":"1","@timestamp":"2017-10-11T07:22:38.018Z","host":"10.10.41.47","type":"mcafee","parsed":{"MachineInfo":[{"MachineName":["D47661JC4"],"AgentGUID":["{de1c5a4c-7c0e-11e6-2d91-3cd92b5c8d2f}"],"IPAddress":["10.1.6.10"],"OSName":["Windows 7"],"UserName":["SYSTEM"],"TimeZoneBias":["240"],"RawMACAddress":["3cd92b5c8d2f"]}],"SoftwareInfo":[{"ProductName":"ePO Deep Command","ProductVersion":"2.4.1.465","ProductFamily":"Secure","Event":[{"EventID":["34362"],"Severity":["2"],"GMTTime":["2017-10-11T07:22:05"],"CommonFields":[{"Analyzer":["AMTMGMT_1000"],"AnalyzerDetectionMethod":["1"],"AnalyzerName":["ePO Deep Command"],"AnalyzerVersion":["2.4.1.465"],"TargetFileName":["System is not capable of HBC, ignoring the enforcement and sending an 
event"],"ThreatCategory":["ops"],"ThreatName":["Configure failure"],"ThreatType":["AMT"]}]}]}]}}]}
org.elasticsearch.index.mapper.MapperParsingException: object mapping [TargetFileName] trying to serialize a value with no field associated with it, current value [System is not capable of HBC, ignoring the enforcement and sending an event]
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:702)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:497)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:706)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeNonDynamicArray(ObjectMapper.java:695)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:604)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:489)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:554)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeNonDynamicArray(ObjectMapper.java:685)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:604)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:489)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:554)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeNonDynamicArray(ObjectMapper.java:685)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:604)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:489)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:554)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeNonDynamicArray(ObjectMapper.java:685)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:604)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:489)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:554)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:487)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:466)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:418)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:148)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
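Second, there's some very aggressive garbage collection happening within Elasticsearch (it might be a direct result of the junk data). Each old-generation collection below takes close to a minute, and the old pool is still at 30.1gb of 30.1gb afterwards, so nothing is actually being reclaimed: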
Code:
[2017-10-11 03:29:09,929][WARN ][monitor.jvm ] [330efcd2-34fc-4f7f-9cba-df89a1374eee] [gc][old][395927][16495] duration [1m], collections [1]/[1m], total [1m]/[1.3d], memory [31.7gb]->[30.3gb]/[31.8gb], all_pools {[young] [1.4gb]->[258.4mb]/[1.4gb]}{[survivor] [162.2mb]->[0b]/[191.3mb]}{[old] [30.1gb]->[30.1gb]/[30.1gb]}
[2017-10-11 03:30:12,295][WARN ][monitor.jvm ] [330efcd2-34fc-4f7f-9cba-df89a1374eee] [gc][old][395935][16496] duration [54.9s], collections [1]/[55.3s], total [54.9s]/[1.3d], memory [31.7gb]->[30.4gb]/[31.8gb], all_pools {[young] [1.4gb]->[339.8mb]/[1.4gb]}{[survivor] [140.3mb]->[0b]/[191.3mb]}{[old] [30.1gb]->[30.1gb]/[30.1gb]}
[2017-10-11 03:31:07,238][WARN ][monitor.jvm ] [330efcd2-34fc-4f7f-9cba-df89a1374eee] [gc][old][395938][16497] duration [52.1s], collections [1]/[52.2s], total [52.1s]/[1.3d], memory [31.7gb]->[30.4gb]/[31.8gb], all_pools {[young] [1.4gb]->[320.2mb]/[1.4gb]}{[survivor] [174.8mb]->[0b]/[191.3mb]}{[old] [30.1gb]->[30.1gb]/[30.1gb]}
[2017-10-11 03:32:05,556][WARN ][monitor.jvm ] [330efcd2-34fc-4f7f-9cba-df89a1374eee] [gc][old][395941][16498] duration [55.7s], collections [1]/[56.2s], total [55.7s]/[1.3d], memory [31.7gb]->[30.4gb]/[31.8gb], all_pools {[young] [1.4gb]->[351.9mb]/[1.4gb]}{[survivor] [96.5mb]->[0b]/[191.3mb]}{[old] [30.1gb]->[30.1gb]/[30.1gb]}
[2017-10-11 03:33:08,117][WARN ][monitor.jvm ] [330efcd2-34fc-4f7f-9cba-df89a1374eee] [gc][old][395944][16499] duration [59.8s], collections [1]/[1m], total [59.8s]/[1.3d], memory [31.7gb]->[30.4gb]/[31.8gb], all_pools {[young] [1.4gb]->[322.1mb]/[1.4gb]}{[survivor] [114mb]->[0b]/[191.3mb]}{[old] [30.1gb]->[30.1gb]/[30.1gb]}
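If you want to check the rest of the log for the same pattern, here is a rough Python sketch that pulls the duration and the old-pool numbers out of a monitor.jvm line. The function name and the regexes are mine, written against the lines quoted above, so treat them as assumptions rather than anything official:

```python
import re

# Old-generation section of a monitor.jvm line, e.g.
# "{[old] [30.1gb]->[30.1gb]/[30.1gb]}" (before -> after / pool limit).
OLD_POOL = re.compile(
    r"\{\[old\] \[([\d.]+)(b|kb|mb|gb)\]->\[([\d.]+)(b|kb|mb|gb)\]/\[([\d.]+)(b|kb|mb|gb)\]\}"
)
DURATION = re.compile(r"duration \[([^\]]+)\]")
UNIT = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def old_gen_summary(line):
    """Return duration, bytes freed, and old-pool fill ratio for one GC line.

    Returns None when the line does not look like an old-gen GC warning.
    """
    pool = OLD_POOL.search(line)
    duration = DURATION.search(line)
    if not pool or not duration:
        return None
    before = float(pool.group(1)) * UNIT[pool.group(2)]
    after = float(pool.group(3)) * UNIT[pool.group(4)]
    limit = float(pool.group(5)) * UNIT[pool.group(6)]
    return {
        "duration": duration.group(1),
        "freed_bytes": before - after,
        "used_fraction": after / limit,
    }


# First GC warning from the log excerpt above.
LINE = (
    "[2017-10-11 03:29:09,929][WARN ][monitor.jvm ] "
    "[330efcd2-34fc-4f7f-9cba-df89a1374eee] [gc][old][395927][16495] "
    "duration [1m], collections [1]/[1m], total [1m]/[1.3d], "
    "memory [31.7gb]->[30.3gb]/[31.8gb], all_pools "
    "{[young] [1.4gb]->[258.4mb]/[1.4gb]}"
    "{[survivor] [162.2mb]->[0b]/[191.3mb]}"
    "{[old] [30.1gb]->[30.1gb]/[30.1gb]}"
)

if __name__ == "__main__":
    print(old_gen_summary(LINE))
```

A line where freed_bytes is 0 and used_fraction is 1.0 means the collector spent that whole duration and gave nothing back, which is what every one of the five lines above shows.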
It seems as though your McAfee logs are sending some data that Elasticsearch isn't able to handle gracefully. Here's one of the records that has upset Elasticsearch:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<EPOEvent>
<MachineInfo>
<MachineName>D476606CD</MachineName>
<AgentGUID>{6F1B7A0F-4794-4E3B-883F-342FC4E73ECF}</AgentGUID>
<IPAddress>10.1.6.11</IPAddress>
<OSName>Windows 7</OSName>
<UserName>tthompson</UserName>
<TimeZoneBias>240</TimeZoneBias>
<RawMACAddress>6c3be528c344</RawMACAddress>
</MachineInfo>
<SoftwareInfo ProductName="ePO Deep Command" ProductVersion="2.4.1.465" ProductFamily="Secure">
<Event>
<EventID>34362</EventID>
<Severity>2</Severity>
<GMTTime>2017-10-11T07:31:16</GMTTime>
<CommonFields>
<Analyzer>AMTMGMT_1000</Analyzer>
<AnalyzerDetectionMethod>1</AnalyzerDetectionMethod>
<AnalyzerName>ePO Deep Command</AnalyzerName>
<AnalyzerVersion>2.4.1.465</AnalyzerVersion>
<TargetFileName>System is not capable of HBC, ignoring the enforcement and sending an event</TargetFileName>
<ThreatCategory>ops</ThreatCategory>
<ThreatName>Configure failure</ThreatName>
<ThreatType>AMT</ThreatType>
</CommonFields>
</Event>
</SoftwareInfo>
</EPOEvent>
ThreatName also seems to cause problems later on in the log. I don't see any obvious patterns when cross-referencing the two fields.
A few questions:
Is there a particular reason for the mutate step in this filter? Just curious; I don't think it's related to your problem at all:
Code:
if [type] == 'mcafee' {
  mutate {
    gsub => [
      'message', '^<.*]\s', ''
    ]
  }
  xml {
    source => 'message'
    target => 'parsed'
  }
}
Can you also share the output of these commands executed from the CLI of one of your Nagios Log Server machines:
Code:
curl -XGET 'http://localhost:9200/logstash-2017.10.11/_mapping/mcafee'
curl -XGET 'http://localhost:9200/logstash-2017.10.11/_mapping'
I suspect there are issues with how one or more of those fields are mapped.
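Once you have the mapping output, something like the sketch below could help spot the problem fields. It walks the mapping and lists every field mapped as an object, which is what the MapperParsingException says about TargetFileName. The function names are mine. The sample mapping is made up and heavily simplified for illustration. I'm also assuming the response has the usual index -> "mappings" -> type -> "properties" layout, so adjust if yours looks different:

```python
def object_fields(properties, prefix=""):
    """Yield dotted paths of fields mapped as objects.

    A field counts as an object when it carries nested "properties" or
    declares its type as "object" or "nested" instead of a leaf type.
    """
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec or spec.get("type") in ("object", "nested"):
            yield path
            yield from object_fields(spec.get("properties", {}), path + ".")


def suspicious(mapping, index, doc_type,
               names=("TargetFileName", "ThreatName")):
    """Return object-mapped paths whose last segment is one of `names`."""
    props = mapping[index]["mappings"][doc_type]["properties"]
    return [p for p in object_fields(props) if p.split(".")[-1] in names]


# Hypothetical, simplified mapping; NOT taken from the real cluster.
SAMPLE = {
    "logstash-2017.10.11": {
        "mappings": {
            "mcafee": {
                "properties": {
                    "message": {"type": "string"},
                    "parsed": {
                        "properties": {
                            "SoftwareInfo": {
                                "properties": {
                                    "ProductName": {"type": "string"},
                                    "TargetFileName": {"type": "object"},
                                }
                            }
                        }
                    },
                }
            }
        }
    }
}

if __name__ == "__main__":
    print(suspicious(SAMPLE, "logstash-2017.10.11", "mcafee"))
```

To run it against the real thing, save the first curl's output to a file, load it with json.load, and pass that in place of SAMPLE. If TargetFileName or ThreatName comes back in the list, that would line up with the exception in the log: a field that is already mapped as an object can't accept a plain string value.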