Not receiving logs after 2.0 upgrade

bpizzutiWHI
Posts: 64
Joined: Thu Mar 02, 2017 10:15 am

Not receiving logs after 2.0 upgrade

Post by bpizzutiWHI »

It was working fine until the 2.0 upgrade. Not sure what's going on; there are a lot of Java errors in the Elasticsearch logs. Log files attached.
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Not receiving logs after 2.0 upgrade

Post by dwhitfield »

I suspect the following KB article will help resolve the issue: https://support.nagios.com/kb/article/n ... 0-778.html

Let us know if that doesn't do it for you.
bpizzutiWHI
Posts: 64
Joined: Thu Mar 02, 2017 10:15 am

Re: Not receiving logs after 2.0 upgrade

Post by bpizzutiWHI »

I just ran the upgrade this morning, so I probably already had the updated script. Anyway, it didn't work.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Not receiving logs after 2.0 upgrade

Post by mcapra »

Non-trivial though it may be, there should probably be a KB article written for this problem as it comes up quite frequently.

The gist of the issue (I truncated the source field):

Code: Select all

[2017-11-21 06:23:33,051][DEBUG][action.bulk              ] [a986f886-0c32-4cd2-9b56-95654f734914] [logstash-2017.11.21][1] failed to execute bulk item (index) index {[logstash-2017.11.21][eventlog][AV_eUaM61aUgoBl5yKfr], source[{ ... "ErrorCode":"0x0" ... }]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [ErrorCode]
Within Elasticsearch, fields have datatypes. I'd bet your ErrorCode field for the eventlog type is mapped as a long. 0x0 is not a valid long, but it is a valid string.

The output of this command (executed from the CLI of any of your Nagios Log Server instances) should help us verify:

Code: Select all

curl -XGET 'http://localhost:9200/logstash-2017.11.21/_mapping'
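For reference, here is a heavily trimmed sketch of what the relevant part of that output might look like if ErrorCode is indeed mapped as a long (the surrounding fields are omitted and the exact structure can vary by Elasticsearch version):

Code: Select all

{
  "logstash-2017.11.21": {
    "mappings": {
      "eventlog": {
        "properties": {
          "ErrorCode": {
            "type": "long"
          }
        }
      }
    }
  }
}
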
Not directly related to your problem, but here is a summary of some mapping concepts and how they affect Nagios Log Server's sorting:
https://support.nagios.com/forum/viewto ... 99#p220799
Former Nagios employee
https://www.mcapra.com/
bpizzutiWHI
Posts: 64
Joined: Thu Mar 02, 2017 10:15 am

Re: Not receiving logs after 2.0 upgrade

Post by bpizzutiWHI »

You're right, it's mapped as a long.

And I'm not authorized to access the other link. I assume I need to change the mapping to string? :)
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Not receiving logs after 2.0 upgrade

Post by dwhitfield »

This is missing some screenshots, but perhaps it will still be useful (even without the context):
Are you able to share the index mappings for the day this occurred? For example, if the issue occurred on May 11th:

Code: Select all
curl -XGET 'http://localhost:9200/logstash-2017.05.11/_mapping'



Can you also tell us which values/fields specifically you're referring to?

gsl_ops_practice wrote:
So it looks like the conversion to INT isn't happening properly.



%{INT} represents a grok pattern, not a field type (not explicitly, anyway). So if I say %{INT:some_field}, then some_field will match the INT grok pattern but will not necessarily be stored as an integer. If you want a field to be a specific data type (the grok filter supports int and float conversions), your pattern match in the grok filter would have to look like %{INT:some_field:int} to properly type the field in that instance.
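
As a rough sketch (the pattern text and field name here are illustrative, not taken from your configuration), the difference in a grok filter would look like this:

Code: Select all

filter {
  grok {
    # Without the ":int" suffix, some_field would be stored as a string even
    # though it matches the INT pattern; with the suffix it is stored as an integer.
    match => { "message" => "value=%{INT:some_field:int}" }
  }
}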

gsl_ops_practice wrote:
As per your code I am not seeing any white spaces anymore and it all looks good. Until I try to display those values over time. When I do, I get this error in the GUI:



I assume this to mean that you are trying to "Sort By" a specific field in the GUI? Here's an example event:
curl -XGET 'http://localhost:9200/logstash-2017.05. ... rch?size=1'
https://pastebin.com/YV40958z

Code: Select all
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 16801,
    "max_score": 1.0,
    "hits": [{
      "_index": "logstash-2017.05.11",
      "_type": "eventlog",
      "_id": "AVv0zkiDLoUjsjJ7dByf",
      "_score": 1.0,
      "_source": {
        "EventTime": "2017-05-11 01:59:43",
        "Hostname": "WIN-NFRUUIO4D46.DOMAIN.local",
        "Keywords": -9223372036854775808,
        "EventType": "WARNING",
        "SeverityValue": 3,
        "Severity": "WARNING",
        "EventID": 322,
        "SourceName": "Microsoft-Windows-TaskScheduler",
        "ProviderGuid": "{DE7B24EA-73C8-4A09-985D-5BDADCFA9017}",
        "Version": 0,
        "Task": 322,
        "OpcodeValue": 0,
        "RecordNumber": 1208518,
        "ActivityID": "{5D29117E-4827-4F9B-93BB-6CC917ECEB45}",
        "ProcessID": 920,
        "ThreadID": 111444,
        "Channel": "Microsoft-Windows-TaskScheduler/Operational",
        "Domain": "NT AUTHORITY",
        "AccountName": "SYSTEM",
        "UserID": "SYSTEM",
        "AccountType": "User",
        "Category": "Launch request ignored, instance already running",
        "Opcode": "Info",
        "TaskName": "\\test-nrds",
        "TaskInstanceId": "{5D29117E-4827-4F9B-93BB-6CC917ECEB45}",
        "EventReceivedTime": "2017-05-11 01:59:45",
        "SourceModuleName": "eventlog",
        "SourceModuleType": "im_msvistalog",
        "message": "Task Scheduler did not launch task \"\\test-nrds\" because instance \"{5D29117E-4827-4F9B-93BB-6CC917ECEB45}\" of the same task is already running.",
        "@version": "1",
        "@timestamp": "2017-05-11T00:00:11.394Z",
        "host": "192.168.67.99",
        "type": "eventlog"
      }
    }]
  }
}



Let's focus on the RecordNumber field. Looking at the mapping (think "schema") for the eventlog type, we can see that this field is mapped as a long:
curl -XGET 'http://localhost:9200/logstash-2017.05. ... g/_mapping'
https://pastebin.com/ygFdPLjE (Line 1078)

Code: Select all
"RecordNumber": {
"type": "long"
},



And I can consequently sort by this value in the GUI:
2017_05_11_11_38_21_Dashboard_Nagios_Log_Server.png
2017_05_11_11_38_07_Dashboard_Nagios_Log_Server.png


Also, this may be of use: https://www.elastic.co/guide/en/logstas ... te-convert
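
A minimal sketch of that mutate/convert workaround, assuming the goal is to force ErrorCode to a string for Windows event log entries (the conditional and its placement are assumptions; adjust to match your own filter configuration):

Code: Select all

filter {
  if [type] == "eventlog" {
    mutate {
      # Force ErrorCode to a string so values like "0x0" index cleanly.
      convert => { "ErrorCode" => "string" }
    }
  }
}

Note that this only changes how new events are indexed; the mapping already stored for an existing daily index will not change.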
Last edited by dwhitfield on Wed Nov 22, 2017 11:34 am, edited 2 times in total.
Reason: added images
bpizzutiWHI
Posts: 64
Joined: Thu Mar 02, 2017 10:15 am

Re: Not receiving logs after 2.0 upgrade

Post by bpizzutiWHI »

It's somewhat useful, though having the screenshots would be more helpful. But it looks like putting in a filter to manually mutate the ErrorCode field to a string should work as a workaround, at least for now. I just need to figure out a way to get the filter to hit every event and do it...
bpizzutiWHI
Posts: 64
Joined: Thu Mar 02, 2017 10:15 am

Re: Not receiving logs after 2.0 upgrade

Post by bpizzutiWHI »

OK, I put a mutate in, and it didn't work. I'm guessing this has to be changed before the filter step somehow.
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Not receiving logs after 2.0 upgrade

Post by cdienger »

I believe a change like this will not be apparent until the index is rotated. Leave this in place for now and check tomorrow after it's rotated. If that doesn't resolve it, please provide a sample of the logs that are being imported as well as a copy of the files in /usr/local/nagioslogserver/logstash/etc/conf.d/* and we can test further on our end and advise accordingly.
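
As a quick check after rotation (the dates below are just examples), comparing the new day's mapping against the old one should show whether ErrorCode now comes through as a string:

Code: Select all

# Yesterday's index, created before the mutate filter (ErrorCode mapped as long)
curl -XGET 'http://localhost:9200/logstash-2017.11.21/_mapping?pretty'

# Today's index, created after rotation (ErrorCode should now map as a string)
curl -XGET 'http://localhost:9200/logstash-2017.11.22/_mapping?pretty'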
bpizzutiWHI
Posts: 64
Joined: Thu Mar 02, 2017 10:15 am

Re: Not receiving logs after 2.0 upgrade

Post by bpizzutiWHI »

Umm, yeah, that won't be happening until Monday; we're located in the US, you know. :)