Re: Received an event that has a different character encoding
Posted: Tue Aug 15, 2017 12:35 pm
Hi all,
I believe we are looking at the problem in the wrong way.
JSON defaults to UTF-8, so if I remove the codec, Logstash will assume UTF-8.
However, NXLog's output is not UTF-8.
https://www.elastic.co/guide/en/logstas ... -json.html
Text extracted from the link above.
====================================================================================
Default value is "UTF-8"
The character encoding used in this codec. Examples include "UTF-8" and "CP1252".
JSON requires valid UTF-8 strings, but in some cases, software that emits JSON
does so in another encoding (nxlog, for example). In weird cases like this,
you can set the charset setting to the actual encoding of the text and Logstash
will convert it for you.
For nxlog users, you may want to set this to "CP1252".
====================================================================================
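Based on the documentation quoted above, a minimal sketch of a Logstash input that declares the charset explicitly (the tcp input and port number here are assumptions for illustration, not taken from my actual setup):

```
input {
  tcp {
    port  => 3515
    # Tell the json codec the bytes are Windows-1252, so Logstash
    # transcodes them to valid UTF-8 before parsing.
    codec => json { charset => "CP1252" }
  }
}
```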
The problem I face is that some specific lines in the IIS logs contain characters that cause errors when processed by Logstash, as highlighted in the attached image.
I am searching for a way to convert the "$raw_event" lines to UTF-8 in NXLog, so that any problematic characters are removed before the data is sent to Logstash.
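On the NXLog side, the xm_charconv extension module provides a convert() function that can be applied to $raw_event. A minimal sketch of what this could look like (the file path and the windows-1252 source encoding are assumptions based on a typical IIS setup):

```
# Load the character-set conversion extension.
<Extension charconv>
    Module      xm_charconv
</Extension>

<Input iis>
    Module      im_file
    File        "C:\\inetpub\\logs\\LogFiles\\W3SVC1\\u_ex*.log"
    # Re-encode each event from Windows-1252 to UTF-8
    # before it is forwarded to Logstash.
    Exec        $raw_event = convert($raw_event, "windows-1252", "utf-8");
</Input>
```

If the source encoding varies, xm_charconv also supports an AutodetectCharsets directive listing candidate encodings to try.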
Many people are facing this problem (from what I found on Google), so I believe that solving it will help a lot of people.
https://discuss.elastic.co/t/how-to-han ... tf-8/25294