Hi all,
I believe we are looking at the problem in the wrong way.
The charset of the Logstash json codec defaults to UTF-8.
So if I remove the charset setting from the codec, it will fall back to UTF-8.
However, the output of NXLOG is not UTF-8.
https://www.elastic.co/guide/en/logstas ... -json.html
Text extracted from the link above.
====================================================================================
Default value is "UTF-8"
The character encoding used in this codec. Examples include "UTF-8" and "CP1252".
JSON requires valid UTF-8 strings, but in some cases, software that emits JSON
does so in another encoding (nxlog, for example). In weird cases like this,
you can set the charset setting to the actual encoding of the text and Logstash
will convert it for you.
For nxlog users, you may want to set this to "CP1252".
====================================================================================
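As a rough illustration of what the charset setting described above does, here is a minimal Python sketch. It is not Logstash code; the sample bytes and the helper names `is_valid_utf8` and `transcode` are made up for the example. It shows that a byte written in CP1252 is rejected by a strict UTF-8 decoder, and that decoding with the real source charset first produces valid UTF-8.

```python
# Hypothetical sample: the word "cafe" with an accented e (U+00E9),
# as a Windows tool writing CP1252 would emit it.
raw = b"caf\xe9"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def transcode(data: bytes, source_charset: str) -> bytes:
    """Decode with the actual source charset, then re-encode as UTF-8."""
    return data.decode(source_charset).encode("utf-8")


fixed = transcode(raw, "cp1252")

print(is_valid_utf8(raw))    # False: 0xE9 alone is not a valid UTF-8 sequence
print(is_valid_utf8(fixed))  # True: U+00E9 is now the two bytes 0xC3 0xA9
```

The conversion only gives the right characters when the source charset passed in matches what the sender really used.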
The problem I face is that some specific lines in the IIS logs contain characters
that cause errors when they are processed by LOGSTASH,
as highlighted in the attached image.
I am looking into whether I can convert the "$raw_event" lines in nxlog to UTF-8,
to remove any problematic characters before sending the data to LOGSTASH.
Many people are facing this problem (from what I found on Google),
and I believe that solving it will help a lot of people.
https://discuss.elastic.co/t/how-to-han ... tf-8/25294
Received an event that has a different character encoding
-
ssoliveira
- Posts: 91
- Joined: Wed Dec 07, 2016 6:02 pm
Re: Received an event that has a different character encodin
Stripping out the problematic characters is one solution. This can be done in nxlog prior to shipping it to Logstash, as you've indicated. The other option is to properly define, within the Logstash rule, what charset the Windows machine is using. CP1252 and CP437 are remarkably similar (just as one example), but there are differences that will trip up Logstash.
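To make the point about charset differences concrete, here is a small Python sketch that compares how two code pages decode each single byte. The helper name `differing_bytes` is hypothetical, and the comparison relies on Python's built-in `cp1252` and `cp437` codecs rather than anything in Logstash or nxlog.

```python
def differing_bytes(charset_a: str, charset_b: str) -> list:
    """List the byte values that decode to different characters."""
    diffs = []
    for value in range(256):
        byte = bytes([value])
        # errors="replace" maps undefined bytes to U+FFFD instead of raising.
        a = byte.decode(charset_a, errors="replace")
        b = byte.decode(charset_b, errors="replace")
        if a != b:
            diffs.append(value)
    return diffs


diffs = differing_bytes("cp1252", "cp437")

# The two code pages agree on 7-bit ASCII; every difference is a high byte.
print(all(value >= 0x80 for value in diffs))  # True

# The same byte can mean a different character in each code page.
a = b"\xe9".decode("cp1252")
b = b"\xe9".decode("cp437")
print(f"cp1252 -> U+{ord(a):04X}, cp437 -> U+{ord(b):04X}")
# cp1252 -> U+00E9, cp437 -> U+0398
```

Plain ASCII log lines look the same under either setting, so a wrong charset may only show up on the few lines that contain accented or special characters.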
Former Nagios employee
https://www.mcapra.com/
-
dwhitfield
- Former Nagios Staff
- Posts: 4583
- Joined: Wed Sep 21, 2016 10:29 am
- Location: NoLo, Minneapolis, MN
- Contact:
-
ssoliveira
- Posts: 91
- Joined: Wed Dec 07, 2016 6:02 pm
Re: Received an event that has a different character encodin
Problem solved.
I am converting the ANSI (ISO-8859-1) characters to UTF-8 in nxlog, before the events are sent to Logstash:
<Extension charconv>
    Module xm_charconv
    AutodetectCharsets utf-8, iso8859-1
</Extension>
<Input iisw3c>
    Module im_file
    File "C:\Inetpub\Logs\LogFiles\*ex*.log"
    SavePos TRUE
    Recursive TRUE
    InputType LineBased
    Exec if file_name() !~ /W3SVC/ drop();
    Exec if $raw_event =~ /^#/ drop();
    Exec if $raw_event =~ /^\xEF\xBB\xBF#/ drop();
    Exec $raw_event = convert($raw_event, "iso8859-1", "utf-8");
    Exec ExtIISW3C->parse_csv();
    Exec $FileName = file_name();
    Exec $IIsTime = parsedate($date + " " + $time);
    Exec $Message = $cs_method + " " + $cs_uri_stem;
    Exec delete($SourceModuleType);
</Input>
You can close this ticket.
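For readers who want to check the logic outside nxlog, here is a rough Python sketch of the same per-line steps: drop the W3C header lines that start with "#" (with or without a UTF-8 byte order mark), then convert the remaining bytes from ISO-8859-1 to UTF-8. The function name `process_line` and the sample log lines are made up for illustration; this is not how nxlog implements `convert()`.

```python
from typing import Optional

UTF8_BOM = b"\xef\xbb\xbf"


def process_line(raw_line: bytes) -> Optional[bytes]:
    """Return the line re-encoded as UTF-8, or None if it should be dropped."""
    # Mirrors the two drop() rules: header lines start with '#',
    # sometimes preceded by a UTF-8 byte order mark.
    if raw_line.startswith(b"#") or raw_line.startswith(UTF8_BOM + b"#"):
        return None
    # Mirrors convert($raw_event, "iso8859-1", "utf-8").
    return raw_line.decode("iso-8859-1").encode("utf-8")


# Hypothetical sample lines, not taken from a real IIS log.
samples = [
    b"#Fields: date time cs-method cs-uri-stem",
    UTF8_BOM + b"#Software: Microsoft Internet Information Services",
    b"2016-12-07 18:02:00 GET /caf\xe9",
]

for line in samples:
    print(process_line(line))
```

ISO-8859-1 assigns a character to every byte value, so the decode step never fails. The output is always valid UTF-8, but a byte that was really written in another code page comes out as the wrong character.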