Received an event that has a different character encoding

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
ssoliveira
Posts: 91
Joined: Wed Dec 07, 2016 6:02 pm

Re: Received an event that has a different character encodin

Post by ssoliveira »

Hi all,

I believe we are looking at the problem in the wrong way.


The JSON codec defaults to UTF-8, so if I remove the codec setting it will use UTF-8.

However, the data that nxlog sends is not UTF-8.

https://www.elastic.co/guide/en/logstas ... -json.html

Text extracted from the link above.

====================================================================================

Default value is "UTF-8"

The character encoding used in this codec. Examples include "UTF-8" and "CP1252".

JSON requires valid UTF-8 strings, but in some cases, software that emits JSON
does so in another encoding (nxlog, for example). In weird cases like this,
you can set the charset setting to the actual encoding of the text and Logstash
will convert it for you.

For nxlog users, you may want to set this to "CP1252".

====================================================================================
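To make the charset point concrete, here is a small Python sketch. It is not Logstash or nxlog code; the helper names and the sample line are assumptions chosen only for illustration. It shows that a CP1252-encoded line is not valid UTF-8, and that it converts cleanly once the real source encoding is declared.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(raw: bytes, source_charset: str = "cp1252") -> bytes:
    """Decode with the declared source charset and re-encode as UTF-8."""
    return raw.decode(source_charset).encode("utf-8")


# Hypothetical log fragment containing one non-ASCII character (é = 0xE9 in CP1252)
sample = "GET /café".encode("cp1252")

print(is_valid_utf8(sample))           # the raw CP1252 bytes
print(is_valid_utf8(to_utf8(sample)))  # the same line after conversion
```

The same idea applies whether the conversion happens in the Logstash codec (via its charset setting) or in nxlog before the event is shipped.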

The problem I face is that some specific lines in the IIS logs contain characters that cause errors when Logstash processes them, as highlighted in the attached image.

I am investigating whether I can convert "$raw_event" to UTF-8 in nxlog, to get rid of any problematic characters before sending the data to Logstash.

Many people are facing this problem (from what I found on Google), so I believe that solving it will help a lot of people.

https://discuss.elastic.co/t/how-to-han ... tf-8/25294
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Received an event that has a different character encodin

Post by mcapra »

Stripping out the problematic characters is one solution. This can be done in nxlog prior to shipping it to Logstash, as you've indicated. The other option is to properly define, within the Logstash rule, what charset the Windows machine is using. CP1252 and CP437 are remarkably similar (just as one example), but there are differences that will trip up Logstash.
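As a rough illustration of how two Windows code pages can agree on plain ASCII yet disagree elsewhere, the following Python sketch compares CP1252 and CP437 byte by byte. The helper names are made up for this example, and it uses only Python's built-in codecs, not anything from Logstash or nxlog.

```python
from typing import List


def decode_byte(value: int, charset: str) -> str:
    """Decode one byte value; undefined bytes become U+FFFD instead of raising."""
    return bytes([value]).decode(charset, errors="replace")


def differing_bytes(charset_a: str, charset_b: str) -> List[int]:
    """List the byte values that the two charsets decode to different characters."""
    return [
        value
        for value in range(256)
        if decode_byte(value, charset_a) != decode_byte(value, charset_b)
    ]


diff = differing_bytes("cp1252", "cp437")

# The ASCII range (0x00-0x7F) is identical; every difference is in the upper half.
print(all(value >= 0x80 for value in diff))

# The same byte means different things depending on the declared charset.
print(decode_byte(0xE9, "cp1252"), decode_byte(0xE9, "cp437"))
```

A byte such as 0xE9 reads as é under CP1252 but as Θ under CP437, so declaring the wrong charset produces valid but incorrect text rather than an obvious error.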
Former Nagios employee
https://www.mcapra.com/
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Received an event that has a different character encodin

Post by dwhitfield »

Thanks @mcapra!

OP, did you have any other questions?
ssoliveira
Posts: 91
Joined: Wed Dec 07, 2016 6:02 pm

Re: Received an event that has a different character encodin

Post by ssoliveira »

Problem solved.

I am now converting the ANSI (ISO 8859-1) characters to UTF-8 in nxlog with the following configuration:

<Extension charconv>
    Module              xm_charconv
    AutodetectCharsets  utf-8, iso8859-1
</Extension>

<Input iisw3c>
    Module      im_file
    File        "C:\Inetpub\Logs\LogFiles\*ex*.log"
    SavePos     TRUE
    Recursive   TRUE
    InputType   LineBased
    # Drop events from files whose path does not contain W3SVC
    Exec if file_name() !~ /W3SVC/ drop();
    # Drop W3C header lines, including one preceded by a UTF-8 BOM
    Exec if $raw_event =~ /^#/ drop();
    Exec if $raw_event =~ /^\xEF\xBB\xBF#/ drop();
    # Convert the raw line from ISO 8859-1 to UTF-8 before parsing
    Exec $raw_event = convert($raw_event, "iso8859-1", "utf-8");

    # ExtIISW3C is assumed to be an xm_csv extension defined elsewhere (not shown in this post)
    Exec ExtIISW3C->parse_csv();
    Exec $FileName = file_name();
    Exec $IIsTime = parsedate($date + " " + $time);
    Exec $Message = $cs_method + " " + $cs_uri_stem;
    Exec delete($SourceModuleType);
</Input>
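For anyone who wants to check offline what the filtering and conversion steps in this input block do to a given line, here is a minimal Python sketch of the same logic. It is only an approximation for testing sample lines, not nxlog code; the function name and the sample inputs are assumptions.

```python
from typing import Optional

UTF8_BOM = b"\xef\xbb\xbf"


def process_line(raw: bytes, source_charset: str = "iso-8859-1") -> Optional[str]:
    """Mimic the drop and convert steps: return None for header lines,
    otherwise return the line decoded from the source charset."""
    # W3C header lines start with '#', optionally preceded by a UTF-8 BOM
    if raw.startswith(b"#") or raw.startswith(UTF8_BOM + b"#"):
        return None
    # Equivalent in spirit to convert($raw_event, "iso8859-1", "utf-8")
    return raw.decode(source_charset)


# Hypothetical sample lines
print(process_line(b"#Fields: date time cs-method cs-uri-stem"))
print(process_line(b"2017-01-01 12:00:00 GET /caf\xe9"))
```

Encoding the returned string with `.encode("utf-8")` gives the bytes that would be safe to hand to a UTF-8 JSON codec.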

You can close this ticket.