nxlog: bug with SavePos on files > 4GB?

_asp_
Posts: 91
Joined: Mon May 23, 2016 4:30 am

nxlog: bug with SavePos on files > 4GB?

Post by _asp_ »

Hi,

I have some log files that grow to 5 to 6 GB by the end of the day.
When I restarted nxlog in the evening, one log was not resumed correctly. It was re-read starting from its 7 o'clock position, when the file was probably about 1 GB in size.
I had to delete the cache to get it to continue normally.

I assume there is an integer overflow, so that a saved position of 5 GB wraps around to about 1 GB.
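
To illustrate the arithmetic behind this guess, here is a minimal sketch in C. It assumes the saved position ends up in a 32-bit field, which is only the hypothesis in this post and not something confirmed here; the helper name `wrap_to_32_bits` and the constant `GIB` are made up for the illustration.

```c
#include <stdint.h>

/* One gibibyte, as a 64-bit value so that multiples above 4 GiB fit. */
static const uint64_t GIB = 1024ULL * 1024ULL * 1024ULL;

/* Hypothetical helper: keep only the low 32 bits of a 64-bit file offset,
 * which is all a 32-bit position field can hold. Converting to an unsigned
 * 32-bit type is well-defined in C and yields offset modulo 2^32. */
static uint32_t wrap_to_32_bits(uint64_t offset)
{
    return (uint32_t) offset;
}
```

Under that assumption, `wrap_to_32_bits(5 * GIB)` gives exactly `1 * GIB`, because 5 GiB minus 4 GiB (2^32 bytes) leaves 1 GiB. That matches the symptom of a 5 GB file being resumed from roughly the 1 GB mark. Offsets below 4 GiB pass through unchanged.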

Is this a known bug? Can you confirm it?

Thanks, Andreas
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: nxlog: bug with SavePos on files > 4GB?

Post by mcapra »

Which module is being used? im_file? I could take a look at the source, though nxlog isn't our product.
Former Nagios employee
https://www.mcapra.com/
_asp_
Posts: 91
Joined: Mon May 23, 2016 4:30 am

Re: nxlog: bug with SavePos on files > 4GB?

Post by _asp_ »

Yes, exactly, im_file. Thanks.
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: nxlog: bug with SavePos on files > 4GB?

Post by dwhitfield »

It *does* look like int overflow, but it is going to take us some time to be more confident in the answer, since that is not our product.

Probably the best thing to do is post at https://nxlog.co/community-forum/im_file and then they can point you to the appropriate way to file a bug.

For the time being, you can have nxlog truncate a file after it's processed. It's generally not the most efficient use of nxlog, but in your case it seems warranted.

Please let us know if you are unable to get a response at nxlog.co and we can see what we can find out. We'll leave this thread open. Thank you for your patience.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: nxlog: bug with SavePos on files > 4GB?

Post by mcapra »

nxlog's im_file module determines its position in a file using the following function from the Apache Portable Runtime:
http://apr.apache.org/docs/apr/1.5/grou ... 1a6aff63a1

I doubt that function is the culprit here. However, this is how nxlog stores the previous offset:

Code:

if ( imconf->savepos == TRUE )
{
    nx_config_cache_set_int(module->name, file->name, (int) file->filepos);
    log_debug("module %s saved position %ld for %s",
              module->name, (long int) file->filepos, file->name);
}

In this snippet, filepos is cast to int before it is passed to the config cache, so only the int value is saved. That's likely where the overflow is happening, if it is happening anywhere. When writing to the config cache, they should probably store the value as an apr_off_t if possible, or at least as a long long. The im_file module seems to use apr_off_t for this value everywhere else internally, just not when it's written to the config cache.
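
As a rough sketch of the difference, the C snippet below compares saving a position through an `(int)` cast with saving it at full 64-bit width. The `cache_set_narrow`, `cache_set_wide`, and `cache_get` functions are hypothetical stand-ins invented for this illustration; they are not the nxlog config cache API, and the sketch assumes a platform where `int` is 32 bits and file offsets are 64 bits.

```c
#include <stdint.h>

/* Hypothetical stand-in for a saved-position cache; NOT the nxlog API. */
static int64_t cached_pos;

/* Mirrors the (int) cast in the snippet above. For offsets that do not fit
 * in an int, the result of the cast is implementation-defined in C; on
 * common platforms with a 32-bit int it keeps only the low 32 bits. */
static void cache_set_narrow(int64_t filepos)
{
    cached_pos = (int) filepos;
}

/* Stores the offset at full 64-bit width, so nothing is lost. */
static void cache_set_wide(int64_t filepos)
{
    cached_pos = filepos;
}

/* Returns whatever position was saved last. */
static int64_t cache_get(void)
{
    return cached_pos;
}
```

With these stand-ins, saving a 5 GiB offset (5368709120) through `cache_set_wide` reads back unchanged, while `cache_set_narrow` cannot preserve it. Small offsets survive either path, which would explain why the problem only shows up once a log file grows large enough.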