Rsyslog: Abandoned Spool Files
Posted: Thu Jul 30, 2015 9:59 am
Hi,
This is a pure rsyslog question, but maybe you ran into this problem before.
I have a disk-assisted queue configured that generally appears to work well. When the connection is interrupted, a .qi file is created, and then numbered spool files (.00000001, .00000002, .00000003, ...) are created as space is needed. When we reconnect, the spool files are processed and removed from the $WorkDirectory, with two exceptions: the .qi file (which remains permanently by design, so this isn't a problem) and the last spool file (e.g. myqueue.00000531), which appears never to have been processed. That is a problem because those are messages that should have been sent to NLS and never made it there. Restarting the rsyslog process appears to clean it up, so in a pinch nightly rsyslog restarts would be a band-aid, but I'm hoping to find a better solution.
Red Hat version
Rsyslog version - not the newest, but one of the later builds from the 7-series and the latest officially shipped for my flavor of RHEL.
Looks like the qi file is updated after the last spool file is created. (Moved my $WorkDirectory here for partition sizing reasons)
rsyslog config file (names/directories changed for privacy):
In my Googling, I found the recover_qi.pl script, but to my understanding it rebuilds the .qi file to work around an rsyslog bug that used to exist where the .qi file goes missing. In my case the .qi file is there; I tried the script anyway and it didn't do anything for me. I also went through the changelogs of later rsyslog versions (later 7 releases, but also 8 releases) looking for a hint that this was a fixed bug, and was unsuccessful there too.
So short of a nightly rsyslog restart, any other ideas?
Thanks.
Red Hat version
Code:
[root@schtwb03 ~]# cat /etc/redhat-release && uname -rms
Red Hat Enterprise Linux Server release 6.6 (Santiago)
Linux 2.6.32-504.1.3.el6.x86_64 x86_64
Code:
[root@schtwb03 ~]# rsyslogd -version
rsyslogd 7.4.10, compiled with:
FEATURE_REGEXP: Yes
FEATURE_LARGEFILE: No
GSSAPI Kerberos 5 support: Yes
FEATURE_DEBUG (debug build, slow code): No
32bit Atomic operations supported: Yes
64bit Atomic operations supported: Yes
Runtime Instrumentation (slow code): No
uuid support: Yes
Looks like the qi file is updated after the last spool file is created. (Moved my $WorkDirectory here for partition sizing reasons)
Code:
[root@schtwb03 ~]# ls -l /home/logs | grep -v .log
total 884
-rw------- 1 root adm 486012 Jul 29 18:55 iceeumt.00000123
-rw------- 1 root adm 495 Jul 29 18:57 iceeumt.qi
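A leftover like the one in the listing above could be spotted automatically rather than by eye. The following is only a rough sketch, not anything rsyslog ships: the function name `find_stale_spool_files` and the one-hour age threshold are made up for illustration, and it assumes spool files are always named `<queue file name>.<8 digits>`, as they are in this post.

```python
import os
import re
import time

# Assumption: spool files look like "<queue file name>.<8 digits>",
# e.g. iceeumt.00000123. The .qi file and *.log files do not match.
SPOOL_RE = re.compile(r"^.+\.\d{8}$")


def find_stale_spool_files(work_dir, min_age_seconds=3600, now=None):
    """Return (path, age_in_seconds) for spool files not modified recently.

    min_age_seconds is an arbitrary illustrative threshold, not an rsyslog
    setting.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(work_dir)):
        if not SPOOL_RE.match(name):
            continue  # skip .qi files, logs, imfile state files
        path = os.path.join(work_dir, name)
        if not os.path.isfile(path):
            continue
        age = now - os.path.getmtime(path)
        if age >= min_age_seconds:
            stale.append((path, age))
    return stale
```

Calling `find_stale_spool_files("/home/logs")` from a cron job would list any spool file that has sat untouched for an hour or more. It only reports the files; it does not tell you whether the messages inside were delivered.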
Code:
[root@schtwb03 ~]
$ModLoad imfile
$InputFilePollInterval 10
$PrivDropToGroup adm
$WorkDirectory /home/logs
# Input for ice_eu_mt
$InputFileName /home/log/ice_eu_mt.log
$InputFileTag ice_eu_mt:
$InputFileStateFile nls-state-home_log_ice_eu_mt.log # Must be unique for each file being polled
# Uncomment the following line to override the default severity for messages
# from this file.
#$InputFileSeverity info
$InputFilePersistStateInterval 20000
$InputRunFileMonitor
# Forward to Nagios Log Server and then discard, otherwise these messages
# will end up in the syslog file (/var/log/messages) unless there are other
# overriding rules.
#Buffer Settings
$ActionResumeInterval 10
$ActionQueueSize 100000
$ActionQueueDiscardMark 97500
$ActionQueueHighWaterMark 80000
$ActionQueueType LinkedList
$ActionQueueFileName iceeumt
$ActionQueueCheckpointInterval 100
$ActionQueueMaxDiskSpace 500m
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
$ActionQueueTimeoutEnqueue 0
$ActionQueueDiscardSeverity 0
if $programname == 'ice_eu_mt' then @@nls:5544
if $programname == 'ice_eu_mt' then ~
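To make the queue numbers above easier to compare at a glance, here is a small sketch that pulls the legacy `$Directive value` lines into a dictionary and checks the ordering of the thresholds. The helper names `parse_legacy_directives` and `check_queue_marks` are hypothetical, and the check assumes the intent is high-water mark < discard mark < queue size, which is what the values in this config follow. It is a plain text parser and does not reproduce how rsyslog itself reads its config.

```python
def parse_legacy_directives(text):
    """Map '$Name value' lines to {name: value}; later lines override earlier ones."""
    settings = {}
    for raw in text.splitlines():
        # Drop full-line and trailing comments. Simplification: a value that
        # itself contains '#' would be cut short.
        line = raw.split("#", 1)[0].strip()
        if not line.startswith("$"):
            continue
        parts = line[1:].split(None, 1)
        if not parts:
            continue
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def check_queue_marks(settings):
    """True if high-water mark < discard mark < queue size (assumed intent)."""
    size = int(settings["ActionQueueSize"])
    high = int(settings["ActionQueueHighWaterMark"])
    discard = int(settings["ActionQueueDiscardMark"])
    return high < discard < size


# Values copied from the config in this post.
SAMPLE = """\
$ActionQueueSize 100000
$ActionQueueDiscardMark 97500
$ActionQueueHighWaterMark 80000
"""

settings = parse_legacy_directives(SAMPLE)
print(settings)
print("marks ordered as assumed:", check_queue_marks(settings))
```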