Re: [Nagios-devel] SIGXFSZ causes nagios to exit silently with

Post by Guest

John Rouillard wrote:
> Hi all:
>
> I am seeing the top level nagios daemon exiting shortly after startup
> (after its first few scheduled service checks are started). When it
> exits it doesn't log anything, nor does it clear out the status files to
> indicate to the web interface that it has exited.
>
> When run under gdb I see:
>
> Program received signal SIGXFSZ, File size limit exceeded.
> (gdb) where
> #0 0x0060a7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1 0x006dc11b in __write_nocancel () from /lib/tls/libc.so.6
> #2 0x0068109f in _IO_new_file_write () from /lib/tls/libc.so.6
> #3 0x0067fafb in _IO_new_do_write () from /lib/tls/libc.so.6
> #4 0x006807a2 in _IO_new_file_sync () from /lib/tls/libc.so.6
> #5 0x00675af2 in fflush () from /lib/tls/libc.so.6
> #6 0x0808f8d9 in xpddefault_update_service_performance_data_file (
> svc=0x9da19d0) at ../xdata/xpddefault.c:677
> #7 0x0808f8fc in xpddefault_update_service_performance_data (svc=0x9da19d0)
> at ../xdata/xpddefault.c:403
> #8 0x0808e8a1 in update_service_performance_data (svc=0x9da19d0)
> at perfdata.c:91
> #9 0x08057b78 in reap_service_checks () at checks.c:1415
> #10 0x08063790 in handle_timed_event (event=0x9a41ca0) at events.c:1255
> #11 0x08063e51 in event_execution_loop () at events.c:966
> #12 0x08053ad5 in main (argc=2, argv=0xbfeead04) at nagios.c:715
>
> Now I am hitting the 2GB limit on the service perfdata file:
>
> [rouilj@ops01 ~]$ ls -lh /var/spool/nagios/tmp/service-perfdata
> -rw-rw-r-- 1 nagios nagios 2.0G Jun 2 09:21 /var/spool/nagios/tmp/service-perfdata
>
> (exact size 2147483647 bytes). The file size ulimit on the process is
> unlimited.
> [rouilj@ops01 ~]$ ulimit -a
> core file size (blocks, -c) 0
> data seg size (kbytes, -d) unlimited
> file size (blocks, -f) unlimited
> pending signals (-i) 1024
> max locked memory (kbytes, -l) 32
> max memory size (kbytes, -m) unlimited
> open files (-n) 1024
> pipe size (512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> stack size (kbytes, -s) 10240
> cpu time (seconds, -t) unlimited
> max user processes (-u) 73728
> virtual memory (kbytes, -v) unlimited
> file locks (-x) unlimited
>
> It's a 32-bit i686 kernel. uname -a reports:
>
> Linux ops01.renesys.com 2.6.9-42.0.10.ELsmp #1 SMP Tue Feb 27 10:11:19
> EST 2007 i686 i686 i386 GNU/Linux
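
For what it's worth, 2147483647 is 2^31 - 1, the largest offset a signed 32-bit off_t can hold. On 32-bit Linux, a file opened without large-file support can raise SIGXFSZ at that offset even when ulimit -f is unlimited, which would match the symptoms above. A minimal sketch of the arithmetic, assuming glibc on Linux; the helper names are made up for illustration and nothing here is taken from the nagios source:

```c
/* Assumption: glibc on Linux. Defining _FILE_OFFSET_BITS=64 before any
 * include makes off_t 64 bits wide even in a 32-bit build, so stdio and
 * write() can go past the 2 GB mark. */
#define _FILE_OFFSET_BITS 64

#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: largest file offset that a signed integer type
 * of the given width (in bits) can represent. */
static int64_t max_offset_for_bits(int bits)
{
    return (int64_t)((UINT64_C(1) << (bits - 1)) - 1);
}

/* Hypothetical helper: width of off_t in this build, in bits. */
static int off_t_bits(void)
{
    return (int)(sizeof(off_t) * 8);
}
```

max_offset_for_bits(32) gives 2147483647, the exact size reported for the service-perfdata file. Whether rebuilding nagios with 64-bit offsets is practical on this system is a separate question; the sketch only shows where the number comes from.
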
>
> I think nagios can handle this case better by:
>
> 1) Trapping the SIGXFSZ signal so it doesn't exit
> 2) Logging an error to nagios.log
> 3) (Scheduling a) close and reopen of host_perfdata_file and
> service_perfdata_file, allowing the user to rotate the file on command,
> or re-enable perfdata logging by moving the files aside and
> having nagios recreate the files.
>
> 3 is kind of a hack, but there is no signal currently that closes and
> reopens the output files (host_perfdata_file, service_perfdata_file)
> without resetting all of the nagios daemon's internal state. With 3
> implemented, it is possible to rotate these files without resetting
> nagios's internal state (current scheduled services queue for example)
> on user demand.
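
The three steps above could be sketched roughly as follows. This is only an illustration under assumptions, not a patch against the nagios source: the function and variable names are hypothetical, and the log stream is passed in rather than using nagios's own logging routines. The handler only sets a flag; the logging and reopen happen later from the main loop.

```c
/* Needed for sigaction() when compiling in a strict ISO C mode. */
#define _XOPEN_SOURCE 700

#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Set by the signal handler; nothing else is done in signal context. */
static volatile sig_atomic_t perfdata_file_full = 0;

static void on_sigxfsz(int sig)
{
    (void)sig;
    perfdata_file_full = 1;
}

/* Step 1: trap SIGXFSZ so the default action (terminate) is not taken.
 * With a handler installed, the failing write returns EFBIG instead.
 * Returns 0 on success, -1 on failure. */
static int install_sigxfsz_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigxfsz;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGXFSZ, &sa, NULL);
}

/* Steps 2 and 3: meant to be called from the main event loop.
 * If the flag is set, log a warning, close the old stream and reopen
 * the path in append mode. The user is expected to have moved the full
 * file aside; otherwise the reopened file is just as full as before.
 * Returns the stream to use from now on (NULL if the reopen failed). */
static FILE *handle_full_perfdata_file(FILE *fp, const char *path, FILE *log)
{
    if (!perfdata_file_full)
        return fp;

    fprintf(log, "Warning: perfdata file '%s' hit the file size limit; "
                 "closing and reopening it\n", path);
    if (fp != NULL)
        fclose(fp);          /* may raise SIGXFSZ once more while flushing */
    perfdata_file_full = 0;  /* so clear the flag only after the close */
    return fopen(path, "a");
}
```

Keeping the handler down to a single flag assignment avoids calling stdio from signal context, which is not async-signal-safe.
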
>
> Alternatively the log rotation mechanism currently available for the
> main log file (nagios.log) could be extended to automatically rotate
> and archive these files. I would be happy if all the files were
> rotated/archived on the same schedule as the main log file, but people
> will probably want the following options in nagios.cfg:
>
> host_perfdata_rotation_method, service_perfdata_rotation_method:
> no rotation, hourly, daily, weekly, monthly.
>
> host_perfdata_archive_path, service_perfdata_archive_path:
> move host_perfdata_file, service_perfdata_file to the archive
> directory with a timestamped extension similar to nagios log file.
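
A rough sketch of the archive step described above, assuming the same MM-DD-YYYY-HH suffix that archived copies of nagios.log carry. The helper names and the layout of the archive directory are assumptions for illustration; none of the proposed nagios.cfg options exist yet.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: build "<archive_dir>/<base>-MM-DD-YYYY-HH.log"
 * from a broken-down time. Returns 0 on success, -1 if the name does
 * not fit in the output buffer. */
static int build_archive_name(char *out, size_t outlen,
                              const char *archive_dir, const char *base,
                              const struct tm *t)
{
    int n = snprintf(out, outlen, "%s/%s-%02d-%02d-%d-%02d.log",
                     archive_dir, base,
                     t->tm_mon + 1, t->tm_mday, t->tm_year + 1900,
                     t->tm_hour);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Hypothetical helper: move the live perfdata file into the archive
 * directory under a timestamped name. rename() only works within one
 * filesystem, so the archive directory is assumed to share a filesystem
 * with the live file. Returns 0 on success, -1 on failure. */
static int archive_perfdata_file(const char *live_path,
                                 const char *archive_dir,
                                 const char *base, time_t now)
{
    char dest[1024];
    struct tm *t = localtime(&now);

    if (t == NULL)
        return -1;
    if (build_archive_name(dest, sizeof(dest), archive_dir, base, t) != 0)
        return -1;
    return rename(live_path, dest);
}
```

After the rename, the daemon would still need to close and reopen its file handle (as in item 3 above) so that new perfdata goes to a fresh file rather than the archived one.
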
>
> Now this does bring up an interesting question, does anybody have

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]