Re: [Nagios-devel] SEGV in 2.0b2 (FreeBSD 4.10/200 hosts/330 active/300 passive) - repeatedly after 2-7 days running.

Thanks for the note, Stanley. If you can manage to get a core file or
track the problem down further, let me know. I'm releasing 2.0b3
tonight, so this probably won't be fixed until 2.0b4.


On 2 Apr 2005 at 19:53, Stanley Hopcroft wrote:

> Dear Folks,
>
> I am writing to report what may be a problem with Nagios 2.0b2 (embedded
> Perl, pthread lib, FreeBSD 4.10).
>
> Nagios runs no more than 10 days before dying with a SEGV.
>
> Like a former report of SEGVs ('coredumps in wobbly
> networks'/Ericsson/24 Mar 2005) there _may_ be a pattern in the logged
> messages before the SEGV.
>
> Exiting from scheduled downtime appears to be a health hazard.
>
> In the last case,
>
> Sat Apr 02 17:05:42 SERVICE DOWNTIME ALERT: foo:bar via the blurfl
> provider infrastructure;STOPPED; Service has exited from a period of
> scheduled downtime
>
> Sat Apr 02 17:06:18 Auto-save of retention data completed successfully.
>
> Sat Apr 02 18:07:33 Nagios 2.0b2 starting... (PID=97771)
>
> tsitc> grep nagios /var/log/messages
> Apr 2 17:07:52 tsitc /kernel: pid 3400 (nagios), uid 1000: exited on
> signal 11
>
> And the one before,
>
> Tue Mar 29 06:20:58 SERVICE ALERT: nada;TEC CPU;WARNING;HARD;1;The
> percentage of CPU in idle state is low. This indicates high CPU
> overload. date: 03/29/2005 06:20:50 AM eventid: 1112041070 557
> modelname: DMXCpu name: total percidlecpu: 0 profilename:
> ITM.OS.Unix_Dev_Monitoring.itm#IPAustralia-region source: TMNT status:
> OPEN
>
> Tue Mar 29 06:30:44 SERVICE DOWNTIME ALERT: yada;Standard host-centric
> checks;STOPPED; Service has exited from a period of scheduled downtime
>
> Tue Mar 29 06:30:44 SERVICE DOWNTIME ALERT: wurfl;COMS ad-hoc
> check;STOPPED; Service has exited from a period of scheduled downtime
>
> Tue Mar 29 06:30:44 HOST DOWNTIME ALERT: yada;STOPPED; Host has exited
> from a period of scheduled downtime
>
> Tue Mar 29 09:11:27 Nagios 2.0b2 starting... (PID=5473)
>
> tsitc> grep nagios /var/log/messages
> Mar 29 06:30:44 tsitc /kernel: pid 31467 (nagios), uid 1000: exited on
> signal 11
>
> Obviously it is easy to check whether scheduling downtime is causal; I
> will give it a go and watch.
>
> No core file.
>
> Yours sincerely.
>
> --
> Stanley Hopcroft
>
> IP Australia
> Ph: (02) 6283 3189 Fax: (02) 6281 1353
> PO Box 200 Woden ACT 2606
> http://www.ipaustralia.gov.au
>
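
The pattern suggested in the report above (a downtime exit logged shortly
before each crash) could be checked across a whole log with something like
the following. This is only a rough sketch and not part of the original
exchange: the function name, the window size and the two regular
expressions are assumptions based on the log excerpts quoted above, and a
"starting..." entry is treated as a stand-in for a preceding crash, which
will also match deliberate restarts.

```python
import re

# Assumption: restart and downtime-exit entries look like the excerpts
# quoted in the report; adjust the patterns for other log formats.
RESTART = re.compile(r"Nagios \S+ starting\.\.\.")
DOWNTIME_EXIT = re.compile(r"(SERVICE|HOST) DOWNTIME ALERT: .*;STOPPED;")


def downtime_before_restarts(lines, window=5):
    """For each restart entry, report whether a downtime exit was logged
    among the `window` entries immediately preceding it."""
    results = []
    for i, line in enumerate(lines):
        if RESTART.search(line):
            preceding = lines[max(0, i - window):i]
            hit = any(DOWNTIME_EXIT.search(p) for p in preceding)
            results.append((line.strip(), hit))
    return results


if __name__ == "__main__":
    sample = [
        "Tue Mar 29 06:30:44 SERVICE DOWNTIME ALERT: wurfl;COMS ad-hoc "
        "check;STOPPED; Service has exited from a period of scheduled downtime",
        "Tue Mar 29 06:30:44 HOST DOWNTIME ALERT: yada;STOPPED; Host has "
        "exited from a period of scheduled downtime",
        "Tue Mar 29 09:11:27 Nagios 2.0b2 starting... (PID=5473)",
    ]
    for entry, hit in downtime_before_restarts(sample):
        label = "downtime exit before: " if hit else "no downtime exit before: "
        print(label + entry)
```

A restart with no downtime exit in the preceding window would count as
evidence against the theory; a match only shows correlation, so a core
file is still the more useful thing to capture.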



Ethan Galstad,
Nagios Developer
---
Email: [email protected]
Website: http://www.nagios.org






This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]