Re: [Nagios-devel] SEGV in 2.0b2 (FreeBSD 4.10/200 hosts/330 active/300 passive) - repeatedly after 2-7 days running.

Post by **Guest** » Sun Apr 03, 2005 8:02 pm

Thanks for the note Stanley. If you can manage to get a core file or
track the problem down further, let me know. I'm releasing 2.0b3
tonight, so this probably won't be fixed until 2.0b4.

On 2 Apr 2005 at 19:53, Stanley Hopcroft wrote:

> Dear Folks,
>
> I am writing to report what may be a problem with Nagios 2.0b2 (embedded
> Perl, pthread lib, FreeBSD 4.10).
>
> Nagios runs no more than 10 days before dying with a SEGV.
>
> Like a former report of SEGVs ('coredumps in wobbly
> networks'/Ericsson/24 Mar 2005) there _may_ be a pattern in the logged
> messages before the SEGV.
>
> Exiting from scheduled downtime appears to be a health hazard.
>
> In the last case,
>
> Sat Apr 02 17:05:42 SERVICE DOWNTIME ALERT: foo:bar via the blurfl
> provider infrastructure;STOPPED; Service has exited from a period of
> scheduled downtime
>
> Sat Apr 02 17:06:18 Auto-save of retention data completed
> successfully.
>
> Sat Apr 02 18:07:33 Nagios 2.0b2 starting... (PID=97771)
>
> tsitc> grep nagios /var/log/messages
> Apr 2 17:07:52 tsitc /kernel: pid 3400 (nagios), uid 1000: exited on
> signal 11
>
> And the one before,
>
> Tue Mar 29 06:20:58 SERVICE ALERT: nada;TEC CPU;WARNING;HARD;1;The
> percentage of CPU in idle state is low. This indicates high CPU
> overload. date: 03/29/2005 06:20:50 AM eventid: 1112041070 557
> modelname: DMXCpu name: total percidlecpu: 0 profilename:
> ITM.OS.Unix_Dev_Monitoring.itm#IPAustralia-region source: TMNT status:
> OPEN
>
> Tue Mar 29 06:30:44 SERVICE DOWNTIME ALERT: yada;Standard host-centric
> checks;STOPPED; Service has exited from a period of scheduled downtime
>
> Tue Mar 29 06:30:44 SERVICE DOWNTIME ALERT: wurfl;COMS ad-hoc
> check;STOPPED; Service has exited from a period of scheduled downtime
>
> Tue Mar 29 06:30:44 HOST DOWNTIME ALERT: yada;STOPPED; Host has exited
> from a period of scheduled downtime
>
> Tue Mar 29 09:11:27 Nagios 2.0b2 starting... (PID=5473)
>
> tsitc> grep nagios /var/log/messages
> Mar 29 06:30:44 tsitc /kernel: pid 31467 (nagios), uid 1000: exited on
> signal 11
>
> Obviously it is easy to check whether scheduling downtime is causal; I
> will give it a go and watch.
>
> No core file.
>
> Yours sincerely.
>
> --
> Stanley Hopcroft
>
> IP Australia
> Ph: (02) 6283 3189 Fax: (02) 6281 1353
> PO Box 200 Woden ACT 2606
> http://www.ipaustralia.gov.au
>
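
One rough way to test the pattern Stanley describes (a downtime exit logged shortly before each restart) is to scan the log for it. The sketch below is only an illustration under stated assumptions: it expects one entry per line in the human-readable form quoted above (not wrapped as in the email), takes the year as a parameter because the excerpts omit it, and uses the next "starting..." entry as a loose stand-in for the crash, since the SEGV itself is not logged by Nagios. The function names and the three-hour window are hypothetical choices, not anything from Nagios.

```python
import re
from datetime import datetime, timedelta

# Matches entries such as "Sat Apr 02 17:05:42 <message>".
# The weekday is skipped; the year is supplied by the caller (assumption).
STAMP = re.compile(r"^\w{3} (\w{3} \d{2} \d{2}:\d{2}:\d{2}) (.*)$")


def parse(line, year):
    """Return (timestamp, message) for a log entry, or None if it does not match."""
    m = STAMP.match(line.strip())
    if not m:
        return None
    when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return when, m.group(2)


def downtime_exits_before_restarts(lines, year, window=timedelta(hours=3)):
    """For each 'starting...' entry, collect downtime-exit entries that were
    logged within `window` before it and after the previous start."""
    results = []
    recent_exits = []
    for line in lines:
        parsed = parse(line, year)
        if parsed is None:
            continue  # continuation lines or unrelated text
        when, msg = parsed
        if "DOWNTIME ALERT:" in msg and ";STOPPED;" in msg:
            recent_exits.append((when, msg))
        elif msg.startswith("Nagios") and "starting..." in msg:
            close = [(t, m) for t, m in recent_exits if when - t <= window]
            results.append((when, close))
            recent_exits = []  # only count exits since the last start
    return results
```

A restart with an empty list next to it would count against the downtime theory; restarts that are consistently preceded by a downtime exit would support it, though not prove it, because the restart time depends on when someone noticed the daemon was gone.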

Ethan Galstad,
Nagios Developer
---
Email: [email protected]
Website: http://www.nagios.org

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]