[Nagios-devel] Possible Bugs in Nagios configfile parsing

Posted: Wed Oct 20, 2004 4:36 am
by Guest

Greetings,

I recently started migrating to SMS notifications and am therefore
testing escalations.
While doing so I discovered several possible bugs in the config file parsing.

I am using a CVS snapshot of nagios 2.0 dated from 08.10.2004.
config.c hasn't been modified since then according to the CVS.

Under certain circumstances nagios ends up in an endless loop in
pre_flight_check().

Everything runs fine unless I put any kind of escalation into the
config files.

example:

define hostescalation {
host_name PDC01
first_notification 1
last_notification 1
notification_interval 60
contact_groups SMS-Alarm
}

Now running nagios -v nagios.cfg:

[root@SRV00032 etc]# ../bin/nagioscheck

Nagios 2.0a1
Copyright (c) 1999-2004 Ethan Galstad ([email protected])
Last Modified: 11-18-2003
License: GPL

Reading configuration data...

Running pre-flight check on configuration data...

Checking services...
Checked 253 services.
Checking hosts...
Warning: Host 'ABIT-DMZ_switch' has no services associated with it!
Warning: Host 'RECHT.NET-DMZ_switch' has no services associated with it!
Checked 106 hosts.
Checking host groups...
Checked 35 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Checked 9 contacts.
Checking contact groups...
Checked 7 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...

Then nagios hangs with 99.9% cpu load.
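
A hang like this is easier to report and re-test when the verification run is wrapped in a timeout. The following is only a minimal sketch, not part of Nagios: the helper name `run_with_timeout` and the 30 second default are assumptions made for illustration. It runs a command, kills it if it does not finish in time, and returns the last line of output that reached the file.

```python
import subprocess
import sys
import tempfile


def run_with_timeout(cmd, timeout_s=30.0):
    """Run cmd and return (hung, last_line).

    hung is True if the process had to be killed after timeout_s seconds.
    last_line is the last non-blank line of combined stdout/stderr.
    Caveat: a program that buffers its output may lose unflushed text
    when it is killed, so last_line can lag behind the real position.
    """
    with tempfile.TemporaryFile() as out:
        proc = subprocess.Popen(cmd, stdout=out, stderr=subprocess.STDOUT)
        try:
            proc.wait(timeout=timeout_s)
            hung = False
        except subprocess.TimeoutExpired:
            proc.kill()   # stop the spinning process
            proc.wait()   # reap it so no zombie is left behind
            hung = True
        out.seek(0)
        # The original mail was iso-8859-1; this decoding never raises.
        text = out.read().decode("iso-8859-1")
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return hung, (lines[-1] if lines else "")


# Hypothetical usage (binary and config paths are assumptions):
#   hung, last = run_with_timeout(["../bin/nagios", "-v", "nagios.cfg"])
#   print("hung after:" if hung else "finished at:", last)
```

If the output is flushed, the returned line names the check stage that never completed, which is the same information as the truncated transcript above.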

There is another interesting anomaly when I tried using escalations.

I had an unused contactgroup called SMS-Test. When that contactgroup
was activated, nagios -v nagios.cfg output:

[root@SRV00032 etc]# ../bin/nagioscheck

Nagios 2.0a1
Copyright (c) 1999-2004 Ethan Galstad ([email protected])
Last Modified: 11-18-2003
License: GPL

Reading configuration data...

Running pre-flight check on configuration data...

Checking services...
Checked 253 services.
Checking hosts...
Warning: Host 'ABIT-DMZ_switch' has no services associated with it!
Warning: Host 'RECHT.NET-DMZ_switch' has no services associated with it!
Checked 106 hosts.
Checking host groups...
Checked 35 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Checked 9 contacts.
Checking contact groups...

And nagios hangs again with 99.9% cpu load.
It didn't even get to checking the host escalations; for some reason it
already hangs at the contact groups (contactgroups.cfg).

This leads me to the conclusion that checking the references relating to
contacts has an error that can lead to an endless loop in
pre_flight_check(). I took a quick look into config.c, but the problem
hasn't jumped out at me yet.
The code is quite... strange ;-)
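
One way to narrow this down without touching config.c could be to cross-check the contact group references outside of Nagios. The sketch below is not how Nagios parses its files. It assumes plain `define type { ... }` blocks without templates or `use` inheritance, and the function names are made up for illustration. It only lists escalations whose `contact_groups` entries have no matching `contactgroup_name` definition.

```python
import re

# Simplified assumption: a block ends at the first closing brace.
DEFINE_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)


def parse_objects(text):
    """Return a list of (object_type, {directive: value}) tuples."""
    objects = []
    for obj_type, body in DEFINE_RE.findall(text):
        attrs = {}
        for line in body.splitlines():
            line = line.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            attrs[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append((obj_type, attrs))
    return objects


def undefined_contact_groups(objects):
    """List (type, host_name, group) for escalations naming unknown groups."""
    defined = {
        attrs.get("contactgroup_name")
        for obj_type, attrs in objects
        if obj_type == "contactgroup"
    }
    missing = []
    for obj_type, attrs in objects:
        if obj_type not in ("hostescalation", "serviceescalation"):
            continue
        for group in attrs.get("contact_groups", "").split(","):
            group = group.strip()
            if group and group not in defined:
                missing.append((obj_type, attrs.get("host_name", "?"), group))
    return missing
```

An empty result would not prove the configuration is fine; it would only rule out a dangling contact group name as the trigger.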

The thing is, I can't believe this fails for everyone, because I have
never seen anyone mention it. Therefore some circumstance specific to my
setup must be provoking this problem.

I'll try to put some more debugging output into config.c so I can see
where exactly it hangs;
I'm not in the mood for exhaustive gdb sessions...

Since I am using cfg_dir directives in nagios.cfg, with a single cfg file
for each host, it's kinda complicated to post those to the list.
Especially because publishing them exposes all critical information
about those systems: internal IPs, services and purposes.
And I don't feel like editing hundreds of files...
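
Editing the files by hand might be avoidable with a small script that swaps sensitive values for consistent placeholders. This is a rough sketch under stated assumptions: the `Sanitizer` class is hypothetical, it only rewrites IPv4 addresses and the values of `host_name` lines, and it does not touch other directives that can carry host names (for example `alias`, `parents` or `members`) or inline comments.

```python
import ipaddress
import re

IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
HOST_NAME_RE = re.compile(r"^([ \t]*host_name[ \t]+)(\S.*?)[ \t]*$", re.MULTILINE)


class Sanitizer:
    """Replace host names and IPv4 addresses with stable placeholders."""

    def __init__(self):
        self.ips = {}    # real address -> placeholder address
        self.hosts = {}  # real host name -> placeholder name

    def _ip(self, match):
        real = match.group(0)
        if real not in self.ips:
            # Placeholders are handed out from 10.0.0.1 upwards.
            fake = ipaddress.IPv4Address("10.0.0.0") + (len(self.ips) + 1)
            self.ips[real] = str(fake)
        return self.ips[real]

    def _hosts(self, match):
        # host_name may hold a comma-separated list of names.
        names = [n.strip() for n in match.group(2).split(",")]
        fake = [self.hosts.setdefault(n, f"host{len(self.hosts) + 1}") for n in names]
        return match.group(1) + ",".join(fake)

    def sanitize(self, text):
        text = HOST_NAME_RE.sub(self._hosts, text)
        return IPV4_RE.sub(self._ip, text)
```

Using one `Sanitizer` instance for every file in the cfg_dir keeps the mapping consistent, so the same real host gets the same placeholder in host, service and escalation definitions and the references between them stay intact.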

Thanks for reading that far ;)

sash

--------------------------------------------------
Sascha Runschke
Netzwerk Administration
IT-Services

ABIT AG
Robert-Bosch-Str. 1
40668 Meerbusch

Tel.:+49 (0) 2150.9153.226
mailto:[email protected]

http://www.abit.net
http://www.abit-epos.net
http://www.my-academy.net
--------------------------------------------------
The content of this email and its attachments are

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]