Re: [Nagios-devel] nagios 3 host checks logic problem
Posted: Fri Sep 21, 2007 2:52 am
Dear List,
today I installed the new CVS version to get rid of the host check logic problem
and the high CPU load.
I can confirm that the load is in a normal range now, but after installing CVS
something went terribly wrong with host and service checks.
Many checks return a critical result even though the checked system or service
is up. (I compared with a second Nagios server running 2.9; everything was
OK there.)
I executed the check commands manually from the command line and received
correct values and an OK state, while Nagios reported them as critical.
Because of this I switched back to Nagios 3.0b3. All services and hosts
returned to a normal state, but of course the CPU load is high again now.
Best regards
Thomas
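For anyone who wants to repeat the manual comparison described in the message above, here is a minimal sketch in Python. It runs a plugin command line and maps its exit code to the standard plugin states (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The helper name `run_check` and the 10-second timeout are assumptions made for illustration; they are not part of Nagios.

```python
import subprocess

# Standard Nagios plugin exit codes and the states they stand for.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(argv, timeout=10):
    """Run a plugin command line (hypothetical helper, not a Nagios API).

    Returns a tuple (state_name, first_line_of_plugin_output).
    """
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return "UNKNOWN", "check timed out"
    except OSError as exc:
        # Raised when the plugin is missing or not executable.
        return "UNKNOWN", f"could not execute check: {exc}"

    # Any exit code outside 0-3 is treated as UNKNOWN in this sketch.
    state = STATES.get(proc.returncode, "UNKNOWN")
    lines = proc.stdout.strip().splitlines()
    return state, (lines[0] if lines else "")
```

A call might look like `run_check(["/usr/local/nagios/libexec/check_ping", "-H", "192.0.2.10", "-w", "100,20%", "-c", "500,60%"])`, where the plugin path and the address are placeholders. Running it as the same user as the Nagios daemon is probably the fairer comparison, since the environment and permissions can differ from an interactive shell.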
From: Ethan Galstad
Sent by: [email protected]
Date: 20.09.2007 23:20
Please respond to: [email protected]; Nagios Developers List
To: Nagios Developers List
Subject: Re: [Nagios-devel] nagios 3 host checks logic problem on some kernels/distros
Thanks all - I found the cause of the problem and fixed it. A patch
will be in CVS shortly.
Thomas Stolle wrote:
>
> From: SCHAER Frederic cea.fr>
> Subject: nagios 3 host checks logic problem on some kernels/distros
> Newsgroups: gmane.network.nagios.devel
> Date: 2007-09-10 16:17:30 GMT
>
> Hi,
>
> I think I identified a problem (but not the solution) in the Nagios 3
> source tree...
>
> I tried with both the 3.0b3 and CVS HEAD source files and could not get
> rid of the problem.
>
> I'm running a 2.4.21 kernel on a RHEL3 box.
>
> What happens is that as soon as I start Nagios 3, it starts eating all
> of the CPU.
>
[snip]
>
> I have 53 hosts defined. I don't understand why Nagios keeps checking the
> same host over and over, and why this is not happening on all systems.
>
> De-activating host checks magically "solves" the problem.
>
> I just found out that commenting out the hosts' "check_command" caused this
> behaviour (with host_checks_enabled=true), and that defining a correct
> check_command prevented Nagios from being so CPU hungry...
>
> Hope I helped...
>
> Cheers
>
> Dear List,
>
> I can confirm the problem Frederic reported.
> I am using Nagios 3.0b3 on CentOS 4.4.
> After starting Nagios, the process takes nearly 100% CPU (see
> top output below).
> Disabling host checks lets the process return to normal values.
> As far as I can remember, the problem did not occur with Nagios 3.0a
> (but I cannot verify this at the moment).
>
> Tasks: 89 total, 3 running, 86 sleeping, 0 stopped, 0 zombie
> Cpu(s): 26.0% us, 1.3% sy, 0.0% ni, 72.6% id, 0.0% wa, 0.1% hi, 0.0% si
> Mem: 4041580k total, 1373844k used, 2667736k free, 60200k buffers
> Swap: 4192956k total, 0k used, 4192956k free, 1137348k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 28617 nagios    25   0 29756  10m 1056 R   96  0.3  17:12.48 nagios
>     1 root      16   0  4752  552  460 S    0  0.0   0:02.75 init
>     2 root      RT   0     0    0    0 S    0  0.0   0:00.04 migration/0
>
> Th
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]