[email protected] wrote on 09.07.2008 12:04:01:
> after debugging some retry problems, I stumbled over strange entries
> in the showlog.cgi (see attached screenshot). This is the daily log.
> At 00:00 it rotates and logs all initial states of the
> machines/services - even though I disabled it in the nagios.cfg.
> Though I do not really mind that. The strange thing is the empty
> entries of 01.01.1970 _after_ the initial states, then some current
> entries, then again 01.01.1970, then again current states, and then
> nagios logs ALL the initial states _again_.
>
> Not that it's causing any obvious problems, but it looks kinda
> strange in my opinion.
> Does anyone have an idea what may cause this behaviour?
Self-replies rock.
I'll CC this to nagios-devel since it's a bug. I'm not yet sure
where it actually comes from, but I guess it's inside process_macros().
Still trying to pin it down exactly...
Back to my initial problem (see above).
After some debugging I found the culprit: my service definitions
contain quoted $ signs. Nagios stumbles over them and logs nonsense
in the current state logging after log rotation, and every other time
it processes macros for this server (see below).
This is an excerpt from nagios.log (notice the lines with a single $ on
them):
[1215727200] CURRENT SERVICE STATE: esskhk01;MSSQLSERVER;OK;HARD;1;OK: MSSQLSERVER: started
[1215727200] CURRENT SERVICE STATE: esskhk01;MSSQL_PERBIT;OK;HARD;1;OK: MSSQL$PERBIT: started
$
[1215727200] CURRENT SERVICE STATE: esskhk01;MSSQL_PERSONAL;OK;HARD;1;OK: MSSQL$PERSONAL: started
$
[1215727200] CURRENT SERVICE STATE: esskhk01;NSClient_Version;OK;HARD;1;I (0.3.0.4 2007-12-04) seem to be doing fine...
[1215727200] CURRENT SERVICE STATE: esskhk01;SQLAgent_PERBIT;OK;HARD;1;OK: SQLAgent$PERBIT: started
$
[1215727200] CURRENT SERVICE STATE: esskhk01;SQLAgent_PERSONAL;OK;HARD;1;OK: SQLAgent$PERSONAL: started
$
[1215727200] CURRENT SERVICE STATE: esskhk01;SQLSERVERAGENT;OK;HARD;1;OK: SQLSERVERAGENT: started
Those "$" lines show up as "[01-01-1970 01:00:00] " with empty info on
them in the showlog.cgi (see below).
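A rough Python sketch of why a bare "$" line could end up displayed that way. This is not the showlog.cgi code; it only assumes that a log reader expects "[timestamp] text" and falls back to timestamp 0 and empty text when a line does not match. The function names and the UTC+1 offset (inferred from the 01:00:00 in the display) are my own assumptions.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed log line shape: "[<epoch seconds>] <text>"
LINE_RE = re.compile(r"^\[(\d+)\] (.*)$")

def parse_log_line(line):
    """Return (timestamp, text); hypothetical fallback is (0, "") on no match."""
    m = LINE_RE.match(line)
    if m is None:
        return 0, ""
    return int(m.group(1)), m.group(2)

def format_timestamp(ts, utc_offset_hours=1):
    """Format an epoch timestamp the way the log viewer displays it (UTC+1 assumed)."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(ts, tz).strftime("[%d-%m-%Y %H:%M:%S]")

if __name__ == "__main__":
    for line in ["[1215727200] CURRENT SERVICE STATE: esskhk01;MSSQLSERVER;OK", "$"]:
        ts, text = parse_log_line(line)
        print(format_timestamp(ts), text)
```

Under these assumptions the stray "$" line has no timestamp, so it is shown at epoch 0, which is 01:00:00 on 01-01-1970 in a UTC+1 timezone.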
One of my service definitions, for example:
define service {
service_description MSSQL_PERBIT
use generic-service-noperf
host_name esskhk01
check_command check_nrpe_services!MSSQL"$$"PERBIT
}
Debugging shows (debug_level=-1, debug_verbosity=2):
The check gets scheduled and nagios parses the return value. Obviously
there is a \n that does not belong there.
Not yet sure why it shows up there...
[1215777431.127292] [001.0] [pid=3539] handle_async_service_check_result()
[1215777431.127299] [016.0] [pid=3539] ** Handling check result for service 'MSSQL_PERBIT' on host 'esskhk01'...
[1215777431.127305] [016.1] [pid=3539] HOST: esskhk01, SERVICE: MSSQL_PERBIT, CHECK TYPE: Active, OPTIONS: 0, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: Yes, RETURN CODE: 3, OUTPUT: Service: MSSQL$PERBI caused: GetServiceKeyName: Could not translate service name(997)\n
[1215777431.127320] [016.1] [pid=3539] Service is in a non-OK state!
- Shortly afterwards the macros for the alert get processed.
- The fprintf interprets the \n, so the log now shows a newline.
- Afterwards process_macros() parses the string and tries to resolve
  everything after the $ in the service definition.
- It fails to do so for obvious reasons, but then it seems to add an
  extra $ at the end of the buffer, maybe while trying to fix up a
  "broken macro"?
- Because the input buffer does not seem to be stripped yet, it adds the
  extra $ _after_ the \n.
- strip(buffer) doesn't work anymore, as it only strips \n and " " when
  they are trailing or leading.
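To make the sequence above easier to follow, here is a toy model in Python. It is not the Nagios source. It only assumes that macro processing splits the buffer on "$", treats every second piece as a macro name, and re-emits an unknown name with a "$" on both sides. The names toy_process_macros and strip_ends are made up for this sketch.

```python
def toy_process_macros(text, macros):
    """Toy model: pieces between '$' alternate between plain text and macro names."""
    out = []
    for i, piece in enumerate(text.split("$")):
        if i % 2 == 0:
            out.append(piece)              # plain text outside of $...$
        elif piece == "":
            out.append("$")                # "$$" is an escaped dollar sign
        elif piece in macros:
            out.append(macros[piece])      # known macro: substitute its value
        else:
            out.append("$" + piece + "$")  # unknown: put both delimiters back
    return "".join(out)

def strip_ends(buf):
    """Strip only leading/trailing newlines and spaces, as described above."""
    return buf.strip("\n ")

if __name__ == "__main__":
    raw = "OK: MSSQL$PERBIT: started\n"    # plugin output with a lone $ and a \n
    processed = strip_ends(toy_process_macros(raw, {}))
    print(repr(processed))
    print(processed.splitlines())
```

In this model the lone $ opens a "macro" that runs to the end of the buffer, including the \n, so the closing $ lands after the newline and the strip no longer reaches it. Stripping the buffer before macro processing would avoid the extra line in this sketch; whether that is the right fix in nagios itself I cannot say yet.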
That's why I see single "$" entries in my logfile, which the showlog.cgi
wrongly parses as
"[01-01-1970
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]