[Nagios-devel] Passive host down result is interpreted as up on
Posted: Fri Mar 16, 2007 10:02 am
Hi!
I was wondering if anyone has seen this before. On a slave, we have a
host that is marked as DOWN with a plugin output of "CRITICAL -
Plugin timed out after 10 seconds", as expected. However, on the
master, that host is marked as UP with the same text.
The logs on the master server show:
[1174045717] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host1;0;PING OK - Packet loss = 0%, RTA = 0.37 ms|
Host is marked as UP. Later on:
[1174045949] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host1;1;CRITICAL - Plugin timed out after 10 seconds|
Failure arrives.
[1174045949] HOST ALERT: host1;DOWN;HARD;1;CRITICAL - Plugin timed out after 10 seconds
Nagios marks it as DOWN with an alert, as expected.
[1174045951] Warning: The results of service '/ - partition' on host 'host1' are stale by 24 seconds (threshold=82 seconds). I'm forcing an immediate check of the service.
[1174045953] SERVICE ALERT: host1;/ - partition;UNKNOWN;HARD;1;UNKNOWN: Service results are stale
[1174045959] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host1;1;CRITICAL - Plugin timed out after 10 seconds|
More passive results
[1174045971] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host1;1;CRITICAL - Plugin timed out after 10 seconds|
And again, but this time...
[1174045973] HOST ALERT: host1;UP;HARD;1;CRITICAL - Plugin timed out after 10 seconds
Nagios has marked the host as UP, even though the most recent
PROCESS_HOST_CHECK_RESULT for host1 carried status code 1 (DOWN).
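
One way to look for further occurrences is to scan nagios.log for HOST ALERT lines whose state disagrees with the latest passive result submitted for the same host. The following is only a minimal sketch: the names find_mismatches, STATE_BY_CODE and sample are hypothetical, it assumes each log entry sits on a single unwrapped line, and it assumes the usual passive host status codes (0 = UP, 1 = DOWN, 2 = UNREACHABLE). A reported mismatch is a lead to investigate, not proof of a bug, since a host's state can also change through active checks.

```python
import re

# Assumed mapping of passive host check status codes to host states.
STATE_BY_CODE = {0: "UP", 1: "DOWN", 2: "UNREACHABLE"}

# Matches: [ts] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host;code;output
PASSIVE_RE = re.compile(
    r"^\[(\d+)\] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;([^;]+);(\d+);"
)
# Matches: [ts] HOST ALERT: host;STATE;...
ALERT_RE = re.compile(r"^\[(\d+)\] HOST ALERT: ([^;]+);([^;]+);")


def find_mismatches(lines):
    """Return (timestamp, host, alert_state, last_passive_state) tuples
    for HOST ALERT lines that disagree with the latest passive result."""
    last_passive = {}  # host name -> state implied by the latest passive result
    mismatches = []
    for line in lines:
        m = PASSIVE_RE.match(line)
        if m:
            _, host, code = m.groups()
            last_passive[host] = STATE_BY_CODE.get(int(code), "UNKNOWN")
            continue
        m = ALERT_RE.match(line)
        if m:
            ts, host, state = m.groups()
            expected = last_passive.get(host)
            if expected is not None and state != expected:
                mismatches.append((int(ts), host, state, expected))
    return mismatches


# Entries taken from the master's log quoted above, unwrapped to one line each.
sample = [
    "[1174045717] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host1;0;PING OK - Packet loss = 0%, RTA = 0.37 ms|",
    "[1174045949] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host1;1;CRITICAL - Plugin timed out after 10 seconds|",
    "[1174045949] HOST ALERT: host1;DOWN;HARD;1;CRITICAL - Plugin timed out after 10 seconds",
    "[1174045971] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host1;1;CRITICAL - Plugin timed out after 10 seconds|",
    "[1174045973] HOST ALERT: host1;UP;HARD;1;CRITICAL - Plugin timed out after 10 seconds",
]

for ts, host, state, expected in find_mismatches(sample):
    print(f"{ts}: {host} alerted {state}, last passive result was {expected}")
```

Run against the entries above, this flags only the alert at 1174045973; the DOWN alert at 1174045949 agrees with the passive result that preceded it.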
The complete nagios.log around this period is attached. I'm at a loss
to understand why this has happened. Has anyone got any clues, or
seen something similar?
We haven't been able to reproduce this consistently yet.
This is on Nagios 2.5 (with some local patches).
Ton
http://www.altinity.com
T: +44 (0)870 787 9243
F: +44 (0)845 280 1725
Skype: tonvoon
Attachment: nagios.log
[1174045915] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host2;1;CRITICAL - Plugin timed out after 10 seconds|
[1174045915] HOST ALERT: host2;DOWN;HARD;1;CRITICAL - Plugin timed out after 10 seconds
[1174045925] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host2;1;CRITICAL - Plugin timed out after 10 seconds|
[1174045936] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;test1;0;PING OK - Packet loss = 0%, RTA = 0.33 ms|
[1174045937] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host2;1;CRITICAL - Plugin timed out after 10 seconds|
[1174045939] SERVICE ALERT: host2;/ - partition;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1174045949] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host1;1;CRITICAL - Plugin timed out after 10 seconds|
[1174045949] HOST ALERT: host1;DOWN;HARD;1;CRITICAL - Plugin timed out after 10 seconds
[1174045951] Warning: The results of service '/ - partition' on host 'host1' are stale by 24 seconds (threshold=82 seconds). I'm forcing an immediate check of the service.
[1174045951] Warning: The results of service 'Solaris syslog events' on host 'host2' are stale by 27 seconds (threshold=79 seconds). I'm forcing an immediate check of the service.
[1174045951] Warning: The results of service 'host2 - crs' on host 'host2' are stale by 27 seconds (threshold=79 seconds). I'm forcing an immediate check of the service.
[1174045953] SERVICE ALERT: host1;/ - partition;UNKNOWN;HARD;1;UNKNOWN: Service results are stale
[1174045953] SERVICE ALERT: host2;Solaris syslog events;UNKNOWN;HARD;1;UNKNOWN: Service results are stale
[1174045953] SERVICE ALERT: host2;host2 - crs;UNKNOWN;HARD;1;UNKNOWN: Service results are stale
[1174045959] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host1;1;CRITICAL - Plugin timed out after 10 seconds|
[1174045970] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;test1;0;PING OK - Packet loss = 0%, RTA = 0.33 ms|
[1174045971] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;host1;1;CRITICAL - Plugin timed out after 10 seconds|
[11
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]