[Nagios-devel] Bug with --enable-nanosleep?

Posted: Fri Aug 25, 2006 11:05 am
by Guest

I've been running a fairly big Nagios setup (600+ checks) for a few years
now. The only issue so far is some lost passive checks under load (I posted
about it some time ago; it was dismissed as a non-issue, which I don't think it is).

Some time ago (Aug 16, to be precise) I noticed there was an --enable-nanosleep
option, so I tried it to see if it would help with the passive check problem. I
couldn't see any change in performance or passive check reliability, but I did
run into an issue.

Twice since then I have found that Nagios had stopped running active checks and
processing passive checks, so I had a stale daemon that wasn't monitoring
anything while still showing everything as fine. The first time was not long
after the nanosleep change, right after a restart, so I dismissed it as an odd
startup bug. The second time was today (Nagios had been running fine
for 2-3 days; the last restart was for a config change). For no apparent reason
it stopped running checks.

Running check_nagios manually showed that the status file wasn't being updated
and the process count was oscillating between 3 and 6.
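
For anyone wanting to watch for the same symptom independently, here is a minimal sketch of a status file freshness test. It is not the check_nagios implementation; the function name, the example path and the 300-second threshold are all assumptions chosen for illustration.

```python
import os
import time


def status_file_is_stale(path, max_age_seconds, now=None):
    """Return True if `path` is missing or was last modified more than
    `max_age_seconds` ago. `now` can be passed in to make testing easier."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable status file is treated as stale.
        return True
    return (now - mtime) > max_age_seconds


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever your status file lives.
    path = "/usr/local/nagios/var/status.dat"
    print("stale" if status_file_is_stale(path, 300) else "fresh")
```

Run from cron on a separate machine or account, something like this would flag a daemon that is alive but no longer updating its status file.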

I'm running nagios-2-x-cvs (2006-07-07 10:11:49); the last commit was for a bug I
reported:
* Bug fix for segfault during startup due to extended service definition
duplication

Here are the last entries in the log (edited, newest first). Service X is a
custom script scheduled to run every 5 minutes on some servers, reporting through
send_nsca:

[2006-08-25 13:46:47] Caught SIGHUP, restarting... <--- ME RESTARTING NAGIOS
(STALE)
Informational Message[2006-08-25 13:15:20] Auto-save of retention data
completed successfully.
Service Ok[2006-08-25 13:15:13] SERVICE ALERT: hostx.example.com;Service
X;OK;SOFT;2;OK: Everything looks fine
Service Ok[2006-08-25 13:15:13] SERVICE ALERT: hosty.example.com;Service
X;OK;SOFT;2;OK: Everything looks fine
Service Critical[2006-08-25 13:11:29] SERVICE ALERT: hosty.example.com;Service
X;CRITICAL;SOFT;1;CRITICAL: Didn't recieved Service X results.
Service Critical[2006-08-25 13:11:29] SERVICE ALERT: hostx.example.com;Service
X;CRITICAL;SOFT;1;CRITICAL: Didn't recieved Service X results.
Informational Message[2006-08-25 13:11:20] Warning: The results of service
'Service X' on host 'hosty.example.com' are stale by 47 seconds (threshold=330
seconds). I'm forcing an immediate check of the service.
Informational Message[2006-08-25 13:11:20] Warning: The results of service
'Service X' on host 'hostx.example.com' are stale by 48 seconds (threshold=330
seconds). I'm forcing an immediate check of the service.
Informational Message[2006-08-25 13:10:20] Auto-save of retention data
completed successfully.
Informational Message[2006-08-25 13:05:20] Auto-save of retention data
completed successfully.
Informational Message[2006-08-25 13:00:21] Auto-save of retention data
completed successfully.


Thanks,

Thomas


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]