Re: [Nagios-devel] RFC/PATCH: Handle external service check results
Posted: Fri Apr 13, 2007 10:36 am
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf
> Of Ethan Galstad
> Sent: April 13, 2007 7:03
> To: Nagios Developers List
> Subject: Re: [Nagios-devel] RFC/PATCH: Handle external
> service check results in seperate thread
>
> Stefan Rompf wrote:
> > Hi,
> >
> > like other people on this list, we've been bitten by the problem that
> > nagios fork()s subprocesses when service check results arrive via the
> > external command pipe. When nagios lags for example due to hostchecks,
> > in most cases enough forked processes pile up to bring nagios over its
> > resource limits. Even if this doesn't happen, results will be fed in
> > the wrong order.
> >
> > I've developed the following solution that is quite different to the
> > spool directory approach:
> >
> > -passive service check results are added to passive_check_result_list
> > as before. However, for our use case it does not make sense to keep
> > multiple results for one service as soon as nagios starts lagging. So
> > we have a duplicate detection that keeps only the newest check result
> > per service.
> > -Instead of forking subprocesses, a permanently running thread feeds
> > the results on passive_check_result_list back via write_svc_message().
> > So two threads of the process talk to each other via a pipe, but I
> > didn't want to make my changes too invasive
> > -Instead of polling the command pipe every 0.5 seconds, select() on the
> > file descriptor is used now if there are enough
> > external_command_buffer_slots. Problem here was that with no writer on
> > the pipe, select() endlessly signaled an EOF. Fixed by opening the
> > command pipe R/W.
> >
> > The patch has been developed on nagios 2.6 and linux, afterwards
> > forward ported to current CVS. It seems to work, but needs further
> > testing. Even compilation tests on different architectures would be
> > interesting, I'm not sure how widespread the tsearch()-API is.
> >
> > Thoughts?
> >
> > Stefan
>
> Sounds interesting. I'm still leaning towards the spool directory idea,
> as it provides some resistance to problems when Nagios isn't running
> and/or the external command file pipe fills up.
No matter what you do, you can still switch to select() on the external
command pipe by opening it R/W. This is what I do in the OCP_daemon.
Just my 2 cents...
Thomas
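
For illustration, here is a minimal sketch of the R/W trick discussed above. It is not the OCP_daemon code or the Nagios patch; the function names, the temporary FIFO path, and the sample command are made up for the example. It assumes Linux, where opening a FIFO with O_RDWR is allowed (POSIX leaves that undefined). Because the reading process also holds the write end, the pipe never reaches EOF when external writers disconnect, so select() simply times out while idle.

```python
import os
import select
import tempfile


def open_command_pipe(path):
    """Open a FIFO read/write so select() does not spin on EOF."""
    # Assumption: Linux semantics. POSIX leaves O_RDWR on a FIFO undefined.
    return os.open(path, os.O_RDWR | os.O_NONBLOCK)


def wait_for_command(fd, timeout):
    """Return pending bytes from the pipe, or b"" if nothing arrived in time."""
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return b""
    return os.read(fd, 4096)


def demo():
    # Hypothetical path for the example; not the real Nagios command file.
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.cmd")
    os.mkfifo(path)
    fd = open_command_pipe(path)
    try:
        # No writer is connected: select() times out instead of reporting EOF.
        idle = wait_for_command(fd, 0.1)
        # An external writer opens the FIFO, submits one command, and closes it.
        writer = os.open(path, os.O_WRONLY)
        os.write(writer, b"[0] PROCESS_SERVICE_CHECK_RESULT;host;svc;0;OK\n")
        os.close(writer)
        got = wait_for_command(fd, 1.0)
        # The writer is gone again, yet the pipe stays quiet rather than at EOF.
        after = wait_for_command(fd, 0.1)
        return idle, got, after
    finally:
        os.close(fd)
        os.remove(path)
        os.rmdir(workdir)
```

Calling demo() returns the three reads in order: empty while no writer is attached, the submitted command line, and empty again after the writer has closed. With a read-only open, the last step would instead show the descriptor as readable at EOF on every select() call.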
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]