Re: Nagios distributed monitoring
Posted: Mon Sep 14, 2015 10:45 am
1. OK, I will do so. Regarding this, I have workers in different time zones, each with the proper TZ settings on its server. So far this has not caused any trouble.
2. OK, I will do so. One more interesting thing: I think I know what causes the high load on the worker. I examined nagios.log once more and
I see orphaned checks there, all of them from that specific worker. They appear only in nagios.log, not in the NEB log. I think the cause of the high
CPU utilization is that Nagios keeps firing the checks at the worker over and over, which drives the worker's CPU utilization sky-high. What I am not sure about is
why Nagios sees those checks as orphaned.
So in summary I have two distinct problems.
I. The check_results queue getting "full" (no checks are processed). This happens very rarely.
II. The current one: the checks from one of my workers are lost, or at least Nagios sees them as orphaned, so it reschedules them and the load grows to 60 or above.
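To confirm that the orphaned checks really come from a single worker's hosts, the warnings in nagios.log can be tallied per host. The following is only a rough sketch: it assumes the log uses the usual Nagios Core wording ("... looks like it was orphaned ..."), and the function name count_orphans is made up for illustration. Mapping hosts back to a worker is left to you, since the log line itself does not name the worker.

```python
import re
from collections import Counter

# Assumed format of the orphan warnings in nagios.log, e.g.
# [1442220300] Warning: The check of service 'CPU' on host 'web01'
#   looks like it was orphaned (results never came back). ...
# The "service '...' on " part is absent for orphaned host checks.
ORPHAN_RE = re.compile(
    r"^\[(?P<ts>\d+)\] Warning: The check of "
    r"(?:service '(?P<service>[^']*)' on )?host '(?P<host>[^']*)' "
    r"looks like it was orphaned"
)


def count_orphans(lines):
    """Return a Counter of orphaned-check warnings keyed by host name."""
    per_host = Counter()
    for line in lines:
        match = ORPHAN_RE.match(line)
        if match:
            per_host[match.group("host")] += 1
    return per_host


# Example use (the path is an assumption; adjust to your installation):
#   with open("/usr/local/nagios/var/nagios.log", errors="replace") as fh:
#       for host, n in count_orphans(fh).most_common():
#           print(n, host)
```

If nearly all the counted hosts belong to the hostgroups served by the one worker, that supports the theory that Nagios is rescheduling that worker's checks repeatedly.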