Support forum for Nagios Core, Nagios Plugins, NCPA, NRPE, NSCA, NDOUtils and more. Engage with the community of users including those using the open source solutions.
snapier3 wrote: What check is running every few minutes that's spiking the system?
Some service checks require more resources than others due to the plugin/nature of what's being checked.
Also, I see you're up 100+ and not constantly pegged on the CPU, so there's progress in your tuning.
Hi,
Thank you for responding.
The checks that are running are mostly NRPE checks (fping) that are supposed to be sent to the remote NRPE agent.
My server has 4 CPUs, 16 GB of RAM, and a 100 GB SSD.
Yes, I did some tuning, but it doesn't work as expected.
Does anyone here have experience with the mod_gearman plugin?
Can you help me better understand the overall goal of what you're trying to accomplish?
Situation 1:
The Nagios server (origin) executes the fping command against the target host (destination). Nagios evaluates the telemetry gathered by this check to determine whether the target host (destination) is up or down and whether it is within the defined latency limits.
Situation 2:
Nagios sends a command via NRPE to the target host (where your agent is installed) to run fping against one or more external hosts. This host (the origin) then runs fping against the external hosts (the destinations) and returns the telemetry collected (the ping times) to the Nagios server. The telemetry is then evaluated to determine the availability and performance of network connectivity from the origin to the destinations.
Now I realize that I didn't really offload tasks from the master; I just replaced many fping checks with many NRPE checks.
What I'm trying to do is offload the tasks to remote hosts, but I don't really know the right way to do so.
-p, --period=MSEC
In looping or counting modes (-l, -c, or -C), this parameter sets the time in milliseconds that fping waits between successive packets to an individual target. Default is 1000 and minimum is 10.
The interval timeout is not an option in check_fping.
-t, --timeout=MSEC
Initial target timeout in milliseconds. In the default, non-loop mode, the default timeout is 500ms, and it represents the amount of time that fping waits for a response to its first request. Successive timeouts are multiplied by the backoff factor specified with -B.
In loop/count mode, the default timeout is automatically adjusted to match the "period" value (but not more than 2000ms). You can still adjust the timeout value with this option, if you wish to, but note that setting a value larger than "period" produces inconsistent results, because the timeout value can be respected only for the last ping.
Also note that any received replies that take longer than the timeout value will be discarded.
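To make the period/timeout relationship from the man page excerpt above concrete, here is a minimal Python sketch that builds an fping command line for count mode. The helper name build_fping_args and its defaults are my own assumptions for illustration; only the -q, -C, -p and -t flags come from fping itself.

```python
def build_fping_args(hosts, count=3, period_ms=1000, timeout_ms=None):
    """Build an fping argument list for count mode (-C).

    Hypothetical helper: it only encodes the constraints quoted from
    the fping man page, it is not part of fping or Nagios.
    """
    if not hosts:
        raise ValueError("at least one target host is required")
    if period_ms < 10:
        # The man page gives 10 ms as the minimum period.
        raise ValueError("period must be at least 10 ms")
    if timeout_ms is None:
        # Mirror fping's documented default in count mode:
        # timeout follows the period, capped at 2000 ms.
        timeout_ms = min(period_ms, 2000)
    if timeout_ms > period_ms:
        # The man page warns this produces inconsistent results.
        raise ValueError("timeout should not be larger than period")
    return [
        "fping", "-q",
        "-C", str(count),
        "-p", str(period_ms),
        "-t", str(timeout_ms),
        *hosts,
    ]
```

For example, build_fping_args(["10.0.0.1", "10.0.0.2"], period_ms=100) gives a command that sends 3 packets per target, 100 ms apart, with a 100 ms timeout.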
Your task will be to create a library/script that efficiently pings a list of servers from a remote host, executed via NRPE or possibly via the batch execute options (Nagios XI), if available.
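As a rough starting point for such a script, here is a minimal Python sketch of a plugin that runs one fping process for all targets and returns a single Nagios status. It assumes fping is installed on the remote host and that `fping -q -C N` prints one "host : t1 t2 ..." line per target on stderr, with "-" for a lost packet. The function names, the 100 ms warning threshold, and the output wording are all hypothetical, so adjust them to your setup.

```python
import subprocess
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def parse_fping_output(output):
    """Parse `fping -C` lines into {host: [ms or None, ...]}."""
    results = {}
    for line in output.splitlines():
        host, sep, times = line.partition(" : ")
        if not sep:
            continue  # not a per-target result line
        try:
            results[host.strip()] = [
                None if tok == "-" else float(tok) for tok in times.split()
            ]
        except ValueError:
            continue  # unexpected format, skip the line
    return results


def evaluate(results, warn_ms=100.0):
    """Return (status, message); warn_ms is an assumed example threshold."""
    if not results:
        return UNKNOWN, "FPING UNKNOWN: no results parsed"
    status, parts = OK, []
    for host, times in sorted(results.items()):
        replies = [t for t in times if t is not None]
        if not replies:
            status = max(status, CRITICAL)  # every packet was lost
            parts.append(f"{host} down")
            continue
        loss = 100.0 * (len(times) - len(replies)) / len(times)
        avg = sum(replies) / len(replies)
        if loss > 0 or avg > warn_ms:
            status = max(status, WARNING)
        parts.append(f"{host} avg={avg:.2f}ms loss={loss:.0f}%")
    return status, f"FPING {LABELS[status]}: " + ", ".join(parts)


def main(hosts):
    # One fping process pings every target; results arrive on stderr.
    # fping exits non-zero when a target is unreachable, so the return
    # code is not treated as an error here.
    proc = subprocess.run(
        ["fping", "-q", "-C", "3", *hosts],
        capture_output=True, text=True,
    )
    status, message = evaluate(parse_fping_output(proc.stderr))
    print(message)
    return status


if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(main(sys.argv[1:]))
```

The idea is that NRPE runs this once per check interval with the full target list as arguments, so the master schedules one check instead of one per destination. The trade-off is that all targets share a single service state in Nagios.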
I think that doing pings from the Nagios host (if it works for the OP) will be way more efficient than using nrpe.
From what I've read about check_fping and check_icmp, both claim to be able to ping multiple hosts from a single check. (I'm actually getting set to do this myself via nrpe, since I've been asked to check from various remote hosts to other remote hosts.)
As others have mentioned, doing these checks multiple times a minute seems like it is/will be an issue.