Nagios invoked oom-killer

estebanmonge
Posts: 50
Joined: Mon Feb 06, 2012 11:13 pm

Re: Nagios invoked oom-killer

Post by estebanmonge »

We don't use xinetd.

Instead, I created this init.d script:

Code:

#!/bin/sh
# Changelog:    Taken from Debian Project package nsca in squeeze of architecture i386
# Modified for Nagios 3.3.1
# Esteban Monge [email protected]

### BEGIN INIT INFO
# Provides:          nsca
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
### END INIT INFO

# simple debian init script for nsca
# by sean finney <[email protected]>

DAEMON=/usr/local/nagios/bin/nsca
NAME=nsca
DESC="Nagios Service Check Acceptor"
CONF=/usr/local/nagios/etc/nsca.cfg
OPTS="--daemon -c $CONF"
PIDFILE="/var/run/nsca.pid"

###

test -f $DAEMON || exit 0

# grab an arbitrary config setting from nsca.cfg
get_config(){
        # keep only the last matching line so the value is never multi-line
        grep "^[[:space:]]*$1=" $CONF 2>/dev/null | tail -n 1 | cut -d= -f2-
}

# if the pid_file is specified in the configuration file, nsca will
# take care of the pid handling for us.  if it isn't we should continue
# as we have before
PIDFILE=`get_config pid_file`
# if pidfile isn't set
if [ -z "$PIDFILE" ];  then
        # then this is the default PIDFILE
        PIDFILE="/var/run/nsca.pid"
        # run nsca in the foreground, and have s-s-d fork it for us
        OPTS="-f $OPTS"
        # and then this is how we call SSD
        SSD_STARTOPTS="--background --pidfile $PIDFILE --make-pidfile"
        SSD_STOPOPTS="--pidfile $PIDFILE"
else
        # but if pid_file is set, we don't have to do anything
        SSD_STARTOPTS="--pidfile $PIDFILE"
        SSD_STOPOPTS="--pidfile $PIDFILE"
fi

SSD_START="start-stop-daemon --start --oknodo -S $SSD_STARTOPTS --exec $DAEMON"
SSD_STOP="start-stop-daemon --stop --oknodo -K $SSD_STOPOPTS --exec $DAEMON"

die(){
        echo "$@"
        exit 1
}

case "$1" in
start)
        echo -n "Starting $DESC: "
        if [ ! -d "/var/run/nagios" ]; then
                mkdir -p /var/run/nagios || die "ERROR: couldn't create /var/run/nagios"
        fi
        $SSD_START -- $OPTS || die "ERROR: could not start $NAME."
        echo "$NAME."
;;
stop)
        echo -n "Stopping $DESC: "
        $SSD_STOP -- $OPTS || die "ERROR: could not stop $NAME."
        rm -f $PIDFILE
        echo "$NAME."
;;
reload|force-reload)
        echo -n "Reloading $DESC: "
        $SSD_STOP --signal HUP -- $OPTS || die "ERROR: could not reload $NAME."
        echo "$NAME."
;;
restart)
        $0 stop
        $0 start
;;
esac
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Nagios invoked oom-killer

Post by scottwilkerson »

How frequently are these results being sent to this server? I just noticed you mentioned sending results per second.

Your server is likely getting flooded with results if you are sending hundreds or thousands of checks per second...
Former Nagios employee
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Nagios invoked oom-killer

Post by sreinhardt »

Could you try attaching that again, or maybe zip the PDF? It does not seem to be attached.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
estebanmonge
Posts: 50
Joined: Mon Feb 06, 2012 11:13 pm

Re: Nagios invoked oom-killer

Post by estebanmonge »

Hello, I found the problem.

When I disable send_nsca in the event handlers (OCSP, used for obsessing over services), check latency drops to 3 s. When I enable OCSP again, latency climbs to 200 s.

How can I reduce the latency?

Code:

Program-Wide Performance Information
Services Actively Checked:
	
Time Frame	Services Checked
<= 1 minute:	44 (10.5%)
<= 5 minutes:	193 (46.1%)
<= 15 minutes:	419 (100.0%)
<= 1 hour:	419 (100.0%)
Since program start:  	419 (100.0%)
	
Metric	Min.	Max.	Average
Check Execution Time:  	0.07 sec	12.45 sec	2.184 sec
Check Latency:	126.56 sec	253.60 sec	223.426 sec
Percent State Change:	0.00%	12.24%	0.33%
Services Passively Checked:
	
Time Frame	Services Checked
<= 1 minute:	0 (0.0%)
<= 5 minutes:	0 (0.0%)
<= 15 minutes:	0 (0.0%)
<= 1 hour:	0 (0.0%)
Since program start:  	0 (0.0%)
	
Metric	Min.	Max.	Average
Percent State Change:  	0.00%	0.00%	0.00%
Hosts Actively Checked:
	
Time Frame	Hosts Checked
<= 1 minute:	0 (0.0%)
<= 5 minutes:	45 (75.0%)
<= 15 minutes:	60 (100.0%)
<= 1 hour:	60 (100.0%)
Since program start:  	60 (100.0%)
	
Metric	Min.	Max.	Average
Check Execution Time:  	4.06 sec	25.37 sec	5.525 sec
Check Latency:	0.00 sec	257.19 sec	192.772 sec
Percent State Change:	0.00%	12.37%	0.90%
Hosts Passively Checked:
	
Time Frame	Hosts Checked
<= 1 minute:	0 (0.0%)
<= 5 minutes:	0 (0.0%)
<= 15 minutes:	0 (0.0%)
<= 1 hour:	0 (0.0%)
Since program start:  	0 (0.0%)
	
Metric	Min.	Max.	Average
Percent State Change:  	0.00%	0.00%	0.00%
Check Statistics:
	
Type	Last 1 Min	Last 5 Min	Last 15 Min
Active Scheduled Host Checks	11	52	155
Active On-Demand Host Checks	1	4	11
Parallel Host Checks	11	54	161
Serial Host Checks	0	0	0
Cached Host Checks	1	2	5
Passive Host Checks	0	0	0
Active Scheduled Service Checks	60	210	666
Active On-Demand Service Checks	0	0	0
Cached Service Checks	0	0	0
Passive Service Checks	0	0	0
External Commands	0	0	0
Buffer Usage:
	
Type	In Use	Max Used	Total Available
External Commands 	0	0	8192
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Nagios invoked oom-killer

Post by scottwilkerson »

This is likely because with ocsp enabled each check is going to additionally execute your ocsp command which could be taking a long time to complete.

I would look into the details of that command and measure how long it takes when you run the commands it executes by hand.
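
As a rough way to do that measurement, here is a minimal sketch of a timing helper. The name time_cmd is hypothetical, and the resolution is whole seconds only, which should be enough to tell a 3 s run from a 200 s one.

```shell
#!/bin/sh
# Hypothetical helper: run a command once, discard its output, and print
# the elapsed wall-clock time in whole seconds plus the exit status.
time_cmd() {
    start=$(date +%s)
    # "&& ... || ..." keeps the status without aborting under "set -e"
    "$@" >/dev/null 2>&1 && status=0 || status=$?
    end=$(date +%s)
    echo "$((end - start))s exit=$status"
}
```

You would call it with whatever your ocsp_command actually runs, for example time_cmd followed by the full send_nsca wrapper command line and its arguments. If a single run already takes seconds, multiplying that by the number of service checks shows where the latency comes from.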
estebanmonge
Posts: 50
Joined: Mon Feb 06, 2012 11:13 pm

Re: Nagios invoked oom-killer

Post by estebanmonge »

I am still having problems with nsca.

Is there a setting in the configuration file that limits the number of nsca processes?

Regards
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Nagios invoked oom-killer

Post by abrist »

How many passive nsca checks are you running? ocsp has had occasional, isolated bugs, not to mention it is not working correctly with core 4. Are you running core 4?
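
If the volume turns out to be the issue, one option that might help is to queue results locally and send them in batches, since send_nsca reads any number of tab-delimited results from stdin in one connection. The sketch below is only an illustration under assumptions: the queue path, the function names, the send_nsca location and the host central.example.com are all hypothetical, and it does not guard against a result being appended at the exact moment the queue is rotated.

```shell
#!/bin/sh
# Hypothetical queue file and sender command; adjust both for your install.
QUEUE=${QUEUE:-/tmp/nsca-queue.txt}
SENDER=${SENDER:-"/usr/local/nagios/bin/send_nsca -H central.example.com -c /usr/local/nagios/etc/send_nsca.cfg"}

# Use as the ocsp_command: append one tab-delimited result and return at once,
# so Nagios is not kept waiting on a network round trip for every check.
queue_result() {
    # args: host name, service description, return code, plugin output
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" >> "$QUEUE"
}

# Run from cron (for example once a minute): send the whole queue in one go.
flush_queue() {
    [ -s "$QUEUE" ] || return 0          # nothing queued
    batch="$QUEUE.$$"
    mv "$QUEUE" "$batch" || return 1     # rotate so new results start a fresh file
    # the batch file is left behind for inspection if the sender fails
    $SENDER < "$batch" && rm -f "$batch"
}
```

With this, the receiving side sees one connection per flush instead of one per check. Separately, nsca lists a --single mode (single-process daemon) next to --daemon in its usage text; whether that copes with this load is untested here.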
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.