Service check did not exit properly

karven
Posts: 4
Joined: Wed Jun 10, 2015 3:12 pm

Service check did not exit properly

Post by karven »

I'm currently running Nagios Core 4.0.8 on FreeBSD 10.1-RELEASE. Sometimes I receive an alert saying "Service check did not exit properly". From the information I could gather, the Nagios socket queue max length is too low, and Nagios or check processes are also being killed somehow. I'm looking for a way to fix this, but I have no idea yet. Any help?

Code: Select all

nagios@svrbsd:~ % nagios version

Nagios Core 4.0.8
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2014
License: GPL

Code: Select all

nagios@svrbsd:~ % uname -a
FreeBSD svrbsd 10.1-RELEASE-p10 FreeBSD 10.1-RELEASE-p10 #0: Wed May 13 06:54:13 UTC 2015     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

Code: Select all

nagios@svrdbs:~ % netstat -Lan
Current listen queue sizes (qlen/incqlen/maxqlen)
Proto Listen         Local Address         
tcp4  0/0/65535      72.10.166.19.80        
tcp4  0/0/10         *.587                  
tcp6  0/0/10         *.25                   
tcp4  0/0/10         *.25                   
tcp4  0/0/128        72.10.166.19.22        
unix  0/0/3          /var/spool/nagios/rw/nagios.qh
unix  0/0/1024       /tmp/spawnfcgi.sock
unix  0/0/65535      /tmp/phpfpm.sock
unix  0/0/4          /var/run/devd.pipe
unix  0/0/4          /var/run/devd.seqpacket.pipe

Code: Select all

Notification Type: PROBLEM

Service: Current Users
Host: Remotehost
Address: 192.168.10.100
State: CRITICAL

Date/Time: Wed Jun 10 16:01:24 AST 2015

Additional Info:

(Service check did not exit properly)

Code: Select all

root@svrbsd:/usr/pkg/etc/nagios/objects # tail /var/log/messages
Jun  8 14:08:57 svrbsd kernel: sonewconn: pcb 0xfffff8006a9f6c30: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
Jun  9 00:28:02 svrbsd kernel: pid 80355 (nagios), uid 1002: exited on signal 10
Jun  9 04:40:55 svrbsd kernel: pid 25287 (check_ping), uid 1002: exited on signal 11
Jun  9 15:59:29 svrbsd kernel: sonewconn: pcb 0xfffff8001f661960: Listen queue overflow: 5 already in queue awaiting acceptance (6 occurrences)
Jun  9 19:52:16 svrbsd kernel: sonewconn: pcb 0xfffff8001f444870: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
Jun 10 14:18:40 svrbsd kernel: sonewconn: pcb 0xfffff800a48690f0: Listen queue overflow: 5 already in queue awaiting acceptance (6 occurrences)
Jun 10 14:27:53 svrbsd kernel: sonewconn: pcb 0xfffff8001f93c870: Listen queue overflow: 5 already in queue awaiting acceptance (6 occurrences)
Jun 10 14:31:55 svrbsd kernel: sonewconn: pcb 0xfffff8001f46dd20: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
Jun 10 15:34:47 svrbsd kernel: sonewconn: pcb 0xfffff8001f936c30: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
Jun 10 15:39:01 svrbsd kernel: sonewconn: pcb 0xfffff8001f947a50: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
root@svrbsd:/usr/pkg/etc/nagios/objects # cat /var/log/messages|grep signal
May 30 22:19:26 svrbsd kernel: pid 65742 (check_http), uid 1002: exited on signal 11
May 30 22:21:25 svrbsd kernel: pid 66374 (check_http), uid 1002: exited on signal 11
Jun  1 03:21:00 svrbsd kernel: pid 35569 (check_http), uid 1002: exited on signal 11
Jun  1 03:22:59 svrbsd kernel: pid 36403 (check_http), uid 1002: exited on signal 11
Jun  1 03:25:00 svrbsd kernel: pid 37170 (check_http), uid 1002: exited on signal 11
Jun  1 03:25:00 svrbsd kernel: pid 37203 (check_http), uid 1002: exited on signal 11
Jun  1 03:27:00 svrbsd kernel: pid 37894 (check_http), uid 1002: exited on signal 11
Jun  1 03:29:00 svrbsd kernel: pid 38677 (check_http), uid 1002: exited on signal 11
Jun  1 03:31:00 svrbsd kernel: pid 39262 (check_http), uid 1002: exited on signal 11
Jun  1 03:33:00 svrbsd kernel: pid 40055 (check_http), uid 1002: exited on signal 11
Jun  2 19:25:46 svrbsd kernel: pid 31301 (nagios), uid 1002: exited on signal 11
Jun  2 19:27:12 svrbsd kernel: pid 31299 (nagios), uid 1002: exited on signal 11
Jun  3 05:40:58 svrbsd kernel: pid 7876 (nagios), uid 1002: exited on signal 10
Jun  3 06:58:01 svrbsd kernel: pid 7875 (nagios), uid 1002: exited on signal 11
Jun  3 11:24:19 svrbsd kernel: pid 7877 (nagios), uid 1002: exited on signal 11
Jun  3 11:24:21 svrbsd kernel: pid 7879 (nagios), uid 1002: exited on signal 10
Jun  3 11:24:25 svrbsd kernel: pid 7878 (nagios), uid 1002: exited on signal 10
Jun  3 18:13:47 svrbsd syslogd: exiting on signal 15
Jun  4 00:33:31 svrbsd kernel: pid 4157 (nagios), uid 1002: exited on signal 11
Jun  9 00:28:02 svrbsd kernel: pid 80355 (nagios), uid 1002: exited on signal 10
Jun  9 04:40:55 svrbsd kernel: pid 25287 (check_ping), uid 1002: exited on signal 11

Code: Select all

bge1: link state changed to UP
sonewconn: pcb 0xfffff8001f445e10: Listen queue overflow: 5 already in queue awaiting acceptance (1 occurrences)
sonewconn: pcb 0xfffff8001f7ebd20: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
ugen0.3: <CHICONY> at usbus0 (disconnected)
ukbd0: at uhub3, port 4, addr 3 (disconnected)
pid 4157 (nagios), uid 1002: exited on signal 11
sonewconn: pcb 0xfffff800a4869000: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff8001f9434b0: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff8001f943780: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff8001f7af000: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff8001f7eb870: Listen queue overflow: 5 already in queue awaiting acceptance (14 occurrences)
sonewconn: pcb 0xfffff8001f48d870: Listen queue overflow: 5 already in queue awaiting acceptance (13 occurrences)
sonewconn: pcb 0xfffff8001f460000: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff8001f9363c0: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff8001f93c4b0: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff8001f9474b0: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff802a1aa0e10: Listen queue overflow: 5 already in queue awaiting acceptance (6 occurrences)
sonewconn: pcb 0xfffff8006a9f6c30: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
pid 80355 (nagios), uid 1002: exited on signal 10
pid 25287 (check_ping), uid 1002: exited on signal 11
sonewconn: pcb 0xfffff8001f661960: Listen queue overflow: 5 already in queue awaiting acceptance (6 occurrences)
sonewconn: pcb 0xfffff8001f444870: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff800a48690f0: Listen queue overflow: 5 already in queue awaiting acceptance (6 occurrences)
sonewconn: pcb 0xfffff8001f93c870: Listen queue overflow: 5 already in queue awaiting acceptance (6 occurrences)
sonewconn: pcb 0xfffff8001f46dd20: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff8001f936c30: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
sonewconn: pcb 0xfffff8001f947a50: Listen queue overflow: 5 already in queue awaiting acceptance (7 occurrences)
Last edited by karven on Thu Jun 11, 2015 2:26 pm, edited 1 time in total.
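A note on reading the "exited on signal" lines above: on FreeBSD, signal 10 is SIGBUS and signal 11 is SIGSEGV, which usually points to the process crashing rather than being killed from outside. The following is a minimal sketch for pulling those entries out of the log; the struct and function names are hypothetical and not part of Nagios.

```c
#include <stdio.h>
#include <string.h>

/* One parsed "exited on signal" kernel log entry (hypothetical type). */
struct sig_exit {
    long pid;
    char prog[64];
    long uid;
    int signo;
};

/* Parse e.g. "... kernel: pid 80355 (nagios), uid 1002: exited on signal 10".
 * Anything before "pid " (such as a syslog timestamp) is skipped.
 * Returns 1 if all four fields were found, 0 otherwise. */
static int parse_sig_exit(const char *line, struct sig_exit *out)
{
    const char *p = strstr(line, "pid ");
    if (p == NULL)
        return 0;
    return sscanf(p, "pid %ld (%63[^)]), uid %ld: exited on signal %d",
                  &out->pid, out->prog, &out->uid, &out->signo) == 4;
}

/* Names for the two signal numbers seen in this thread, using FreeBSD's
 * numbering. Other platforms number signals differently (for example,
 * 10 is SIGUSR1 on Linux), so this table is FreeBSD-only. */
static const char *freebsd_signal_name(int signo)
{
    switch (signo) {
    case 10: return "SIGBUS";
    case 11: return "SIGSEGV";
    default: return "other";
    }
}
```

Feeding each line of /var/log/messages through parse_sig_exit() and grouping by prog would show whether the crashes are concentrated in nagios itself or in particular plugins such as check_http.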
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: Service check did not exit properly

Post by jdalrymple »

How heavily loaded is this system - e.g. what is the system load? How many hosts/services are you monitoring?

Here is the FreeBSD tuning guide for heavily loaded systems: https://www.freebsd.org/doc/handbook/co ... imits.html

I suggest starting with kern.ipc.somaxconn - but there may be other parameters to look at. Alternatively, if this isn't a heavily loaded system then something else must be wrong.
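Before tuning, it may help to work out which listening socket is overflowing. The sketch below parses one row of the `netstat -Lan` output posted earlier and estimates the queue length at which the kernel would start logging overflows. The threshold formula (more than 3 * maxqlen / 2 entries, integer division) is an assumption based on one reading of FreeBSD's sonewconn() and is not stated anywhere in this thread; the type and function names are hypothetical.

```c
#include <stdio.h>

/* Queue counters from one `netstat -Lan` row (qlen/incqlen/maxqlen). */
struct listen_queue {
    char proto[8];
    unsigned qlen, incqlen, maxqlen;
    char addr[128];
};

/* Parse e.g. "unix  0/0/3          /var/spool/nagios/rw/nagios.qh".
 * Returns 1 on success, 0 for header lines or malformed rows. */
static int parse_listen_row(const char *row, struct listen_queue *q)
{
    return sscanf(row, "%7s %u/%u/%u %127s",
                  q->proto, &q->qlen, &q->incqlen, &q->maxqlen, q->addr) == 5;
}

/* ASSUMPTION: the kernel reports an overflow once the queue holds more
 * than 3 * maxqlen / 2 connections. This returns the smallest queue
 * length that would be reported under that assumption. */
static unsigned overflow_qlen(unsigned maxqlen)
{
    return 3 * maxqlen / 2 + 1;
}
```

Under that assumption, a socket with maxqlen 3 would report "5 already in queue", which matches the kernel messages above and the 0/0/3 entry for /var/spool/nagios/rw/nagios.qh. If that reading is right, raising kern.ipc.somaxconn alone would not help, because it only caps the backlog the application asks for.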
karven
Posts: 4
Joined: Wed Jun 10, 2015 3:12 pm

Re: Service check did not exit properly

Post by karven »

I have already tried those. Is there anything in Nagios comparable to this change: https://github.com/NagiosEnterprises/nr ... 33add157f4 ? Please let me know.
/boot/loader.conf

Code: Select all

# More tuning
kern.ipc.msgmax="65536"
kern.ipc.msgmnb="65536"
kern.maxusers="2048"
/etc/sysctl.conf

Code: Select all

# Increase TCP Window size to 64K for increase in network performance
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=65536
net.local.stream.sendspace=65536
net.local.stream.recvspace=65536

# Other
net.inet.icmp.maskrepl=0
net.inet.tcp.path_mtu_discovery=1
net.inet.tcp.sack.enable=1
net.inet.icmp.icmplim=1000
net.inet.tcp.syncookies=1
net.inet.ip.fw.dyn_max=16384
kern.ipc.soacceptqueue=20480
# We might consider enabling this:
net.inet.tcp.fast_finwait2_recycle=1
# This value is in milliseconds
net.inet.tcp.finwait2_timeout=30000

# kernel/memory tuning
kern.ipc.shmmax=68719476736
kern.ipc.shmall=16777216
kern.ipc.shm_use_phys=1
kern.threads.max_threads_per_proc=16384

# More kernel tuning
kern.ipc.shmall=4294967296
kern.ipc.somaxconn=65535
net.inet.ip.intr_queue_maxlen=10240
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Service check did not exit properly

Post by ssax »

After adding the sysctl.conf entries did you reboot or run:

Code: Select all

/etc/rc.d/sysctl start
?
karven
Posts: 4
Joined: Wed Jun 10, 2015 3:12 pm

Re: Service check did not exit properly

Post by karven »

Yes, I did, and those were set before installing Nagios. Do you have anything in Nagios related to this patch: https://github.com/NagiosEnterprises/nr ... 33add157f4 ? I think Nagios is missing the backlog change for FreeBSD; that change is already in NRPE 2.16.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: Service check did not exit properly

Post by jdalrymple »

There is no runtime configurable option. Here is the code for lib/nsock.c:

Code: Select all

int nsock_unix(const char *path, unsigned int flags)
{
        struct sockaddr_un saun;
        struct sockaddr *sa;
        int sock = 0, mode;
        socklen_t slen;

        if(!path)
                return NSOCK_EINVAL;

        if(flags & NSOCK_TCP)
                mode = SOCK_STREAM;
        else if(flags & NSOCK_UDP)
                mode = SOCK_DGRAM;
        else
                return NSOCK_EINVAL;

        if((sock = socket(AF_UNIX, mode, 0)) < 0) {
                return NSOCK_ESOCKET;
        }

        /* set up the sockaddr_un struct and the socklen_t */
        sa = (struct sockaddr *)&saun;
        memset(&saun, 0, sizeof(saun));
        saun.sun_family = AF_UNIX;
        slen = strlen(path);
        memcpy(&saun.sun_path, path, slen);
        slen += offsetof(struct sockaddr_un, sun_path);

        /* unlink if we're supposed to, but not if we're connecting */
        if(flags & NSOCK_UNLINK && !(flags & NSOCK_CONNECT)) {
                if(unlink(path) < 0 && errno != ENOENT)
                        return NSOCK_EUNLINK;
        }

        if(flags & NSOCK_CONNECT) {
                if(connect(sock, sa, slen) < 0) {
                        close(sock);
                        return NSOCK_ECONNECT;
                }
                return sock;
        } else {
                if(bind(sock, sa, slen) < 0) {
                        close(sock);
                        return NSOCK_EBIND;
                }
        }

        if(!(flags & NSOCK_BLOCK) && fcntl(sock, F_SETFL, O_NONBLOCK) < 0)
                return NSOCK_EFCNTL;

        if(flags & NSOCK_UDP)
                return sock;

        if(listen(sock, 3) < 0) {
                close(sock);
                return NSOCK_ELISTEN;
        }

        return sock;
}
Here is the libc function that marks a socket as listening and sets its queue length (the backlog argument):

Code: Select all

SYNOPSIS
     #include <sys/types.h>
     #include <sys/socket.h>

     int
     listen(int s, int backlog);
If you replace the line

Code: Select all

        if(listen(sock, 3) < 0) {
with

Code: Select all

        if(listen(sock, 128) < 0) {
and recompile, you should have a larger queue on that socket. Let us know.
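As an alternative to hard-coding a new constant, the backlog could be read from configuration. The following is only a sketch of that idea and is not how Nagios Core does it: the function names, the default of 128, and the notion of passing the value in as a string (for example from an environment variable) are all assumptions made for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Convert a user-supplied string to a listen() backlog.
 * Returns fallback if the string is missing, not a number,
 * or outside the range 1..INT_MAX. */
static int backlog_from_string(const char *s, int fallback)
{
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return fallback;
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || *end != '\0' || v < 1 || v > INT_MAX)
        return fallback;
    return (int)v;
}

/* Create a listening AF_UNIX stream socket at path with the given
 * backlog. Returns the descriptor, or -1 on error. The kernel may
 * still cap the effective queue (kern.ipc.somaxconn on FreeBSD). */
int listen_unix(const char *path, int backlog)
{
    struct sockaddr_un saun;
    size_t len = strlen(path);
    int sock;

    if (len >= sizeof(saun.sun_path))
        return -1;                      /* path would not fit */
    if ((sock = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return -1;

    memset(&saun, 0, sizeof(saun));
    saun.sun_family = AF_UNIX;
    memcpy(saun.sun_path, path, len);

    if (unlink(path) < 0 && errno != ENOENT) {
        close(sock);
        return -1;
    }
    if (bind(sock, (struct sockaddr *)&saun, sizeof(saun)) < 0 ||
        listen(sock, backlog) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

A caller might use listen_unix(path, backlog_from_string(getenv("NSOCK_BACKLOG"), 128)), where NSOCK_BACKLOG is a made-up variable name. After restarting, `netstat -Lan` should show the new maxqlen for the socket if the change took effect.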
karven
Posts: 4
Joined: Wed Jun 10, 2015 3:12 pm

Re: Service check did not exit properly

Post by karven »

I have updated and recompiled. If you could include a patch for FreeBSD in your next version, that would be great. The queue length is very low, and I think the default value should be a bit higher; 1K is OK for me.
Thanks, I really appreciate your help.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Service check did not exit properly

Post by tmcdonald »

That could possibly be done - would you mind opening a separate issue on GitHub for this? That way the issue will be properly filed and won't fall through the cracks.
Former Nagios employee