Re: Scheduling queue freezes

Posted: Fri Jan 08, 2016 3:02 am
by aisebouma
rkennedy wrote: I found a post here that may correlate to the issue you are experiencing - https://support.nagios.com/forum/viewto ... 10#p112580

Can you post the contents of the file below for us to review?

Code:

/etc/sysctl.conf 

Code:

cat /etc/sysctl.conf
#
# /etc/sysctl.conf - Configuration file for setting system variables
# See /etc/sysctl.d/ for additional system variables.
# See sysctl.conf (5) for information.
#

#kernel.domainname = example.com

# Uncomment the following to stop low-level messages on console
#kernel.printk = 3 4 1 3

##############################################################3
# Functions previously found in netbase
#

# Uncomment the next two lines to enable Spoof protection (reverse-path filter)
# Turn on Source Address Verification in all interfaces to
# prevent some spoofing attacks
#net.ipv4.conf.default.rp_filter=1
#net.ipv4.conf.all.rp_filter=1

# Uncomment the next line to enable TCP/IP SYN cookies
# See http://lwn.net/Articles/277146/
# Note: This may impact IPv6 TCP sessions too
#net.ipv4.tcp_syncookies=1

# Uncomment the next line to enable packet forwarding for IPv4
#net.ipv4.ip_forward=1

# Uncomment the next line to enable packet forwarding for IPv6
#  Enabling this option disables Stateless Address Autoconfiguration
#  based on Router Advertisements for this host
#net.ipv6.conf.all.forwarding=1


###################################################################
# Additional settings - these settings can improve the network
# security of the host and prevent against some network attacks
# including spoofing attacks and man in the middle attacks through
# redirection. Some network environments, however, require that these
# settings are disabled so review and enable them as needed.
#
# Do not accept ICMP redirects (prevent MITM attacks)
#net.ipv4.conf.all.accept_redirects = 0
#net.ipv6.conf.all.accept_redirects = 0
# _or_
# Accept ICMP redirects only for gateways listed in our default
# gateway list (enabled by default)
# net.ipv4.conf.all.secure_redirects = 1
#
# Do not send ICMP redirects (we are not a router)
#net.ipv4.conf.all.send_redirects = 0
#
# Do not accept IP source route packets (we are not a router)
#net.ipv4.conf.all.accept_source_route = 0
#net.ipv6.conf.all.accept_source_route = 0
#
# Log Martian Packets
#net.ipv4.conf.all.log_martians = 1
#

Re: Scheduling queue freezes

Posted: Fri Jan 08, 2016 3:16 pm
by rkennedy
Can you give the instructions I linked to a try and let us know if it occurs again?
Open /etc/sysctl.conf with a text editor. Edit the file to match the following values:

# Controls the default maximum size of a message queue, in bytes
kernel.msgmnb = 131072000

# Controls the maximum size of a single message, in bytes
kernel.msgmax = 131072000

# Controls the maximum size of a single shared memory segment, in bytes
kernel.shmmax = 4294967295

# Controls the total amount of shared memory available system-wide, in pages
kernel.shmall = 268435456

# Controls the maximum number of message queues (queue identifiers) system-wide
kernel.msgmni = 256000


Note: If you don't have these entries in the "/etc/sysctl.conf" file, just add them to the end of the file.

After these settings are saved to the file, run:

sysctl -p
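
To double-check that the file really contains the values listed above, something like the following could be used. This is only a minimal sketch, assuming a POSIX shell with awk; the function name check_sysctl_file is hypothetical, and it inspects the file only, not the values the running kernel is using.

```shell
# check_sysctl_file FILE: report whether each key from the list above is set
# to the expected value in FILE (hypothetical helper; reads the file only).
check_sysctl_file() {
    file=$1
    rc=0
    for pair in \
        kernel.msgmnb=131072000 \
        kernel.msgmax=131072000 \
        kernel.shmmax=4294967295 \
        kernel.shmall=268435456 \
        kernel.msgmni=256000
    do
        key=${pair%%=*}
        want=${pair#*=}
        # Take the last uncommented assignment of the key, ignoring spaces.
        got=$(awk -F '=' -v k="$key" '
            /^[[:space:]]*#/ { next }
            { name = $1; gsub(/[[:space:]]/, "", name) }
            name == k { val = $2; gsub(/[[:space:]]/, "", val); found = val }
            END { print found }
        ' "$file")
        if [ "$got" = "$want" ]; then
            echo "OK      $key = $got"
        else
            echo "DIFFERS $key: file has '${got:-unset}', expected $want"
            rc=1
        fi
    done
    return $rc
}
```

Run it as check_sysctl_file /etc/sysctl.conf. To see what the kernel is actually using after sysctl -p, a single value can be read with sysctl -n kernel.msgmnb.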

Re: Scheduling queue freezes

Posted: Tue Jan 12, 2016 3:09 am
by aisebouma
OK, I followed your instructions. I will let you know if this fixes it.

Re: Scheduling queue freezes

Posted: Tue Jan 12, 2016 10:39 am
by rkennedy
Sounds good. I'll leave this thread open; let us know if this works for you.

Re: Scheduling queue freezes

Posted: Thu Jan 14, 2016 4:35 am
by aisebouma
Unfortunately the scheduling queue froze again yesterday. I will leave it in this state in case you want to look at some files while the scheduling is not working.

Re: Scheduling queue freezes

Posted: Thu Jan 14, 2016 5:30 pm
by tmcdonald
This might be related to an NDO bug that was fixed in recent versions. What is the output of ipcs -q on your Nagios system?

Re: Scheduling queue freezes

Posted: Fri Jan 15, 2016 4:06 am
by aisebouma
tmcdonald wrote: This might be related to an NDO bug that was fixed in recent versions. What is the output of ipcs -q on your Nagios system?

Code:

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0x68010002 65536      nagios     600        16384        16
0xc3013798 131073     nagios     600        16384        16
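
For anyone repeating this check, the ipcs -q output can be condensed to one line per queue and compared against a byte limit. The following is a rough sketch, assuming a POSIX shell with awk and the column layout shown above; summarize_queues is a hypothetical name, and the limit is whatever the caller passes in, not something the function reads from the kernel.

```shell
# summarize_queues LIMIT: read `ipcs -q` output on stdin and print, per queue,
# the msqid, used bytes, message count, and used bytes as a whole-number
# percentage of LIMIT. LIMIT is supplied by the caller (an assumption).
summarize_queues() {
    awk -v limit="${1:?usage: summarize_queues LIMIT}" '
        # Data rows start with a hexadecimal key such as 0x68010002.
        $1 ~ /^0x[0-9a-fA-F]+$/ && NF >= 6 {
            printf "%s used=%d msgs=%d pct=%d\n", $2, $5, $6, ($5 * 100) / limit
        }
    '
}
```

Example: ipcs -q | summarize_queues "$(sysctl -n kernel.msgmnb)". Keep in mind that kernel.msgmnb is the default applied when a queue is created, so a queue that already existed before the setting was changed may have a different limit than the one passed in here.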

Re: Scheduling queue freezes

Posted: Fri Jan 15, 2016 3:32 pm
by tmcdonald
Queue looks fine, and our dev said he found nothing of note in the retention.dat.

If you are comfortable patching and recompiling Core, we have a patch that will add in some debug logging for the Compensating messages. I have attached it here, let us know if you need assistance.

Re: Scheduling queue freezes

Posted: Mon Jan 18, 2016 4:52 am
by aisebouma
tmcdonald wrote: Queue looks fine, and our dev said he found nothing of note in the retention.dat.

If you are comfortable patching and recompiling Core, we have a patch that will add in some debug logging for the Compensating messages. I have attached it here, let us know if you need assistance.

I patched it and recompiled, and I am now running the patched version. Should I report back when the queue freezes again?

Re: Scheduling queue freezes

Posted: Mon Jan 18, 2016 11:14 am
by tmcdonald
Yes. We are looking for the "DebugQueueFreeze" messages in nagios.log.
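
A quick way to see whether any of those messages have been logged yet is to count matching lines. This is a minimal sketch, assuming a POSIX shell with grep; count_queue_debug is a hypothetical name, and only the "DebugQueueFreeze" string comes from this thread (the rest of the message format is not assumed).

```shell
# count_queue_debug LOGFILE: print how many lines in LOGFILE contain the
# "DebugQueueFreeze" marker.
count_queue_debug() {
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so the exit status is deliberately ignored here.
    grep -c 'DebugQueueFreeze' "$1" || true
}
```

Example: count_queue_debug /usr/local/nagios/var/nagios.log (that path is an assumption based on a default source install; adjust it to wherever your nagios.log lives). To read the most recent entries rather than count them, grep 'DebugQueueFreeze' on the same file piped through tail works as well.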