Hi All,
I have an update and some questions...
UPDATE: -
We've moved to a new dedicated server, with SSDs instead of slow HDDs. Our I/O is way better and we have not seen the "Cannot Connect to The Database" error when accessing the CCM since.
However, we are experiencing problems with something called “grsec”, which seems to be preventing our programs from working properly.
We’re seeing loads of entries like this in /var/log/messages: -
Feb 14 12:48:23 Nagios kernel: grsec: failed fork with errno EAGAIN by /usr/local/nagios/bin/nagios[nagios:4525] uid/euid:508/508 gid/egid:100/100, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0
Feb 14 12:48:23 Nagios kernel: grsec: failed fork with errno EAGAIN by /usr/local/nagios/bin/nagios[nagios:595] uid/euid:508/508 gid/egid:100/100, parent /usr/local/nagios/bin/nagios[nagios:4525] uid/euid:508/508 gid/egid:100/100
Feb 14 12:48:23 Nagios kernel: grsec: failed fork with errno EAGAIN by /usr/local/nagios/bin/nagios[nagios:601] uid/euid:508/508 gid/egid:100/100, parent /usr/local/nagios/bin/nagios[nagios:4525] uid/euid:508/508 gid/egid:100/100
These are followed by the Nagios system logging a problem like so: -
Feb 14 12:48:23 Nagios nagios: Warning: The check of service 'someservice' on host 'somehost' could not be performed due to a fork() error: 'Resource temporarily unavailable'. The check will be rescheduled.
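To get a feel for how often this happens, I put together a rough tally of the grsec fork failures per binary. This is only a sketch, assuming Python 3 is available (the stock Python on this box is 2.6) and that the lines always look like the ones above; the function and pattern names are my own.

```python
import re
from collections import Counter

# Captures the errno name, the failing binary and its PID from the
# "failed fork" half of a grsec line (the text before ", parent").
GRSEC_FORK_RE = re.compile(
    r"grsec: failed fork with errno (?P<errno>\w+) by "
    r"(?P<binary>\S+)\[[^:\]]+:(?P<pid>\d+)\]"
)


def count_fork_failures(lines):
    """Count grsec fork failures, keyed by (binary, errno)."""
    counts = Counter()
    for line in lines:
        match = GRSEC_FORK_RE.search(line)
        if match:
            counts[(match.group("binary"), match.group("errno"))] += 1
    return counts


# Two of the kernel lines from this post plus one unrelated Nagios line.
SAMPLE = [
    "Feb 14 12:48:23 Nagios kernel: grsec: failed fork with errno EAGAIN by /usr/local/nagios/bin/nagios[nagios:4525] uid/euid:508/508 gid/egid:100/100, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0",
    "Feb 14 12:48:23 Nagios kernel: grsec: failed fork with errno EAGAIN by /usr/local/nagios/bin/nagios[nagios:595] uid/euid:508/508 gid/egid:100/100, parent /usr/local/nagios/bin/nagios[nagios:4525] uid/euid:508/508 gid/egid:100/100",
    "Feb 14 12:48:23 Nagios nagios: Warning: The check of service 'someservice' on host 'somehost' could not be performed due to a fork() error: 'Resource temporarily unavailable'. The check will be rescheduled.",
]

for (binary, errno_name), n in count_fork_failures(SAMPLE).items():
    print(n, errno_name, binary)  # prints: 2 EAGAIN /usr/local/nagios/bin/nagios
```

For the real log, the same function can be fed an open file handle for /var/log/messages instead of SAMPLE.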
Q1. Is "grsec" something that Nagios XI installed, and can we configure it, or is there something else we can do to stop this happening?
I note that /var/log/messages is full of Nagios messages, which seem to be duplicates of those also being written to /usr/local/nagios/var/nagios.log. That seems really inefficient with regard to I/O and somewhat redundant.
These log entries have also triggered imuxsock rate limiting, and I can see entries like this (PID 4525 was nagios): -
Feb 14 12:18:08 Nagios rsyslogd-2177: imuxsock begins to drop messages from pid 4525 due to rate-limiting
Feb 14 12:18:20 Nagios rsyslogd-2177: imuxsock lost 90 messages from pid 4525 due to rate-limiting
I Googled and found this (http://www.rsyslog.com/tag/rate-limiting/), which allowed me to alter the rates, so that it didn't hamper my debugging any more.
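For reference, this is roughly how I totalled what rsyslog reported as lost. It is a minimal sketch, assuming Python 3 and the exact imuxsock wording shown in the entries above; the names are my own and nothing here comes from rsyslog itself.

```python
import re
from collections import defaultdict

# Matches only the "lost N messages" lines; the "begins to drop" lines
# carry no count, so they are ignored.
IMUXSOCK_LOST_RE = re.compile(
    r"imuxsock lost (?P<count>\d+) messages from pid (?P<pid>\d+) "
    r"due to rate-limiting"
)


def lost_messages_by_pid(lines):
    """Sum the messages rsyslog reports as dropped, keyed by PID."""
    totals = defaultdict(int)
    for line in lines:
        match = IMUXSOCK_LOST_RE.search(line)
        if match:
            totals[int(match.group("pid"))] += int(match.group("count"))
    return dict(totals)


# The two rsyslogd entries from this post.
SAMPLE = [
    "Feb 14 12:18:08 Nagios rsyslogd-2177: imuxsock begins to drop messages from pid 4525 due to rate-limiting",
    "Feb 14 12:18:20 Nagios rsyslogd-2177: imuxsock lost 90 messages from pid 4525 due to rate-limiting",
]

print(lost_messages_by_pid(SAMPLE))  # prints: {4525: 90}
```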
Q2. Is there something we can do to stop that additional logging to /var/log/messages?
I was also wondering if we needed to upgrade the system, just in case there were updates/fixes for these problems, so I had a look at what was available with "yum update": -
====================================================================================
 Package                   Arch    Version                          Repository  Size
====================================================================================
Updating:
 bind                      x86_64  32:9.8.2-0.10.rc1.el6_3.6        updates    4.0 M
 bind-chroot               x86_64  32:9.8.2-0.10.rc1.el6_3.6        updates     70 k
 bind-libs                 x86_64  32:9.8.2-0.10.rc1.el6_3.6        updates    871 k
 bind-utils                x86_64  32:9.8.2-0.10.rc1.el6_3.6        updates    182 k
 cpio                      x86_64  2.10-11.el6_3                    updates    192 k
 dbus                      x86_64  1:1.2.24-7.el6_3                 updates    207 k
 dbus-libs                 x86_64  1:1.2.24-7.el6_3                 updates    127 k
 device-mapper             x86_64  1.02.74-10.el6_3.3               updates    135 k
 device-mapper-event       x86_64  1.02.74-10.el6_3.3               updates     88 k
 device-mapper-event-libs  x86_64  1.02.74-10.el6_3.3               updates     83 k
 device-mapper-libs        x86_64  1.02.74-10.el6_3.3               updates    163 k
 dhclient                  x86_64  12:4.1.1-31.0.1.P1.el6.centos.1  updates    317 k
 dhcp-common               x86_64  12:4.1.1-31.0.1.P1.el6.centos.1  updates    141 k
 epel-release              noarch  6-8                              epel        14 k
 initscripts               x86_64  9.03.31-2.el6.centos.1           updates    935 k
 irqbalance                x86_64  2:0.55-35.el6_3                  updates     25 k
 libblkid                  x86_64  2.17.2-12.7.el6_3                updates    112 k
 libuuid                   x86_64  2.17.2-12.7.el6_3                updates     65 k
 lvm2                      x86_64  2.02.95-10.el6_3.3               updates    615 k
 lvm2-libs                 x86_64  2.02.95-10.el6_3.3               updates    680 k
 nspr                      x86_64  4.9.2-0.el6_3.1                  updates    111 k
 nss                       x86_64  3.13.6-2.el6_3                   updates    770 k
 nss-sysinit               x86_64  3.13.6-2.el6_3                   updates     32 k
 nss-tools                 x86_64  3.13.6-2.el6_3                   updates    730 k
 nss-util                  x86_64  3.13.6-1.el6_3                   updates     53 k
 openssh                   x86_64  5.3p1-81.el6_3                   updates    236 k
 openssh-clients           x86_64  5.3p1-81.el6_3                   updates    358 k
 openssh-server            x86_64  5.3p1-81.el6_3                   updates    300 k
 psacct                    x86_64  6.3.2-63.el6_3.3                 updates     68 k
 python                    x86_64  2.6.6-29.el6_3.3                 updates    4.8 M
 python-libs               x86_64  2.6.6-29.el6_3.3                 updates    623 k
 redhat-logos              noarch  60.0.14-12.el6.centos            updates     15 M
 selinux-policy            noarch  3.7.19-155.el6_3.14              updates    1.3 M
 selinux-policy-targeted   noarch  3.7.19-155.el6_3.14              updates    2.6 M
 strace                    x86_64  4.5.19-1.11.el6_3.2              updates    171 k
 sudo                      x86_64  1.7.4p5-13.el6_3                 updates    423 k
 tzdata                    noarch  2012j-1.el6                      updates    453 k
 util-linux-ng             x86_64  2.17.2-12.7.el6_3                updates    1.5 M

Transaction Summary
====================================================================================
Upgrade      38 Package(s)
Q3. I said no to the update, but is this something which we should be doing periodically?
Cheers,
--
ChrisP