Re: Nagios showing error 500 sometimes

Posted: Tue Jan 19, 2016 10:06 am
by zulu42
Short update:
The error still occurs even with 8 CPU cores and 16GB of RAM, so I rolled back to 4 CPU cores and 12GB of RAM.

Does anyone have an idea?

Re: Nagios showing error 500 sometimes

Posted: Tue Jan 19, 2016 10:38 am
by rkennedy
zulu42 wrote: Short update:

top|head -5
top - 07:21:43 up 35 days, 21:02, 4 users, load average: 8.12, 5.53, 4.29
Tasks: 173 total, 2 running, 168 sleeping, 3 stopped, 0 zombie
%Cpu(s): 5.7 us, 1.5 sy, 0.0 ni, 92.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 8011552 total, 1974816 free, 775024 used, 5261712 buff/cache
KiB Swap: 1257468 total, 1256288 free, 1180 used. 6589452 avail Mem

I've upgraded the RAM to 12GB. I still have four CPU cores:
top - 13:50:47 up 1:11, 2 users, load average: 0.02, 0.06, 0.11
Tasks: 184 total, 3 running, 181 sleeping, 0 stopped, 0 zombie
%Cpu(s): 6.1 us, 1.3 sy, 0.0 ni, 92.5 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 12140304 total, 11242764 free, 434120 used, 463420 buff/cache
KiB Swap: 1257468 total, 1257468 free, 0 used. 11446752 avail Mem
After you rebooted and the load went down, did the issue occur again right away? Can you run the command ps -ef and post the result?
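
A load average of 8.12 next to roughly 93% idle CPU can point to processes blocked on I/O rather than a CPU shortage, because Linux counts tasks in uninterruptible sleep (state D) toward the load average. The following is a minimal sketch, assuming the output of `ps -eo pid,stat,comm` has been captured as text. The function name and the sample data are made up for illustration and do not come from the server in this thread.

```python
def blocked_processes(ps_output):
    """Return (pid, command) pairs for processes in uninterruptible sleep.

    Expects the text produced by `ps -eo pid,stat,comm`. A STAT value that
    starts with "D" marks a task that is usually waiting on I/O.
    """
    blocked = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # ignore malformed or truncated lines
        pid, stat, command = fields
        if stat.startswith("D"):
            blocked.append((int(pid), command.strip()))
    return blocked


# Hypothetical sample, not real output from this thread.
SAMPLE = """\
  PID STAT COMMAND
    1 Ss   systemd
 3028 Ds   nagios
 7080 S    httpd
 7081 D    httpd
"""

print(blocked_processes(SAMPLE))
```

If the list is regularly non-empty while the load is high, slow storage is a more likely suspect than CPU or RAM.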

Re: Nagios showing error 500 sometimes

Posted: Fri Jan 22, 2016 6:08 am
by zulu42
Yes, it occurred again right away.
Here is the output of ps -ef:

Code:

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 Jan19 ?        00:00:05 /usr/lib/systemd/systemd --switched-root --system --deserialize 24
root         2     0  0 Jan19 ?        00:00:00 [kthreadd]
root         3     2  0 Jan19 ?        00:00:03 [ksoftirqd/0]
root         5     2  0 Jan19 ?        00:00:00 [kworker/0:0H]
root         7     2  0 Jan19 ?        00:00:08 [migration/0]
root         8     2  0 Jan19 ?        00:00:00 [rcu_bh]
root         9     2  0 Jan19 ?        00:00:00 [rcuob/0]
root        10     2  0 Jan19 ?        00:00:00 [rcuob/1]
root        11     2  0 Jan19 ?        00:00:00 [rcuob/2]
root        12     2  0 Jan19 ?        00:00:00 [rcuob/3]
root        13     2  0 Jan19 ?        00:02:47 [rcu_sched]
root        14     2  0 Jan19 ?        00:01:41 [rcuos/0]
root        15     2  0 Jan19 ?        00:00:51 [rcuos/1]
root        16     2  0 Jan19 ?        00:01:02 [rcuos/2]
root        17     2  0 Jan19 ?        00:00:43 [rcuos/3]
root        18     2  0 Jan19 ?        00:00:00 [watchdog/0]
root        19     2  0 Jan19 ?        00:00:00 [watchdog/1]
root        20     2  0 Jan19 ?        00:00:07 [migration/1]
root        21     2  0 Jan19 ?        00:00:01 [ksoftirqd/1]
root        23     2  0 Jan19 ?        00:00:00 [kworker/1:0H]
root        24     2  0 Jan19 ?        00:00:01 [watchdog/2]
root        25     2  0 Jan19 ?        00:00:07 [migration/2]
root        26     2  0 Jan19 ?        00:00:01 [ksoftirqd/2]
root        28     2  0 Jan19 ?        00:00:00 [kworker/2:0H]
root        29     2  0 Jan19 ?        00:00:00 [watchdog/3]
root        30     2  0 Jan19 ?        00:00:07 [migration/3]
root        31     2  0 Jan19 ?        00:00:02 [ksoftirqd/3]
root        33     2  0 Jan19 ?        00:00:00 [kworker/3:0H]
root        34     2  0 Jan19 ?        00:00:00 [khelper]
root        35     2  0 Jan19 ?        00:00:00 [kdevtmpfs]
root        36     2  0 Jan19 ?        00:00:00 [netns]
root        37     2  0 Jan19 ?        00:00:00 [writeback]
root        38     2  0 Jan19 ?        00:00:00 [kintegrityd]
root        39     2  0 Jan19 ?        00:00:00 [bioset]
root        40     2  0 Jan19 ?        00:00:00 [kblockd]
root        41     2  0 Jan19 ?        00:00:00 [khubd]
root        42     2  0 Jan19 ?        00:00:00 [md]
root        45     2  0 Jan19 ?        00:00:00 [khungtaskd]
root        46     2  0 Jan19 ?        00:00:00 [kswapd0]
root        47     2  0 Jan19 ?        00:00:00 [ksmd]
root        48     2  0 Jan19 ?        00:00:01 [khugepaged]
root        49     2  0 Jan19 ?        00:00:00 [fsnotify_mark]
root        50     2  0 Jan19 ?        00:00:00 [crypto]
root        58     2  0 Jan19 ?        00:00:00 [kthrotld]
root        60     2  0 Jan19 ?        00:00:00 [kmpath_rdacd]
root        62     2  0 Jan19 ?        00:00:00 [kpsmoused]
root        82     2  0 Jan19 ?        00:00:00 [deferwq]
root       105     2  0 Jan19 ?        00:00:00 [kauditd]
root       287     2  0 Jan19 ?        00:00:00 [mpt_poll_0]
root       288     2  0 Jan19 ?        00:00:00 [mpt/0]
root       289     2  0 Jan19 ?        00:00:00 [ata_sff]
root       295     2  0 Jan19 ?        00:00:00 [scsi_eh_0]
root       296     2  0 Jan19 ?        00:00:00 [scsi_tmf_0]
root       297     2  0 Jan19 ?        00:00:00 [scsi_eh_1]
root       298     2  0 Jan19 ?        00:00:00 [scsi_tmf_1]
root       299     2  0 Jan19 ?        00:00:00 [scsi_eh_2]
root       304     2  0 Jan19 ?        00:00:00 [scsi_tmf_2]
root       306     2  0 Jan19 ?        00:00:00 [ttm_swap]
root       375     2  0 Jan19 ?        00:00:00 [kdmflush]
root       376     2  0 Jan19 ?        00:00:00 [bioset]
root       383     2  0 Jan19 ?        00:00:00 [kdmflush]
root       384     2  0 Jan19 ?        00:00:00 [bioset]
root       403     2  0 Jan19 ?        00:00:00 [xfsalloc]
root       404     2  0 Jan19 ?        00:00:00 [xfs_mru_cache]
root       405     2  0 Jan19 ?        00:00:00 [xfs-buf/dm-1]
root       406     2  0 Jan19 ?        00:00:00 [xfs-data/dm-1]
root       407     2  0 Jan19 ?        00:00:00 [xfs-conv/dm-1]
root       408     2  0 Jan19 ?        00:00:00 [xfs-cil/dm-1]
root       409     2  0 Jan19 ?        00:00:24 [kworker/0:1H]
root       410     2  0 Jan19 ?        00:01:31 [xfsaild/dm-1]
root       419     2  0 Jan19 ?        00:00:05 [kworker/3:1H]
root       457     2  0 Jan19 ?        00:00:08 [kworker/1:1H]
root       466     2  0 Jan19 ?        00:00:06 [kworker/2:1H]
root       480     1  0 Jan19 ?        00:00:10 /usr/lib/systemd/systemd-journald
root       498     1  0 Jan19 ?        00:00:00 /usr/sbin/lvmetad -f
root       506     1  0 Jan19 ?        00:00:00 /usr/lib/systemd/systemd-udevd
root       559     2  0 Jan19 ?        00:00:00 [xfs-buf/sda1]
root       561     2  0 Jan19 ?        00:00:00 [xfs-data/sda1]
root       562     2  0 Jan19 ?        00:00:00 [xfs-conv/sda1]
root       563     2  0 Jan19 ?        00:00:00 [xfs-cil/sda1]
root       564     2  0 Jan19 ?        00:00:00 [xfsaild/sda1]
root       565     2  0 Jan19 ?        00:00:00 [xfs-buf/sdb1]
root       566     2  0 Jan19 ?        00:00:00 [xfs-data/sdb1]
root       567     2  0 Jan19 ?        00:00:00 [xfs-conv/sdb1]
root       568     2  0 Jan19 ?        00:00:00 [xfs-cil/sdb1]
root       569     2  0 Jan19 ?        00:00:54 [xfsaild/sdb1]
root       626     1  0 Jan19 ?        00:00:00 /sbin/auditd -n
root       648     1  0 Jan19 ?        00:00:03 /usr/sbin/rsyslogd -n
dbus       649     1  0 Jan19 ?        00:00:00 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       656     1  0 Jan19 ?        00:00:00 /bin/sh /usr/lib/pcsd/pcsd start
root       659     1  0 Jan19 ?        00:00:22 /usr/bin/python -Es /usr/sbin/tuned -l -P
root       660     1  0 Jan19 ?        00:00:08 /usr/sbin/irqbalance --foreground
root       663     1  0 Jan19 ?        00:00:00 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid
root       673   656  0 Jan19 ?        00:00:00 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb
root       674   673  0 Jan19 ?        00:00:32 /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb
root       678     1  0 Jan19 ?        00:00:07 /usr/sbin/sssd -D -f
root       707   678  0 Jan19 ?        00:02:26 /usr/libexec/sssd/sssd_be --domain LDAP --uid 0 --gid 0 --debug-to-files
root       749   678  0 Jan19 ?        00:00:06 /usr/libexec/sssd/sssd_nss --uid 0 --gid 0 --debug-to-files
root       750   678  0 Jan19 ?        00:00:01 /usr/libexec/sssd/sssd_pam --uid 0 --gid 0 --debug-to-files
root       756     1  0 Jan19 ?        00:00:00 /usr/lib/systemd/systemd-logind
root       765     1  0 Jan19 ?        00:00:00 /usr/sbin/NetworkManager --no-daemon
polkitd    900     1  0 Jan19 ?        00:00:00 /usr/lib/polkit-1/polkitd --no-debug
root      1204     1  0 Jan19 ?        00:00:00 /usr/sbin/sshd -D
root      1212     1  0 Jan19 ?        00:00:00 /usr/bin/rhsmcertd
root      1217     1  0 Jan19 ?        00:00:21 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
root      1222     1  0 Jan19 ?        00:00:03 /usr/sbin/automount --pid-file /run/autofs.pid
root      1228     1  0 Jan19 ?        00:00:08 sendmail: accepting connections
smmsp     1242     1  0 Jan19 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root      1282     1  0 Jan19 ?        00:00:06 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/fe496e05cfe3194a8d5107f8953f3f81.socket --xlator-option *replicate*.node-uuid=5e05ab22-84b1-479f-b9f1-91cc9b679f84
root      1293     1  0 Jan19 ?        00:17:19 /usr/sbin/glusterfsd -s nagios001 --volfile-id gv0.nagios001.mnt-data-brick -p /var/lib/glusterd/vols/gv0/run/nagios001-mnt-data-brick.pid -S /var/run/gluster/9786b85d10790bebf4efafab1dd6647e.socket --brick-name /mnt/data/brick -l /var/log/glusterfs/bricks/mnt-data-brick.log --xlator-option *-posix.glusterd-uuid=5e05ab22-84b1-479f-b9f1-91cc9b679f84 --brick-port 49153 --xlator-option gv0-server.listen-port=49153
root      1327     1  0 Jan19 ?        00:27:55 /usr/sbin/glusterfs --volfile-server=nagios001 --volfile-id=/gv0 /mnt/nagios_data
root      1404     1  0 Jan19 ?        00:00:00 /usr/sbin/crond -n
root      1433     1  0 Jan19 tty1     00:00:00 /sbin/agetty --noclear tty1 linux
root      2415  1204  0 Jan19 ?        00:00:00 sshd: XXX [priv]
g706314   2420  2415  0 Jan19 ?        00:00:01 sshd: XXX@pts/0
g706314   2421  2420  0 Jan19 pts/0    00:00:00 -bash
root      2447  2421  0 Jan19 pts/0    00:00:00 sudo -i
root      2448  2447  0 Jan19 pts/0    00:00:00 -bash
root      2505     1  0 Jan19 ?        00:40:54 corosync
root      2524     1  0 Jan19 ?        00:00:16 /usr/sbin/pacemakerd -f
haclust+  2525  2524  0 Jan19 ?        00:00:18 /usr/libexec/pacemaker/cib
root      2526  2524  0 Jan19 ?        00:00:15 /usr/libexec/pacemaker/stonithd
root      2527  2524  0 Jan19 ?        00:00:41 /usr/libexec/pacemaker/lrmd
haclust+  2528  2524  0 Jan19 ?        00:00:15 /usr/libexec/pacemaker/attrd
haclust+  2529  2524  0 Jan19 ?        00:00:13 /usr/libexec/pacemaker/pengine
haclust+  2530  2524  0 Jan19 ?        00:00:22 /usr/libexec/pacemaker/crmd
root      2997     1  0 Jan19 ?        00:00:16 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
nagios    3028     1  0 Jan19 ?        00:19:30 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
root      4517     2  0 10:49 ?        00:00:00 [kworker/2:1]
root      5128     2  0 10:50 ?        00:00:00 [kworker/3:1]
root      5594     2  0 10:51 ?        00:00:00 [kworker/0:2]
root      5767  1204  0 10:52 ?        00:00:00 sshd: XXX [priv]
g706315   6412  5767  0 10:52 ?        00:00:00 sshd: XXX@pts/1
g706315   6413  6412  0 10:52 pts/1    00:00:00 -bash
root      6475  6413  0 10:52 pts/1    00:00:00 sudo -i
root      6476  6475  0 10:52 pts/1    00:00:00 -bash
root      6849  2448  0 10:52 pts/0    00:00:00 ps -ef
apache    7080  2997  0 09:35 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache    7081  2997  0 09:35 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache    7082  2997  0 09:35 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache    7951  2997  0 09:36 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache    7994  2997  0 09:36 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache    9282  2997  0 09:38 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
root     10040     2  0 10:34 ?        00:00:00 [kworker/2:0]
root     11225     2  0 10:34 ?        00:00:00 [kworker/3:0]
root     11671     2  0 10:35 ?        00:00:00 [kworker/0:1]
apache   15003  2997  0 09:39 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
root     18697     2  0 Jan21 ?        00:00:35 [kworker/1:0]
root     20331     2  0 10:40 ?        00:00:00 [kworker/3:2]
root     21018     2  0 10:41 ?        00:00:00 [kworker/0:0]
apache   21719  2997  0 09:44 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache   23923  2997  0 09:27 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
root     24540     2  0 10:44 ?        00:00:00 [kworker/u8:0]
root     27779     2  0 09:49 ?        00:00:00 [kworker/u8:2]
root     28143     2  0 10:44 ?        00:00:00 [kworker/2:2]
root     29450     2  0 10:46 ?        00:00:00 [kworker/1:1]
root     30456  1204  0 09:11 ?        00:00:00 sshd: XXX [priv]
g705992  30499 30456  0 09:11 ?        00:00:00 sshd: XXX@pts/2
g705992  30500 30499  0 09:11 pts/2    00:00:00 -bash
root     30933 30500  0 09:12 pts/2    00:00:00 sudo -i
root     30934 30933  0 09:12 pts/2    00:00:00 -bash
apache   31474  2997  0 09:30 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid

Re: Nagios showing error 500 sometimes

Posted: Fri Jan 22, 2016 3:21 pm
by tmcdonald
Are your configuration/log files stored on a network drive? It's possible there is some disconnect periodically that makes the CGI unable to correctly determine permissions.
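
One way to answer this question on a Linux host is to look up which mount point contains a given file and read its filesystem type. The sketch below assumes text in `/proc/mounts` format. The function name, the sample mount table, and the file paths are illustrative assumptions (the gluster mount line is only modeled on the `glusterfs` command visible in the ps output above).

```python
import os

# Filesystem types that indicate network-backed storage.
NETWORK_TYPES = {"fuse.glusterfs", "nfs", "nfs4", "cifs"}


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount point that contains `path`.

    `mounts_text` uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # The longest mount point is the most specific match; on a tie
            # the later entry wins, since it was mounted on top.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


# Hypothetical mount table for illustration only.
SAMPLE_MOUNTS = """\
/dev/mapper/root / xfs rw,relatime 0 0
/dev/sdb1 /mnt/data xfs rw,relatime 0 0
nagios001:/gv0 /mnt/nagios_data fuse.glusterfs rw,relatime 0 0
"""

fs = filesystem_type("/mnt/nagios_data/status.dat", SAMPLE_MOUNTS)
print(fs, fs in NETWORK_TYPES)
```

On a real system the second argument would be the contents of `/proc/mounts`, for example `open("/proc/mounts").read()`.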

Re: Nagios showing error 500 sometimes

Posted: Mon Jan 25, 2016 2:06 am
by zulu42
Yes, nagios.log and status.dat are currently written to a glusterfs storage in a master/master (active/active) configuration.
The check configuration and host configuration are stored on this glusterfs as well.
I have attached a second disk to the server on which the glusterfs is running.
Thanks for the hint. I'll move status.dat back to its original folder and see what happens. If that doesn't solve the problem, I'll also move nagios.log back to its original folder, and I'll keep you updated.
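
In Nagios Core the locations of these two files are set by the `status_file` and `log_file` directives in the main configuration file, so moving them means changing those values and restarting Nagios. The following is a small sketch for checking where they currently point. The helper name and the sample paths are assumptions for illustration, not the actual configuration from this server.

```python
def read_directives(config_text, names=("status_file", "log_file")):
    """Return {directive: value} for the requested nagios.cfg directives."""
    found = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() in names:
            found[key.strip()] = value.strip()
    return found


# Hypothetical excerpt of a nagios.cfg; the paths are made up.
SAMPLE_CFG = """\
# main configuration
log_file=/mnt/nagios_data/nagios.log
cfg_dir=/mnt/nagios_data/conf.d
status_file=/mnt/nagios_data/status.dat
"""

for name, value in sorted(read_directives(SAMPLE_CFG).items()):
    print(name, "->", value)
```

On the server itself, the text would be read from the file passed to the daemon, which is `/etc/nagios/nagios.cfg` according to the ps output above.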

Re: Nagios showing error 500 sometimes

Posted: Mon Jan 25, 2016 10:40 am
by rkennedy
Sounds good, let us know the result.

Re: Nagios showing error 500 sometimes

Posted: Thu Jan 28, 2016 2:22 am
by zulu42
Thanks again for the hint.
The glusterfs really was causing the problem.
I moved the status.dat file back to its original folder on the root disk, and that alone solved the problem.
I also read a post saying that you really should not include a glusterfs in updatedb. Maybe this is related as well.
Anyway, I ran a short test out of curiosity:
I put the status.dat file on an NFS share, which works perfectly fine (in case someone plans to do this).

Thanks again to everyone who read this topic, commented on it, and helped me.
This topic can be closed now.

Re: Nagios showing error 500 sometimes

Posted: Thu Jan 28, 2016 10:54 am
by hsmith
The load looks a lot healthier. Are you still running into issues?

Re: Nagios showing error 500 sometimes

Posted: Fri Jan 29, 2016 2:01 am
by zulu42
I haven't had any issues since I moved the status.dat file back to the root disk.

Re: Nagios showing error 500 sometimes

Posted: Fri Jan 29, 2016 11:20 am
by hsmith
Good to hear. Thanks for sharing what the problem was. I'll go ahead and close this thread and mark it 'resolved'.