Re: Database Repair has no effect

Posted: Tue Jan 08, 2019 12:58 pm
by matt.lilek
Well Scott, I tried logging in again and was able to log in. Everything appears to be running normally; however, it seems to be firing off a whole bunch of erroneous emails. I think all of this started with a major outage, so it might still be catching up with all the down hosts and services. Can we just purge all of those and start fresh on the alerts? Please let me know what I need to do. Thank you.

Re: Database Repair has no effect

Posted: Tue Jan 08, 2019 3:07 pm
by tgriep
Your server is using the PostgreSQL database, so you would run the following to truncate the SQL tables and clear out the data while the services are not running.
Run all of the commands as root.

Code: Select all

service crond stop
service nagios stop
service ndo2db stop
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | psql nagiosxi nagiosxi
service ndo2db start
service nagios start
service crond start
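
If you would rather script the sequence above, here is a minimal sketch. The `run` and `purge_xi_events` helpers and the `DRY_RUN` switch are hypothetical names made up for illustration and are not part of Nagios XI; only the `service` and `psql` commands come from the steps above. It defaults to printing the commands so you can review them first.

```shell
#!/bin/sh
# Sketch only: wraps the stop / truncate / start sequence shown above.
# DRY_RUN, run() and purge_xi_events() are hypothetical helpers.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = actually run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

purge_xi_events() {
    # Stop the services in the same order as the manual steps.
    for svc in crond nagios ndo2db; do
        run service "$svc" stop || return 1
    done

    sql="truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;"
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ echo \"$sql\" | psql nagiosxi nagiosxi"
    else
        echo "$sql" | psql nagiosxi nagiosxi
    fi
    rc=$?

    # Start everything again in reverse order, even if psql failed.
    for svc in ndo2db nagios crond; do
        run service "$svc" start
    done
    return $rc
}
```

Source the file as root and call `purge_xi_events` to see the seven commands it would run; set `DRY_RUN=0` beforehand only once the printed list looks right.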
Your previous post shows that /var/nagiosramdisk is full, and we need to look at that.
Can you run the following commands as root and post the output?

Code: Select all

ps -ef --cols=300
tail -50 /usr/local/nagios/var/perfdata.log
tail -50 /usr/local/nagios/var/npcd.log
ls /var/nagiosramdisk/spool/xidpe | wc -l
ls /var/nagiosramdisk/spool/perfdata/ | wc -l
ls /var/nagiosramdisk/spool/checkresults/ | wc -l
tail -50 /var/lib/pgsql/data/pg_log/postgresql-Tue.log
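
The three spool counts can also be collected in one go, which avoids pasting several commands into the terminal at once. This is only a sketch: `spool_counts` and `pg_log_today` are hypothetical helper names, and the weekday-based log name is an assumption drawn from the `postgresql-Tue.log` path above.

```shell
#!/bin/sh
# Sketch only: count the files in each ramdisk spool directory.
# spool_counts() and pg_log_today() are hypothetical helpers.
spool_counts() {
    base="${1:-/var/nagiosramdisk/spool}"
    for d in xidpe perfdata checkresults; do
        if [ -d "$base/$d" ]; then
            # tr strips the padding some wc implementations add
            n=$(ls "$base/$d" | wc -l | tr -d ' ')
            echo "$d $n"
        else
            echo "$d missing"
        fi
    done
}

# Assumption: the PostgreSQL log is named after the weekday, as in
# postgresql-Tue.log. LC_ALL=C keeps the abbreviation in English.
pg_log_today() {
    echo "/var/lib/pgsql/data/pg_log/postgresql-$(LC_ALL=C date +%a).log"
}
```

For example, `spool_counts` prints one line per directory, and `tail -50 "$(pg_log_today)"` reads the current day's log. Running `df -h /var/nagiosramdisk` alongside it shows whether the ramdisk itself is full.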
Thanks

Re: Database Repair has no effect

Posted: Tue Jan 08, 2019 4:00 pm
by matt.lilek
Hello Tom,

Happy New Year! How have you been keeping? Was gonna ask about the RAMDISK but glad you brought it up.

Code: Select all

[root@att1-nag1 mlilek]# ps -ef --cols=300
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 Jan07 ?        00:00:01 /sbin/init
root         2     0  0 Jan07 ?        00:00:00 [kthreadd]
root         3     2  0 Jan07 ?        00:00:10 [migration/0]
root         4     2  0 Jan07 ?        00:00:02 [ksoftirqd/0]
root         5     2  0 Jan07 ?        00:00:00 [stopper/0]
root         6     2  0 Jan07 ?        00:00:00 [watchdog/0]
root         7     2  0 Jan07 ?        00:00:09 [migration/1]
root         8     2  0 Jan07 ?        00:00:00 [stopper/1]
root         9     2  0 Jan07 ?        00:00:01 [ksoftirqd/1]
root        10     2  0 Jan07 ?        00:00:00 [watchdog/1]
root        11     2  0 Jan07 ?        00:00:13 [migration/2]
root        12     2  0 Jan07 ?        00:00:00 [stopper/2]
root        13     2  0 Jan07 ?        00:00:00 [ksoftirqd/2]
root        14     2  0 Jan07 ?        00:00:00 [watchdog/2]
root        15     2  0 Jan07 ?        00:00:13 [migration/3]
root        16     2  0 Jan07 ?        00:00:00 [stopper/3]
root        17     2  0 Jan07 ?        00:00:00 [ksoftirqd/3]
root        18     2  0 Jan07 ?        00:00:00 [watchdog/3]
root        19     2  0 Jan07 ?        00:00:12 [migration/4]
root        20     2  0 Jan07 ?        00:00:00 [stopper/4]
root        21     2  0 Jan07 ?        00:00:00 [ksoftirqd/4]
root        22     2  0 Jan07 ?        00:00:00 [watchdog/4]
root        23     2  0 Jan07 ?        00:00:12 [migration/5]
root        24     2  0 Jan07 ?        00:00:00 [stopper/5]
root        25     2  0 Jan07 ?        00:00:00 [ksoftirqd/5]
root        26     2  0 Jan07 ?        00:00:00 [watchdog/5]
root        27     2  0 Jan07 ?        00:00:32 [events/0]
root        28     2  0 Jan07 ?        00:00:02 [events/1]
root        29     2  0 Jan07 ?        00:00:02 [events/2]
root        30     2  0 Jan07 ?        00:00:02 [events/3]
root        31     2  0 Jan07 ?        00:00:03 [events/4]
root        32     2  0 Jan07 ?        00:00:05 [events/5]
root        33     2  0 Jan07 ?        00:00:00 [events/0]
root        34     2  0 Jan07 ?        00:00:00 [events/1]
root        35     2  0 Jan07 ?        00:00:00 [events/2]
root        36     2  0 Jan07 ?        00:00:00 [events/3]
root        37     2  0 Jan07 ?        00:00:00 [events/4]
root        38     2  0 Jan07 ?        00:00:00 [events/5]
root        39     2  0 Jan07 ?        00:00:00 [events_long/0]
root        40     2  0 Jan07 ?        00:00:00 [events_long/1]
root        41     2  0 Jan07 ?        00:00:00 [events_long/2]
root        42     2  0 Jan07 ?        00:00:00 [events_long/3]
root        43     2  0 Jan07 ?        00:00:00 [events_long/4]
root        44     2  0 Jan07 ?        00:00:00 [events_long/5]
root        45     2  0 Jan07 ?        00:00:00 [events_power_ef]
root        46     2  0 Jan07 ?        00:00:00 [events_power_ef]
root        47     2  0 Jan07 ?        00:00:00 [events_power_ef]
root        48     2  0 Jan07 ?        00:00:00 [events_power_ef]
root        49     2  0 Jan07 ?        00:00:00 [events_power_ef]
root        50     2  0 Jan07 ?        00:00:00 [events_power_ef]
root        51     2  0 Jan07 ?        00:00:00 [cgroup]
root        52     2  0 Jan07 ?        00:00:00 [khelper]
root        53     2  0 Jan07 ?        00:00:00 [netns]
root        54     2  0 Jan07 ?        00:00:00 [async/mgr]
root        55     2  0 Jan07 ?        00:00:00 [pm]
root        56     2  0 Jan07 ?        00:00:00 [sync_supers]
root        57     2  0 Jan07 ?        00:00:00 [bdi-default]
root        58     2  0 Jan07 ?        00:00:00 [kintegrityd/0]
root        59     2  0 Jan07 ?        00:00:00 [kintegrityd/1]
root        60     2  0 Jan07 ?        00:00:00 [kintegrityd/2]
root        61     2  0 Jan07 ?        00:00:00 [kintegrityd/3]
root        62     2  0 Jan07 ?        00:00:00 [kintegrityd/4]
root        63     2  0 Jan07 ?        00:00:00 [kintegrityd/5]
root        64     2  0 Jan07 ?        00:00:05 [kblockd/0]
root        65     2  0 Jan07 ?        00:00:10 [kblockd/1]
root        66     2  0 Jan07 ?        00:00:10 [kblockd/2]
root        67     2  0 Jan07 ?        00:00:10 [kblockd/3]
root        68     2  0 Jan07 ?        00:00:10 [kblockd/4]
root        69     2  0 Jan07 ?        00:00:10 [kblockd/5]
root        70     2  0 Jan07 ?        00:00:00 [kacpid]
root        71     2  0 Jan07 ?        00:00:00 [kacpi_notify]
root        72     2  0 Jan07 ?        00:00:00 [kacpi_hotplug]
root        73     2  0 Jan07 ?        00:00:00 [ata_aux]
root        74     2  0 Jan07 ?        00:00:00 [ata_sff/0]
root        75     2  0 Jan07 ?        00:00:00 [ata_sff/1]
root        76     2  0 Jan07 ?        00:00:00 [ata_sff/2]
root        77     2  0 Jan07 ?        00:00:00 [ata_sff/3]
root        78     2  0 Jan07 ?        00:00:00 [ata_sff/4]
root        79     2  0 Jan07 ?        00:00:00 [ata_sff/5]
root        80     2  0 Jan07 ?        00:00:00 [ksuspend_usbd]
root        81     2  0 Jan07 ?        00:00:00 [khubd]
root        82     2  0 Jan07 ?        00:00:00 [kseriod]
root        83     2  0 Jan07 ?        00:00:00 [md/0]
root        84     2  0 Jan07 ?        00:00:00 [md/1]
root        85     2  0 Jan07 ?        00:00:00 [md/2]
root        86     2  0 Jan07 ?        00:00:00 [md/3]
root        87     2  0 Jan07 ?        00:00:00 [md/4]
root        88     2  0 Jan07 ?        00:00:00 [md/5]
root        89     2  0 Jan07 ?        00:00:00 [md_misc/0]
root        90     2  0 Jan07 ?        00:00:00 [md_misc/1]
root        91     2  0 Jan07 ?        00:00:00 [md_misc/2]
root        92     2  0 Jan07 ?        00:00:00 [md_misc/3]
root        93     2  0 Jan07 ?        00:00:00 [md_misc/4]
root        94     2  0 Jan07 ?        00:00:00 [md_misc/5]
root        95     2  0 Jan07 ?        00:00:00 [linkwatch]
root        98     2  0 Jan07 ?        00:00:00 [khungtaskd]
root        99     2  0 Jan07 ?        00:00:00 [lru-add-drain/0]
root       100     2  0 Jan07 ?        00:00:00 [lru-add-drain/1]
root       101     2  0 Jan07 ?        00:00:00 [lru-add-drain/2]
root       102     2  0 Jan07 ?        00:00:00 [lru-add-drain/3]
root       103     2  0 Jan07 ?        00:00:00 [lru-add-drain/4]
root       104     2  0 Jan07 ?        00:00:00 [lru-add-drain/5]
root       105     2  0 Jan07 ?        00:01:27 [kswapd0]
root       106     2  0 Jan07 ?        00:00:00 [ksmd]
root       107     2  0 Jan07 ?        00:00:52 [khugepaged]
root       108     2  0 Jan07 ?        00:00:00 [aio/0]
root       109     2  0 Jan07 ?        00:00:00 [aio/1]
root       110     2  0 Jan07 ?        00:00:00 [aio/2]
root       111     2  0 Jan07 ?        00:00:00 [aio/3]
root       112     2  0 Jan07 ?        00:00:00 [aio/4]
root       113     2  0 Jan07 ?        00:00:00 [aio/5]
root       114     2  0 Jan07 ?        00:00:00 [crypto/0]
root       115     2  0 Jan07 ?        00:00:00 [crypto/1]
root       116     2  0 Jan07 ?        00:00:00 [crypto/2]
root       117     2  0 Jan07 ?        00:00:00 [crypto/3]
root       118     2  0 Jan07 ?        00:00:00 [crypto/4]
root       119     2  0 Jan07 ?        00:00:00 [crypto/5]
root       126     2  0 Jan07 ?        00:00:00 [kthrotld/0]
root       127     2  0 Jan07 ?        00:00:00 [kthrotld/1]
root       128     2  0 Jan07 ?        00:00:00 [kthrotld/2]
root       129     2  0 Jan07 ?        00:00:00 [kthrotld/3]
root       130     2  0 Jan07 ?        00:00:00 [kthrotld/4]
root       131     2  0 Jan07 ?        00:00:00 [kthrotld/5]
root       132     2  0 Jan07 ?        00:00:00 [pciehpd]
root       134     2  0 Jan07 ?        00:00:00 [kpsmoused]
root       135     2  0 Jan07 ?        00:00:00 [usbhid_resumer]
root       136     2  0 Jan07 ?        00:00:00 [deferwq]
root       168     2  0 Jan07 ?        00:00:00 [kdmremove]
root       169     2  0 Jan07 ?        00:00:00 [kstriped]
root       198     2  0 Jan07 ?        00:00:00 [ttm_swap]
root       397     2  0 Jan07 ?        00:00:00 [scsi_eh_0]
root       398     2  0 Jan07 ?        00:00:00 [scsi_eh_1]
root       403     2  0 Jan07 ?        00:00:01 [mpt_poll_0]
root       404     2  0 Jan07 ?        00:00:00 [mpt/0]
root       405     2  0 Jan07 ?        00:00:00 [scsi_eh_2]
root       461     2  0 Jan07 ?        00:00:00 [kdmflush]
root       462     2  0 Jan07 ?        00:00:00 [kdmflush]
root       481     2  0 Jan07 ?        00:01:05 [jbd2/dm-0-8]
root       482     2  0 Jan07 ?        00:00:00 [ext4-dio-unwrit]
root       559     1  0 Jan07 ?        00:00:00 /sbin/udevd -d
root       726     2  0 Jan07 ?        00:00:00 [vmmemctl]
root       883   559  0 Jan07 ?        00:00:00 /sbin/udevd -d
root       891   559  0 Jan07 ?        00:00:00 /sbin/udevd -d
root       915     2  0 Jan07 ?        00:00:00 [jbd2/sda1-8]
root       916     2  0 Jan07 ?        00:00:00 [ext4-dio-unwrit]
root       985     2  0 Jan07 ?        00:00:01 [kauditd]
root      1193     2  0 Jan07 ?        00:02:53 [flush-253:0]
root      1603     1  0 Jan07 ?        00:00:02 auditd
root      1625     1  0 Jan07 ?        00:00:11 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
dbus      1640     1  0 Jan07 ?        00:00:00 dbus-daemon --system
root      1684     1  0 Jan07 ?        00:00:01 /usr/sbin/snmptrapd -Ln -p /var/run/snmptrapd.pid
root      1696     1  0 Jan07 ?        00:00:00 /usr/bin/perl /usr/sbin/snmptt --daemon
snmptt    1697  1696  0 Jan07 ?        00:00:01 /usr/bin/perl /usr/sbin/snmptt --daemon
root      1714     1  0 Jan07 ?        00:00:00 /usr/sbin/sshd
root      1725     1  0 Jan07 ?        00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
ntp       1753     1  0 Jan07 ?        00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root      2507  1714  0 15:57 ?        00:00:00 sshd: mlilek [priv]
postgres  3553     1  0 11:15 ?        00:00:05 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres  3627  3553  0 11:15 ?        00:00:02 postgres: logger process        
postgres  3637  3553  0 11:15 ?        00:00:46 postgres: writer process        
postgres  3638  3553  0 11:15 ?        00:00:02 postgres: wal writer process    
postgres  3639  3553  0 11:15 ?        00:00:01 postgres: autovacuum launcher process
postgres  3640  3553  0 11:15 ?        00:00:10 postgres: stats collector process
postgres  3677  3553  0 11:15 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(34454) idle
mlilek    3957  2507  0 15:57 ?        00:00:00 sshd: mlilek@pts/0
mlilek    4052  3957  0 15:57 pts/0    00:00:00 -bash
root      4209  4052  0 15:57 pts/0    00:00:00 su root
postgres  4592  3553  0 11:16 ?        00:00:07 postgres: nagiosxi nagiosxi ::1(34520) idle
root      5789  4209  0 15:57 pts/0    00:00:00 bash
root      6448     1  0 15:58 ?        00:00:00 CROND
root      6449     1  0 15:58 ?        00:00:00 CROND
root      6452     1  0 15:58 ?        00:00:00 CROND
nagios    6457  6448  0 15:58 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php >> /usr/local/nagiosxi/var/perfdataproc.log 2>&1
nagios    6460  6449  0 15:58 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php >> /usr/local/nagiosxi/var/feedproc.log 2>&1
nagios    6464  6457  0 15:58 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios    6467  6460  0 15:58 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php
nagios    6472  6452  0 15:58 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php >> /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
nagios    6474  6472  0 15:58 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
postgres  6484  3553  0 15:58 ?        00:00:00 postgres: nagiosxi nagiosxi ::1(58382) idle
postgres  6486  3553  0 15:58 ?        00:00:00 postgres: nagiosxi nagiosxi ::1(58388) idle
postgres  6544  3553  0 15:58 ?        00:00:00 postgres: nagiosxi nagiosxi ::1(58430) idle
nagios    7519     1  0 15:58 ?        00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios    7557     1 15 15:58 ?        00:00:02 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios    7558  7557  0 15:58 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    7559  7557  0 15:58 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    7560  7557  0 15:58 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    7561  7557  0 15:58 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    7562  7557  0 15:58 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    7563  7557  0 15:58 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    7564  7557  0 15:58 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    7565  7557  0 15:58 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    7566  7557  0 15:58 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    7567  7519  2 15:58 ?        00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios    7568  7567 33 15:58 ?        00:00:05 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios    7646  7557  0 15:58 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
root      8073     1  0 15:58 ?        00:00:00 crond
nagios    8236  7560  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H fmm-its1.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8304  7558  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.30.1.95 -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8305  7560  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.30.5.127 -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8306  7559  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.30.5.19 -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8307  7561  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.35.254.49 -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8308  7562  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.30.5.93 -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8312  7564  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.41.57.38 -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8316  7561  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.58.105.5 -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8333  7559  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H tpa1-as1.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8334  7561  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H tpa1-vh2-ilo.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8335  7562  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H tpa1-its1.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8437  7566  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H sfe-sw-core1.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8496  7559  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H chi2-fw1.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8745  7564  3 15:58 ?        00:00:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_snmp_storage.pl -H vdc1-cwprn1.global.amec.com -C 321AmEc! --v2c -m Physical Memory -w 90 -c 95 -f
nagios    8849  7561  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H gvl-vh3-ilo.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8886  7566  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.31.209.8 -w 3000.0 80  -c 5000.0 100  -p 5
nagios    8940  7559  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H mtl-ipt-vh1-cimc.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    9087  7565  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H brl-vh3-ilo.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    9169  7559  5 15:58 ?        00:00:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_snmp_storage.pl -H sas-sp3.am.int.amec.com -C 321AmEc! --v2c -m Physical Memory -w 95 -c 98 -f
nagios    9242  7561  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H nhl-vw-gis1.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    9254  7565  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.14.25.254 -w 3000.0 80  -c 5000.0 100  -p 5
nagios    9258  7559  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H sgo-vrep1.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
nagios    9264  7565  0 15:58 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H van-as8.global.amec.com -w 3000.0 80  -c 5000.0 100  -p 5
root      9281  5789  0 15:58 pts/0    00:00:00 ps -ef --cols=300
postgres  9453  3553  0 11:16 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(34554) idle
postgres 11135  3553  0 11:16 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(34560) idle
postgres 14492  3553  0 11:16 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(34612) idle
postgres 15164  3553  0 11:16 ?        00:00:07 postgres: nagiosxi nagiosxi ::1(34644) idle
postgres 15270  3553  0 11:16 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(34652) idle
postgres 16424  3553  0 11:16 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(34692) idle
postgres 16813  3553  0 11:16 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(34720) idle
postgres 18457  3553  0 11:17 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(34858) idle
postgres 29336  3553  0 11:19 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(35240) idle
postfix  32802 35412  0 15:10 ?        00:00:00 pickup -l -t fifo -u
nagios   35279     1  0 Jan07 ?        00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
root     35412     1  0 Jan07 ?        00:00:03 /usr/libexec/postfix/master
postfix  35427 35412  0 Jan07 ?        00:00:00 qmgr -l -t fifo -u
496      35432     1  0 Jan07 ?        00:00:00 shellinaboxd -u shellinabox -g shellinabox --cert=/var/lib/shellinabox --port=7878 --background=/var/run/shellinaboxd.pid --disable-ssl-menu -s /:SSH --localhost-only --css white-on-black.css
496      35433 35432  0 Jan07 ?        00:00:00 shellinaboxd -u shellinabox -g shellinabox --cert=/var/lib/shellinabox --port=7878 --background=/var/run/shellinaboxd.pid --disable-ssl-menu -s /:SSH --localhost-only --css white-on-black.css
nagios   36567     1  0 Jan07 ?        00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
root     37889     1  0 Jan07 tty1     00:00:00 /sbin/mingetty /dev/tty1
root     37891     1  0 Jan07 tty2     00:00:00 /sbin/mingetty /dev/tty2
root     37893     1  0 Jan07 tty3     00:00:00 /sbin/mingetty /dev/tty3
root     37895     1  0 Jan07 tty4     00:00:00 /sbin/mingetty /dev/tty4
root     37897     1  0 Jan07 tty5     00:00:00 /sbin/mingetty /dev/tty5
root     37899     1  0 Jan07 tty6     00:00:00 /sbin/mingetty /dev/tty6
apache   40298 57225  0 12:49 ?        00:01:40 /usr/sbin/httpd
apache   40901 57225  0 12:49 ?        00:01:40 /usr/sbin/httpd
apache   41261 57225  0 12:49 ?        00:01:38 /usr/sbin/httpd
postgres 41264  3553  0 12:49 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(38586) idle
postgres 41345  3553  0 12:49 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(38592) idle
postgres 41350  3553  0 12:49 ?        00:00:05 postgres: nagiosxi nagiosxi ::1(38598) idle
apache   47878 57225  0 11:33 ?        00:01:45 /usr/sbin/httpd
postgres 49553  3553  0 11:33 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(38902) idle
root     56411     1  0 15:55 ?        00:00:00 CROND
nagios   56423 56411  0 15:55 ?        00:00:02 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
nagios   56812 56423  0 15:55 ?        00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
nagios   56813 56423  0 15:55 ?        00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
nagios   56814 56423  0 15:55 ?        00:00:01 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
nagios   56816 56423  0 15:55 ?        00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
root     57036     1  0 09:21 ?        00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mysqld/mysqld.pid --basedir=/usr --user=mysql
mysql    57141 57036  0 09:21 ?        00:00:05 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
nagios   57193     1  0 09:21 ?        00:00:05 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
root     57225     1  0 09:21 ?        00:00:00 /usr/sbin/httpd
apache   57227 57225  0 09:21 ?        00:01:59 /usr/sbin/httpd
apache   57228 57225  0 09:21 ?        00:01:54 /usr/sbin/httpd
apache   57229 57225  0 09:21 ?        00:01:57 /usr/sbin/httpd
apache   57230 57225  0 09:21 ?        00:01:54 /usr/sbin/httpd
apache   57231 57225  0 09:21 ?        00:01:54 /usr/sbin/httpd
apache   57232 57225  0 09:21 ?        00:02:00 /usr/sbin/httpd
apache   57233 57225  0 09:21 ?        00:02:00 /usr/sbin/httpd
apache   57234 57225  0 09:21 ?        00:01:57 /usr/sbin/httpd
root     57339 56411  0 15:55 ?        00:00:00 /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t -f root
root     57340 57339  0 15:55 ?        00:00:00 /usr/sbin/postdrop -r
apache   58276 57225  0 09:21 ?        00:02:00 /usr/sbin/httpd
apache   58899 57225  0 09:21 ?        00:02:00 /usr/sbin/httpd
apache   64853 57225  0 09:36 ?        00:02:00 /usr/sbin/httpd
[root@att1-nag1 mlilek]# tail -50 /usr/local/nagios/var/perfdata.log
2016-11-20 06:35:01 [8418] [0] *** Timeout while processing Host: "msa-meta.am.int.amec.com" Service: "Ping"
2016-11-20 06:35:01 [8418] [0] *** process_perfdata.pl terminated on signal ALRM
2017-01-13 18:26:27 [54467] [0] *** TIMEOUT: Timeout after 80 secs. ***
2017-01-13 18:26:28 [54467] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2017-01-13 18:26:28 [54467] [0] *** TIMEOUT: Please check your npcd.cfg
2017-01-13 18:26:28 [54467] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1484349842.perfdata.service-PID-54467 deleted
2017-01-13 18:26:28 [54467] [0] *** Timeout while processing Host: "npl-vpn-edge.global.amec.com" Service: "Ping"
2017-01-13 18:26:28 [54467] [0] *** process_perfdata.pl terminated on signal ALRM
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: Timeout after 80 secs. ***
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: Please check your npcd.cfg
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1499439947.perfdata.service-PID-48730 deleted
2017-07-07 11:07:24 [48730] [0] *** Timeout while processing Host: "att1-vw-as5.am.int.amec.com" Service: "Ping"
2017-07-07 11:07:24 [48730] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150021.perfdata.host-PID-43579 deleted
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150022.perfdata.service-PID-43580 deleted
2018-11-13 18:02:06 [43580] [0] *** Timeout while processing Host: "cin-cue.global.amec.com" Service: "Ping"
2018-11-13 18:02:06 [43579] [0] *** Timeout while processing Host: "mty-gdc0.global.amec.com" Service: "_HOST_"
2018-11-13 18:02:06 [43580] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:02:06 [43579] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150036.perfdata.host-PID-44396 deleted
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:03:39 [44396] [0] *** Timeout while processing Host: "edc01-ora10p1.global.amec.com" Service: "_HOST_"
2018-11-13 18:03:39 [44396] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150037.perfdata.service-PID-44397 deleted
2018-11-13 18:03:39 [44397] [0] *** Timeout while processing Host: "edm-plt1.global.amec.com" Service: "Ping"
2018-11-13 18:03:39 [44397] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-15 08:16:47 [23960] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-15 08:16:47 [23960] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-15 08:16:48 [23960] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-15 08:16:48 [23960] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542287678.perfdata.host-PID-23960 deleted
2018-11-15 08:16:48 [23960] [0] *** Timeout while processing Host: "sas1-sw-core1.global.amec.com" Service: "_HOST_"
2018-11-15 08:16:48 [23960] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542287678.perfdata.service-PID-23961 deleted
2018-11-15 08:16:50 [23961] [0] *** Timeout while processing Host: "cal-sw-10a.global.amec.com" Service: "Ping"
2018-11-15 08:16:50 [23961] [0] *** process_perfdata.pl terminated on signal ALRM
[root@att1-nag1 mlilek]# tail -50 /usr/local/nagios/var/npcd.log
[01-03-2019 00:45:58] NPCD: WARN: MAX load reached: load 200.140000/80.000000 at i=1
[01-03-2019 00:46:10] NPCD: WARN: MAX load reached: load 177.340000/80.000000 at i=1
[01-03-2019 00:46:22] NPCD: WARN: MAX load reached: load 153.090000/80.000000 at i=1
[01-03-2019 00:46:34] NPCD: WARN: MAX load reached: load 133.030000/80.000000 at i=1
[01-03-2019 00:46:46] NPCD: WARN: MAX load reached: load 105.550000/80.000000 at i=1
[01-03-2019 00:46:58] NPCD: WARN: MAX load reached: load 91.100000/80.000000 at i=1
[01-03-2019 01:29:40] NPCD: WARN: MAX load reached: load 95.060000/80.000000 at i=0
[01-03-2019 01:29:52] NPCD: WARN: MAX load reached: load 125.580000/80.000000 at i=1
[01-03-2019 01:30:04] NPCD: WARN: MAX load reached: load 147.990000/80.000000 at i=1
[01-03-2019 01:30:16] NPCD: WARN: MAX load reached: load 162.670000/80.000000 at i=1
[01-03-2019 01:30:28] NPCD: WARN: MAX load reached: load 183.580000/80.000000 at i=1
[01-03-2019 01:30:40] NPCD: WARN: MAX load reached: load 192.930000/80.000000 at i=1
[01-03-2019 01:30:52] NPCD: WARN: MAX load reached: load 191.460000/80.000000 at i=1
[01-03-2019 01:31:04] NPCD: WARN: MAX load reached: load 204.210000/80.000000 at i=1
[01-03-2019 01:31:16] NPCD: WARN: MAX load reached: load 211.290000/80.000000 at i=1
[01-03-2019 01:31:28] NPCD: WARN: MAX load reached: load 197.990000/80.000000 at i=1
[01-03-2019 01:31:40] NPCD: WARN: MAX load reached: load 177.180000/80.000000 at i=1
[01-03-2019 01:31:52] NPCD: WARN: MAX load reached: load 143.340000/80.000000 at i=1
[01-03-2019 01:32:04] NPCD: WARN: MAX load reached: load 129.030000/80.000000 at i=1
[01-03-2019 01:32:16] NPCD: WARN: MAX load reached: load 110.880000/80.000000 at i=1
[01-03-2019 01:32:28] NPCD: WARN: MAX load reached: load 89.070000/80.000000 at i=1
[01-07-2019 09:19:47] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 09:22:20] NPCD: npcd Daemon (0.4.14) started with PID=40490
[01-07-2019 09:22:20] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 09:22:20] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 17:46:39] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 17:47:12] NPCD: npcd Daemon (0.4.14) started with PID=43472
[01-07-2019 17:47:12] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 17:47:12] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:05:16] NPCD: npcd Daemon (0.4.14) started with PID=2079
[01-07-2019 18:05:16] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:05:16] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:18:40] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 18:25:04] NPCD: npcd Daemon (0.4.14) started with PID=15756
[01-07-2019 18:25:04] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:25:04] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:38:37] NPCD: npcd Daemon (0.4.14) started with PID=36520
[01-07-2019 18:38:37] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:38:37] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:49:33] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 18:50:09] NPCD: npcd Daemon (0.4.14) started with PID=29586
[01-07-2019 18:50:09] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:50:09] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 19:14:07] NPCD: npcd Daemon (0.4.14) started with PID=36536
[01-07-2019 19:14:07] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 19:14:07] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-08-2019 09:20:37] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-08-2019 09:21:02] NPCD: npcd Daemon (0.4.14) started with PID=57193
[01-08-2019 09:21:02] NPCD: Please have a look at 'npcd -V' to get license information
[01-08-2019 09:21:02] NPCD: HINT: load_threshold is enabled - ('80.000000')
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/xidpe | wc -l
0
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/perfdata/ | wc -l
2
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/checkresults/ | wc -l
0
[root@att1-nag1 mlilek]# tail -50 /var/lib/pgsql/data/pg_log/postgresql-Tue.log
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No sLOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG:  could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
[root@att1-nag1 mlilek]#

Re: Database Repair has no effect

Posted: Tue Jan 08, 2019 4:29 pm
by tgriep
Hey Matt, things have been going well, just busy as usual. How have you been?

Thanks for the output.
I think the settings for the ramdisk may have been changed somehow, so I will need to see these files from the Nagios server.

Code: Select all

/usr/local/nagios/etc/commands.cfg
/usr/local/nagios/etc/nagios.cfg
This KB article
https://support.nagios.com/kb/article/n ... ce-25.html
has a section titled
"The postgresql service is not running or the database is not accepting commands".
It describes vacuuming the Postgres database; please do that on the server.
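Since the PostgreSQL log above reports "No space left on device", it may be worth checking free space before vacuuming. The following is only a sketch under assumptions: the helper names and the threshold are made up for illustration, the psql call just reuses the `echo ... | psql nagiosxi nagiosxi` pattern from earlier in this thread, and the KB article remains the authoritative source for the exact vacuum statements.

```shell
#!/bin/sh
# Sketch only: fs_use_percent and vacuum_if_space are hypothetical helpers,
# not part of Nagios XI. The limit passed in is an illustrative assumption.

# Print the use percentage (digits only) of the filesystem holding the given path.
fs_use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Refuse to vacuum when the filesystem is at or above LIMIT percent full.
# By default this only reports what it would do; set RUN_VACUUM=1 to execute.
vacuum_if_space() {
    dir=$1
    limit=$2
    used=$(fs_use_percent "$dir")
    if [ "$used" -ge "$limit" ]; then
        echo "filesystem for $dir is ${used}% full; free up space first"
        return 1
    fi
    if [ "${RUN_VACUUM:-0}" = "1" ]; then
        # Statements are piped on stdin so psql runs each one separately.
        echo "vacuum; vacuum analyze;" | psql nagiosxi nagiosxi
    else
        echo "would run: vacuum; vacuum analyze;"
    fi
}
```

On this server it could be called as `vacuum_if_space /var/lib/pgsql/data 95`, run as root like the other commands in this thread.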

After that, wait 15 minutes, then check the /usr/local/nagiosxi/var/dbmaint.log file to see whether the dbmaint process ran.
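
One way to tell whether dbmaint.log was written to recently is to ask find for files modified within the last 15 minutes. This is only a sketch: the helper name modified_within is made up for illustration, and it looks at the file's modification time only, not at what dbmaint actually logged.

```shell
#!/bin/sh
# Succeeds (exit 0) if FILE exists and was modified less than MINUTES minutes ago.
# modified_within is a hypothetical helper, not part of Nagios XI.
modified_within() {
    file=$1
    minutes=$2
    [ -n "$(find "$file" -maxdepth 0 -mmin "-$minutes" 2>/dev/null)" ]
}

# Example: report on the dbmaint log named above.
if modified_within /usr/local/nagiosxi/var/dbmaint.log 15; then
    echo "dbmaint.log was updated in the last 15 minutes"
else
    echo "dbmaint.log is missing or has not been updated in the last 15 minutes"
fi
```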

If dbmaint did not run, run these commands and post the output here.

Code: Select all

free
tail -50 /usr/local/nagiosxi/var/dbmaint.log
echo "SELECT relname AS objectname, relkind AS objecttype, reltuples, pg_size_pretty(relpages::bigint*8*1024) AS size FROM pg_class WHERE relpages >= 8 ORDER BY relpages DESC;" | psql nagiosxi nagiosxi
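
The size column in the query above multiplies relpages by 8*1024, which assumes PostgreSQL's default block size of 8 kB per page. The same arithmetic as a small shell sketch (the function name is hypothetical and only for illustration):

```shell
#!/bin/sh
# Convert a relpages count to bytes, assuming the default 8 kB PostgreSQL page.
# relpages_to_bytes is a hypothetical helper, not a PostgreSQL or Nagios tool.
relpages_to_bytes() {
    echo $(( $1 * 8 * 1024 ))
}

# The query only lists relations with relpages >= 8, i.e. at least this many bytes:
relpages_to_bytes 8   # prints 65536 (64 kB)
```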

Re: Database Repair has no effect

Posted: Tue Jan 08, 2019 4:59 pm
by matt.lilek
Hey Tom,

Before I get into the RAM disk issue: the Database Backend, Database Maintenance, Event Manager and System Statistics are not running again. Please advise.

Code: Select all

[root@att1-nag1 mlilek]# tail -50 /var/lib/pgsql/pgstartup.log
could not write to log file: No space left on device
FATAL:  could not write lock file "postmaster.pid": No space left on device
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 63382) running in data directory "/var/lib/pgsql/data"?
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 63382) running in data directory "/var/lib/pgsql/data"?
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 65438) running in data directory "/var/lib/pgsql/data"?
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 1683) running in data directory "/var/lib/pgsql/data"?
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 1683) running in data directory "/var/lib/pgsql/data"?
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 1678) running in data directory "/var/lib/pgsql/data"?
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 47220) running in data directory "/var/lib/pgsql/data"?
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 64328) running in data directory "/var/lib/pgsql/data"?
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 64328) running in data directory "/var/lib/pgsql/data"?
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to lo[root@att1-nag1 mlilek]#

Re: Database Repair has no effect

Posted: Tue Jan 08, 2019 5:10 pm
by matt.lilek
Quickly noticed I was out of space all of a sudden; didn't expect that. I ran echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | psql nagiosxi nagiosxi and it freed up 27GB. Got the backend started again and am now waiting for the Database Maintenance to run. If there is something else I should be looking at, please let me know.
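
For anyone following along, a quick way to confirm how full the relevant filesystems are before and after a truncate like this is to read the usage figure from df. This is only a minimal sketch: pct_used is a hypothetical helper name, and the two paths are simply the ones mentioned in this thread, so adjust them to your own install.

```shell
# Hypothetical helper (not part of Nagios XI): print the percent-used
# figure for the filesystem that holds the given path.
pct_used() {
    # df -P keeps each filesystem on one line; column 5 is "Capacity", e.g. "87%".
    df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Paths taken from this thread; they are assumptions about your layout.
for path in /var/lib/pgsql /var/nagiosramdisk; do
    if [ -e "$path" ]; then
        echo "$path: $(pct_used "$path")% used"
    fi
done
```

Running it once before the truncate and once after shows whether the space was actually released on the filesystem that holds the database.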

Re: Database Repair has no effect

Posted: Tue Jan 08, 2019 5:10 pm
by tgriep
Vacuum the PostgreSQL database to clean it up, restart it, and then run the commands that I posted earlier.
The out-of-space message may be caused by tables in the PostgreSQL database that need to be cleaned up, and vacuuming may help fix that.

Re: Database Repair has no effect

Posted: Tue Jan 08, 2019 5:16 pm
by matt.lilek
What is the command to vacuum again? It used to be in my email before I was terminated. Oh, and truncating has also brought memory usage down from 14GB to 4GB and subsequently fixed the ramdisk issue as well.

Re: Database Repair has no effect

Posted: Tue Jan 08, 2019 5:25 pm
by tgriep
The commands are in this KB article:
https://support.nagios.com/kb/article/n ... ce-25.html
under this section:
The postgresql service is not running or the database is not accepting commands
Depending on the Postgres version, the commands are slightly different.

Re: Database Repair has no effect

Posted: Wed Jan 09, 2019 9:42 am
by matt.lilek
Hey Tom, sorry to have made you post that reply. I didn't even read what you wrote above until after I posted the question. I should have written back telling you I am able to read for myself. Thanks for the spoon-feeding though. Sometimes it is needed, just not this time (on this part). Anyway... Looks like I have been back up and running since shortly after your last post. I think you can wrap this up.