Re: Database Repair has no effect
Posted: Tue Jan 08, 2019 12:58 pm
by matt.lilek
Well Scott, I tried logging in again and was able to log in. Everything appears to be running as normal; however, it seems to be firing a whole bunch of erroneous emails. I think all of this kinda started with a major outage, so it might still be catching up with all the down hosts and services. Can we just purge all of those and start fresh on the alerts? Please let me know what I need to do. Thank you.
Re: Database Repair has no effect
Posted: Tue Jan 08, 2019 3:07 pm
by tgriep
Your server is using the Postgres database, so you can run the following to truncate the SQL tables and clear out the data while the services are stopped.
Run them all as root.
Code:
service crond stop
service nagios stop
service ndo2db stop
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | psql nagiosxi nagiosxi
service ndo2db start
service nagios start
service crond start
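The sequence above can also be wrapped in a small script with a dry-run switch, so the steps can be reviewed before anything is stopped or truncated. This is only a sketch: `DRY_RUN`, `run` and `purge_xi_events` are illustrative names, not part of Nagios XI, and the commands themselves are the same ones listed above.

```shell
#!/bin/sh
# Sketch: same stop / truncate / start sequence, with a dry-run switch.
# DRY_RUN, run and purge_xi_events are hypothetical names for illustration.
DRY_RUN="${DRY_RUN:-1}"   # defaults to dry-run so nothing is changed by accident

run() {
    # In dry-run mode, print the command instead of executing it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

purge_xi_events() {
    sql="truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;"
    run service crond stop
    run service nagios stop
    run service ndo2db stop
    # The pipe into psql is handled separately because run() takes one command.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: echo \"$sql\" | psql nagiosxi nagiosxi"
    else
        echo "$sql" | psql nagiosxi nagiosxi
    fi
    run service ndo2db start
    run service nagios start
    run service crond start
}

purge_xi_events
```

Run it once as is to see the planned steps, then as root with `DRY_RUN=0` to actually perform them.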
Your previous post shows that /var/nagiosramdisk is full, and we need to look at that.
Can you run the following commands as root and post the output?
Code:
ps -ef --cols=300
tail -50 /usr/local/nagios/var/perfdata.log
tail -50 /usr/local/nagios/var/npcd.log
ls /var/nagiosramdisk/spool/xidpe | wc -l
ls /var/nagiosramdisk/spool/perfdata/ | wc -l
ls /var/nagiosramdisk/spool/checkresults/ | wc -l
tail -50 /var/lib/pgsql/data/pg_log/postgresql-Tue.log
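If pasting the commands one by one is awkward, the three spool counts can be gathered in one go. The following is a minimal sketch, assuming the same spool paths as above; `count_files` is a hypothetical helper name. It uses `find` rather than `ls | wc -l`, so it counts regular files only and the numbers may differ slightly if a directory contains subdirectories or dotfiles.

```shell
#!/bin/sh
# Sketch: count regular files in each ramdisk spool directory.
# count_files is a hypothetical helper, not a Nagios tool.
count_files() {
    # Regular files directly inside "$1"; subdirectories are not counted.
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

for dir in /var/nagiosramdisk/spool/xidpe \
           /var/nagiosramdisk/spool/perfdata \
           /var/nagiosramdisk/spool/checkresults; do
    if [ -d "$dir" ]; then
        echo "$dir: $(count_files "$dir")"
    else
        echo "$dir: not found"
    fi
done

# Overall ramdisk usage, if the mount point exists.
if [ -d /var/nagiosramdisk ]; then
    df -h /var/nagiosramdisk
fi
```

A large and growing count in any of these directories would suggest that the corresponding processor is not keeping up.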
Thanks
Re: Database Repair has no effect
Posted: Tue Jan 08, 2019 4:00 pm
by matt.lilek
Hello Tom,
Happy New Year! How have you been keeping? Was gonna ask about the RAMDISK but glad you brought it up.
Code:
[root@att1-nag1 mlilek]# ps -ef --cols=300
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Jan07 ? 00:00:01 /sbin/init
root 2 0 0 Jan07 ? 00:00:00 [kthreadd]
root 3 2 0 Jan07 ? 00:00:10 [migration/0]
root 4 2 0 Jan07 ? 00:00:02 [ksoftirqd/0]
root 5 2 0 Jan07 ? 00:00:00 [stopper/0]
root 6 2 0 Jan07 ? 00:00:00 [watchdog/0]
root 7 2 0 Jan07 ? 00:00:09 [migration/1]
root 8 2 0 Jan07 ? 00:00:00 [stopper/1]
root 9 2 0 Jan07 ? 00:00:01 [ksoftirqd/1]
root 10 2 0 Jan07 ? 00:00:00 [watchdog/1]
root 11 2 0 Jan07 ? 00:00:13 [migration/2]
root 12 2 0 Jan07 ? 00:00:00 [stopper/2]
root 13 2 0 Jan07 ? 00:00:00 [ksoftirqd/2]
root 14 2 0 Jan07 ? 00:00:00 [watchdog/2]
root 15 2 0 Jan07 ? 00:00:13 [migration/3]
root 16 2 0 Jan07 ? 00:00:00 [stopper/3]
root 17 2 0 Jan07 ? 00:00:00 [ksoftirqd/3]
root 18 2 0 Jan07 ? 00:00:00 [watchdog/3]
root 19 2 0 Jan07 ? 00:00:12 [migration/4]
root 20 2 0 Jan07 ? 00:00:00 [stopper/4]
root 21 2 0 Jan07 ? 00:00:00 [ksoftirqd/4]
root 22 2 0 Jan07 ? 00:00:00 [watchdog/4]
root 23 2 0 Jan07 ? 00:00:12 [migration/5]
root 24 2 0 Jan07 ? 00:00:00 [stopper/5]
root 25 2 0 Jan07 ? 00:00:00 [ksoftirqd/5]
root 26 2 0 Jan07 ? 00:00:00 [watchdog/5]
root 27 2 0 Jan07 ? 00:00:32 [events/0]
root 28 2 0 Jan07 ? 00:00:02 [events/1]
root 29 2 0 Jan07 ? 00:00:02 [events/2]
root 30 2 0 Jan07 ? 00:00:02 [events/3]
root 31 2 0 Jan07 ? 00:00:03 [events/4]
root 32 2 0 Jan07 ? 00:00:05 [events/5]
root 33 2 0 Jan07 ? 00:00:00 [events/0]
root 34 2 0 Jan07 ? 00:00:00 [events/1]
root 35 2 0 Jan07 ? 00:00:00 [events/2]
root 36 2 0 Jan07 ? 00:00:00 [events/3]
root 37 2 0 Jan07 ? 00:00:00 [events/4]
root 38 2 0 Jan07 ? 00:00:00 [events/5]
root 39 2 0 Jan07 ? 00:00:00 [events_long/0]
root 40 2 0 Jan07 ? 00:00:00 [events_long/1]
root 41 2 0 Jan07 ? 00:00:00 [events_long/2]
root 42 2 0 Jan07 ? 00:00:00 [events_long/3]
root 43 2 0 Jan07 ? 00:00:00 [events_long/4]
root 44 2 0 Jan07 ? 00:00:00 [events_long/5]
root 45 2 0 Jan07 ? 00:00:00 [events_power_ef]
root 46 2 0 Jan07 ? 00:00:00 [events_power_ef]
root 47 2 0 Jan07 ? 00:00:00 [events_power_ef]
root 48 2 0 Jan07 ? 00:00:00 [events_power_ef]
root 49 2 0 Jan07 ? 00:00:00 [events_power_ef]
root 50 2 0 Jan07 ? 00:00:00 [events_power_ef]
root 51 2 0 Jan07 ? 00:00:00 [cgroup]
root 52 2 0 Jan07 ? 00:00:00 [khelper]
root 53 2 0 Jan07 ? 00:00:00 [netns]
root 54 2 0 Jan07 ? 00:00:00 [async/mgr]
root 55 2 0 Jan07 ? 00:00:00 [pm]
root 56 2 0 Jan07 ? 00:00:00 [sync_supers]
root 57 2 0 Jan07 ? 00:00:00 [bdi-default]
root 58 2 0 Jan07 ? 00:00:00 [kintegrityd/0]
root 59 2 0 Jan07 ? 00:00:00 [kintegrityd/1]
root 60 2 0 Jan07 ? 00:00:00 [kintegrityd/2]
root 61 2 0 Jan07 ? 00:00:00 [kintegrityd/3]
root 62 2 0 Jan07 ? 00:00:00 [kintegrityd/4]
root 63 2 0 Jan07 ? 00:00:00 [kintegrityd/5]
root 64 2 0 Jan07 ? 00:00:05 [kblockd/0]
root 65 2 0 Jan07 ? 00:00:10 [kblockd/1]
root 66 2 0 Jan07 ? 00:00:10 [kblockd/2]
root 67 2 0 Jan07 ? 00:00:10 [kblockd/3]
root 68 2 0 Jan07 ? 00:00:10 [kblockd/4]
root 69 2 0 Jan07 ? 00:00:10 [kblockd/5]
root 70 2 0 Jan07 ? 00:00:00 [kacpid]
root 71 2 0 Jan07 ? 00:00:00 [kacpi_notify]
root 72 2 0 Jan07 ? 00:00:00 [kacpi_hotplug]
root 73 2 0 Jan07 ? 00:00:00 [ata_aux]
root 74 2 0 Jan07 ? 00:00:00 [ata_sff/0]
root 75 2 0 Jan07 ? 00:00:00 [ata_sff/1]
root 76 2 0 Jan07 ? 00:00:00 [ata_sff/2]
root 77 2 0 Jan07 ? 00:00:00 [ata_sff/3]
root 78 2 0 Jan07 ? 00:00:00 [ata_sff/4]
root 79 2 0 Jan07 ? 00:00:00 [ata_sff/5]
root 80 2 0 Jan07 ? 00:00:00 [ksuspend_usbd]
root 81 2 0 Jan07 ? 00:00:00 [khubd]
root 82 2 0 Jan07 ? 00:00:00 [kseriod]
root 83 2 0 Jan07 ? 00:00:00 [md/0]
root 84 2 0 Jan07 ? 00:00:00 [md/1]
root 85 2 0 Jan07 ? 00:00:00 [md/2]
root 86 2 0 Jan07 ? 00:00:00 [md/3]
root 87 2 0 Jan07 ? 00:00:00 [md/4]
root 88 2 0 Jan07 ? 00:00:00 [md/5]
root 89 2 0 Jan07 ? 00:00:00 [md_misc/0]
root 90 2 0 Jan07 ? 00:00:00 [md_misc/1]
root 91 2 0 Jan07 ? 00:00:00 [md_misc/2]
root 92 2 0 Jan07 ? 00:00:00 [md_misc/3]
root 93 2 0 Jan07 ? 00:00:00 [md_misc/4]
root 94 2 0 Jan07 ? 00:00:00 [md_misc/5]
root 95 2 0 Jan07 ? 00:00:00 [linkwatch]
root 98 2 0 Jan07 ? 00:00:00 [khungtaskd]
root 99 2 0 Jan07 ? 00:00:00 [lru-add-drain/0]
root 100 2 0 Jan07 ? 00:00:00 [lru-add-drain/1]
root 101 2 0 Jan07 ? 00:00:00 [lru-add-drain/2]
root 102 2 0 Jan07 ? 00:00:00 [lru-add-drain/3]
root 103 2 0 Jan07 ? 00:00:00 [lru-add-drain/4]
root 104 2 0 Jan07 ? 00:00:00 [lru-add-drain/5]
root 105 2 0 Jan07 ? 00:01:27 [kswapd0]
root 106 2 0 Jan07 ? 00:00:00 [ksmd]
root 107 2 0 Jan07 ? 00:00:52 [khugepaged]
root 108 2 0 Jan07 ? 00:00:00 [aio/0]
root 109 2 0 Jan07 ? 00:00:00 [aio/1]
root 110 2 0 Jan07 ? 00:00:00 [aio/2]
root 111 2 0 Jan07 ? 00:00:00 [aio/3]
root 112 2 0 Jan07 ? 00:00:00 [aio/4]
root 113 2 0 Jan07 ? 00:00:00 [aio/5]
root 114 2 0 Jan07 ? 00:00:00 [crypto/0]
root 115 2 0 Jan07 ? 00:00:00 [crypto/1]
root 116 2 0 Jan07 ? 00:00:00 [crypto/2]
root 117 2 0 Jan07 ? 00:00:00 [crypto/3]
root 118 2 0 Jan07 ? 00:00:00 [crypto/4]
root 119 2 0 Jan07 ? 00:00:00 [crypto/5]
root 126 2 0 Jan07 ? 00:00:00 [kthrotld/0]
root 127 2 0 Jan07 ? 00:00:00 [kthrotld/1]
root 128 2 0 Jan07 ? 00:00:00 [kthrotld/2]
root 129 2 0 Jan07 ? 00:00:00 [kthrotld/3]
root 130 2 0 Jan07 ? 00:00:00 [kthrotld/4]
root 131 2 0 Jan07 ? 00:00:00 [kthrotld/5]
root 132 2 0 Jan07 ? 00:00:00 [pciehpd]
root 134 2 0 Jan07 ? 00:00:00 [kpsmoused]
root 135 2 0 Jan07 ? 00:00:00 [usbhid_resumer]
root 136 2 0 Jan07 ? 00:00:00 [deferwq]
root 168 2 0 Jan07 ? 00:00:00 [kdmremove]
root 169 2 0 Jan07 ? 00:00:00 [kstriped]
root 198 2 0 Jan07 ? 00:00:00 [ttm_swap]
root 397 2 0 Jan07 ? 00:00:00 [scsi_eh_0]
root 398 2 0 Jan07 ? 00:00:00 [scsi_eh_1]
root 403 2 0 Jan07 ? 00:00:01 [mpt_poll_0]
root 404 2 0 Jan07 ? 00:00:00 [mpt/0]
root 405 2 0 Jan07 ? 00:00:00 [scsi_eh_2]
root 461 2 0 Jan07 ? 00:00:00 [kdmflush]
root 462 2 0 Jan07 ? 00:00:00 [kdmflush]
root 481 2 0 Jan07 ? 00:01:05 [jbd2/dm-0-8]
root 482 2 0 Jan07 ? 00:00:00 [ext4-dio-unwrit]
root 559 1 0 Jan07 ? 00:00:00 /sbin/udevd -d
root 726 2 0 Jan07 ? 00:00:00 [vmmemctl]
root 883 559 0 Jan07 ? 00:00:00 /sbin/udevd -d
root 891 559 0 Jan07 ? 00:00:00 /sbin/udevd -d
root 915 2 0 Jan07 ? 00:00:00 [jbd2/sda1-8]
root 916 2 0 Jan07 ? 00:00:00 [ext4-dio-unwrit]
root 985 2 0 Jan07 ? 00:00:01 [kauditd]
root 1193 2 0 Jan07 ? 00:02:53 [flush-253:0]
root 1603 1 0 Jan07 ? 00:00:02 auditd
root 1625 1 0 Jan07 ? 00:00:11 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
dbus 1640 1 0 Jan07 ? 00:00:00 dbus-daemon --system
root 1684 1 0 Jan07 ? 00:00:01 /usr/sbin/snmptrapd -Ln -p /var/run/snmptrapd.pid
root 1696 1 0 Jan07 ? 00:00:00 /usr/bin/perl /usr/sbin/snmptt --daemon
snmptt 1697 1696 0 Jan07 ? 00:00:01 /usr/bin/perl /usr/sbin/snmptt --daemon
root 1714 1 0 Jan07 ? 00:00:00 /usr/sbin/sshd
root 1725 1 0 Jan07 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
ntp 1753 1 0 Jan07 ? 00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root 2507 1714 0 15:57 ? 00:00:00 sshd: mlilek [priv]
postgres 3553 1 0 11:15 ? 00:00:05 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres 3627 3553 0 11:15 ? 00:00:02 postgres: logger process
postgres 3637 3553 0 11:15 ? 00:00:46 postgres: writer process
postgres 3638 3553 0 11:15 ? 00:00:02 postgres: wal writer process
postgres 3639 3553 0 11:15 ? 00:00:01 postgres: autovacuum launcher process
postgres 3640 3553 0 11:15 ? 00:00:10 postgres: stats collector process
postgres 3677 3553 0 11:15 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(34454) idle
mlilek 3957 2507 0 15:57 ? 00:00:00 sshd: mlilek@pts/0
mlilek 4052 3957 0 15:57 pts/0 00:00:00 -bash
root 4209 4052 0 15:57 pts/0 00:00:00 su root
postgres 4592 3553 0 11:16 ? 00:00:07 postgres: nagiosxi nagiosxi ::1(34520) idle
root 5789 4209 0 15:57 pts/0 00:00:00 bash
root 6448 1 0 15:58 ? 00:00:00 CROND
root 6449 1 0 15:58 ? 00:00:00 CROND
root 6452 1 0 15:58 ? 00:00:00 CROND
nagios 6457 6448 0 15:58 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php >> /usr/local/nagiosxi/var/perfdataproc.log 2>&1
nagios 6460 6449 0 15:58 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php >> /usr/local/nagiosxi/var/feedproc.log 2>&1
nagios 6464 6457 0 15:58 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios 6467 6460 0 15:58 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php
nagios 6472 6452 0 15:58 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php >> /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
nagios 6474 6472 0 15:58 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
postgres 6484 3553 0 15:58 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(58382) idle
postgres 6486 3553 0 15:58 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(58388) idle
postgres 6544 3553 0 15:58 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(58430) idle
nagios 7519 1 0 15:58 ? 00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 7557 1 15 15:58 ? 00:00:02 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 7558 7557 0 15:58 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 7559 7557 0 15:58 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 7560 7557 0 15:58 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 7561 7557 0 15:58 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 7562 7557 0 15:58 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 7563 7557 0 15:58 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 7564 7557 0 15:58 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 7565 7557 0 15:58 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 7566 7557 0 15:58 ? 00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 7567 7519 2 15:58 ? 00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 7568 7567 33 15:58 ? 00:00:05 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 7646 7557 0 15:58 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
root 8073 1 0 15:58 ? 00:00:00 crond
nagios 8236 7560 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H fmm-its1.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8304 7558 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.30.1.95 -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8305 7560 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.30.5.127 -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8306 7559 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.30.5.19 -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8307 7561 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.35.254.49 -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8308 7562 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.30.5.93 -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8312 7564 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.41.57.38 -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8316 7561 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.58.105.5 -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8333 7559 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H tpa1-as1.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8334 7561 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H tpa1-vh2-ilo.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8335 7562 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H tpa1-its1.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8437 7566 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H sfe-sw-core1.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8496 7559 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H chi2-fw1.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8745 7564 3 15:58 ? 00:00:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_snmp_storage.pl -H vdc1-cwprn1.global.amec.com -C 321AmEc! --v2c -m Physical Memory -w 90 -c 95 -f
nagios 8849 7561 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H gvl-vh3-ilo.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8886 7566 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.31.209.8 -w 3000.0 80 -c 5000.0 100 -p 5
nagios 8940 7559 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H mtl-ipt-vh1-cimc.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 9087 7565 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H brl-vh3-ilo.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 9169 7559 5 15:58 ? 00:00:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_snmp_storage.pl -H sas-sp3.am.int.amec.com -C 321AmEc! --v2c -m Physical Memory -w 95 -c 98 -f
nagios 9242 7561 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H nhl-vw-gis1.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 9254 7565 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.14.25.254 -w 3000.0 80 -c 5000.0 100 -p 5
nagios 9258 7559 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H sgo-vrep1.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
nagios 9264 7565 0 15:58 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H van-as8.global.amec.com -w 3000.0 80 -c 5000.0 100 -p 5
root 9281 5789 0 15:58 pts/0 00:00:00 ps -ef --cols=300
postgres 9453 3553 0 11:16 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(34554) idle
postgres 11135 3553 0 11:16 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(34560) idle
postgres 14492 3553 0 11:16 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(34612) idle
postgres 15164 3553 0 11:16 ? 00:00:07 postgres: nagiosxi nagiosxi ::1(34644) idle
postgres 15270 3553 0 11:16 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(34652) idle
postgres 16424 3553 0 11:16 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(34692) idle
postgres 16813 3553 0 11:16 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(34720) idle
postgres 18457 3553 0 11:17 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(34858) idle
postgres 29336 3553 0 11:19 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(35240) idle
postfix 32802 35412 0 15:10 ? 00:00:00 pickup -l -t fifo -u
nagios 35279 1 0 Jan07 ? 00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
root 35412 1 0 Jan07 ? 00:00:03 /usr/libexec/postfix/master
postfix 35427 35412 0 Jan07 ? 00:00:00 qmgr -l -t fifo -u
496 35432 1 0 Jan07 ? 00:00:00 shellinaboxd -u shellinabox -g shellinabox --cert=/var/lib/shellinabox --port=7878 --background=/var/run/shellinaboxd.pid --disable-ssl-menu -s /:SSH --localhost-only --css white-on-black.css
496 35433 35432 0 Jan07 ? 00:00:00 shellinaboxd -u shellinabox -g shellinabox --cert=/var/lib/shellinabox --port=7878 --background=/var/run/shellinaboxd.pid --disable-ssl-menu -s /:SSH --localhost-only --css white-on-black.css
nagios 36567 1 0 Jan07 ? 00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
root 37889 1 0 Jan07 tty1 00:00:00 /sbin/mingetty /dev/tty1
root 37891 1 0 Jan07 tty2 00:00:00 /sbin/mingetty /dev/tty2
root 37893 1 0 Jan07 tty3 00:00:00 /sbin/mingetty /dev/tty3
root 37895 1 0 Jan07 tty4 00:00:00 /sbin/mingetty /dev/tty4
root 37897 1 0 Jan07 tty5 00:00:00 /sbin/mingetty /dev/tty5
root 37899 1 0 Jan07 tty6 00:00:00 /sbin/mingetty /dev/tty6
apache 40298 57225 0 12:49 ? 00:01:40 /usr/sbin/httpd
apache 40901 57225 0 12:49 ? 00:01:40 /usr/sbin/httpd
apache 41261 57225 0 12:49 ? 00:01:38 /usr/sbin/httpd
postgres 41264 3553 0 12:49 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(38586) idle
postgres 41345 3553 0 12:49 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(38592) idle
postgres 41350 3553 0 12:49 ? 00:00:05 postgres: nagiosxi nagiosxi ::1(38598) idle
apache 47878 57225 0 11:33 ? 00:01:45 /usr/sbin/httpd
postgres 49553 3553 0 11:33 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(38902) idle
root 56411 1 0 15:55 ? 00:00:00 CROND
nagios 56423 56411 0 15:55 ? 00:00:02 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
nagios 56812 56423 0 15:55 ? 00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
nagios 56813 56423 0 15:55 ? 00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
nagios 56814 56423 0 15:55 ? 00:00:01 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
nagios 56816 56423 0 15:55 ? 00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok --user=nagios --group=nagios
root 57036 1 0 09:21 ? 00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mysqld/mysqld.pid --basedir=/usr --user=mysql
mysql 57141 57036 0 09:21 ? 00:00:05 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
nagios 57193 1 0 09:21 ? 00:00:05 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
root 57225 1 0 09:21 ? 00:00:00 /usr/sbin/httpd
apache 57227 57225 0 09:21 ? 00:01:59 /usr/sbin/httpd
apache 57228 57225 0 09:21 ? 00:01:54 /usr/sbin/httpd
apache 57229 57225 0 09:21 ? 00:01:57 /usr/sbin/httpd
apache 57230 57225 0 09:21 ? 00:01:54 /usr/sbin/httpd
apache 57231 57225 0 09:21 ? 00:01:54 /usr/sbin/httpd
apache 57232 57225 0 09:21 ? 00:02:00 /usr/sbin/httpd
apache 57233 57225 0 09:21 ? 00:02:00 /usr/sbin/httpd
apache 57234 57225 0 09:21 ? 00:01:57 /usr/sbin/httpd
root 57339 56411 0 15:55 ? 00:00:00 /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t -f root
root 57340 57339 0 15:55 ? 00:00:00 /usr/sbin/postdrop -r
apache 58276 57225 0 09:21 ? 00:02:00 /usr/sbin/httpd
apache 58899 57225 0 09:21 ? 00:02:00 /usr/sbin/httpd
apache 64853 57225 0 09:36 ? 00:02:00 /usr/sbin/httpd
[root@att1-nag1 mlilek]# tail -50 /usr/local/nagios/var/perfdata.log
2016-11-20 06:35:01 [8418] [0] *** Timeout while processing Host: "msa-meta.am.int.amec.com" Service: "Ping"
2016-11-20 06:35:01 [8418] [0] *** process_perfdata.pl terminated on signal ALRM
2017-01-13 18:26:27 [54467] [0] *** TIMEOUT: Timeout after 80 secs. ***
2017-01-13 18:26:28 [54467] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2017-01-13 18:26:28 [54467] [0] *** TIMEOUT: Please check your npcd.cfg
2017-01-13 18:26:28 [54467] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1484349842.perfdata.service-PID-54467 deleted
2017-01-13 18:26:28 [54467] [0] *** Timeout while processing Host: "npl-vpn-edge.global.amec.com" Service: "Ping"
2017-01-13 18:26:28 [54467] [0] *** process_perfdata.pl terminated on signal ALRM
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: Timeout after 80 secs. ***
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: Please check your npcd.cfg
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1499439947.perfdata.service-PID-48730 deleted
2017-07-07 11:07:24 [48730] [0] *** Timeout while processing Host: "att1-vw-as5.am.int.amec.com" Service: "Ping"
2017-07-07 11:07:24 [48730] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150021.perfdata.host-PID-43579 deleted
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150022.perfdata.service-PID-43580 deleted
2018-11-13 18:02:06 [43580] [0] *** Timeout while processing Host: "cin-cue.global.amec.com" Service: "Ping"
2018-11-13 18:02:06 [43579] [0] *** Timeout while processing Host: "mty-gdc0.global.amec.com" Service: "_HOST_"
2018-11-13 18:02:06 [43580] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:02:06 [43579] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150036.perfdata.host-PID-44396 deleted
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:03:39 [44396] [0] *** Timeout while processing Host: "edc01-ora10p1.global.amec.com" Service: "_HOST_"
2018-11-13 18:03:39 [44396] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150037.perfdata.service-PID-44397 deleted
2018-11-13 18:03:39 [44397] [0] *** Timeout while processing Host: "edm-plt1.global.amec.com" Service: "Ping"
2018-11-13 18:03:39 [44397] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-15 08:16:47 [23960] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-15 08:16:47 [23960] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-15 08:16:48 [23960] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-15 08:16:48 [23960] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542287678.perfdata.host-PID-23960 deleted
2018-11-15 08:16:48 [23960] [0] *** Timeout while processing Host: "sas1-sw-core1.global.amec.com" Service: "_HOST_"
2018-11-15 08:16:48 [23960] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542287678.perfdata.service-PID-23961 deleted
2018-11-15 08:16:50 [23961] [0] *** Timeout while processing Host: "cal-sw-10a.global.amec.com" Service: "Ping"
2018-11-15 08:16:50 [23961] [0] *** process_perfdata.pl terminated on signal ALRM
[root@att1-nag1 mlilek]# tail -50 /usr/local/nagios/var/npcd.log
[01-03-2019 00:45:58] NPCD: WARN: MAX load reached: load 200.140000/80.000000 at i=1
[01-03-2019 00:46:10] NPCD: WARN: MAX load reached: load 177.340000/80.000000 at i=1
[01-03-2019 00:46:22] NPCD: WARN: MAX load reached: load 153.090000/80.000000 at i=1
[01-03-2019 00:46:34] NPCD: WARN: MAX load reached: load 133.030000/80.000000 at i=1
[01-03-2019 00:46:46] NPCD: WARN: MAX load reached: load 105.550000/80.000000 at i=1
[01-03-2019 00:46:58] NPCD: WARN: MAX load reached: load 91.100000/80.000000 at i=1
[01-03-2019 01:29:40] NPCD: WARN: MAX load reached: load 95.060000/80.000000 at i=0
[01-03-2019 01:29:52] NPCD: WARN: MAX load reached: load 125.580000/80.000000 at i=1
[01-03-2019 01:30:04] NPCD: WARN: MAX load reached: load 147.990000/80.000000 at i=1
[01-03-2019 01:30:16] NPCD: WARN: MAX load reached: load 162.670000/80.000000 at i=1
[01-03-2019 01:30:28] NPCD: WARN: MAX load reached: load 183.580000/80.000000 at i=1
[01-03-2019 01:30:40] NPCD: WARN: MAX load reached: load 192.930000/80.000000 at i=1
[01-03-2019 01:30:52] NPCD: WARN: MAX load reached: load 191.460000/80.000000 at i=1
[01-03-2019 01:31:04] NPCD: WARN: MAX load reached: load 204.210000/80.000000 at i=1
[01-03-2019 01:31:16] NPCD: WARN: MAX load reached: load 211.290000/80.000000 at i=1
[01-03-2019 01:31:28] NPCD: WARN: MAX load reached: load 197.990000/80.000000 at i=1
[01-03-2019 01:31:40] NPCD: WARN: MAX load reached: load 177.180000/80.000000 at i=1
[01-03-2019 01:31:52] NPCD: WARN: MAX load reached: load 143.340000/80.000000 at i=1
[01-03-2019 01:32:04] NPCD: WARN: MAX load reached: load 129.030000/80.000000 at i=1
[01-03-2019 01:32:16] NPCD: WARN: MAX load reached: load 110.880000/80.000000 at i=1
[01-03-2019 01:32:28] NPCD: WARN: MAX load reached: load 89.070000/80.000000 at i=1
[01-07-2019 09:19:47] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 09:22:20] NPCD: npcd Daemon (0.4.14) started with PID=40490
[01-07-2019 09:22:20] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 09:22:20] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 17:46:39] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 17:47:12] NPCD: npcd Daemon (0.4.14) started with PID=43472
[01-07-2019 17:47:12] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 17:47:12] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:05:16] NPCD: npcd Daemon (0.4.14) started with PID=2079
[01-07-2019 18:05:16] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:05:16] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:18:40] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 18:25:04] NPCD: npcd Daemon (0.4.14) started with PID=15756
[01-07-2019 18:25:04] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:25:04] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:38:37] NPCD: npcd Daemon (0.4.14) started with PID=36520
[01-07-2019 18:38:37] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:38:37] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:49:33] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 18:50:09] NPCD: npcd Daemon (0.4.14) started with PID=29586
[01-07-2019 18:50:09] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:50:09] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 19:14:07] NPCD: npcd Daemon (0.4.14) started with PID=36536
[01-07-2019 19:14:07] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 19:14:07] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-08-2019 09:20:37] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-08-2019 09:21:02] NPCD: npcd Daemon (0.4.14) started with PID=57193
[01-08-2019 09:21:02] NPCD: Please have a look at 'npcd -V' to get license information
[01-08-2019 09:21:02] NPCD: HINT: load_threshold is enabled - ('80.000000')
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/xidpe | wc -l
0
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/perfdata/ | wc -l
2
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/checkresults/ | wc -l
0
[root@att1-nag1 mlilek]# tail -50 /var/lib/pgsql/data/pg_log/postgresql-Tue.logps -ef --cols=300
tail: option used in invalid context -- 5
[root@att1-nag1 mlilek]# tail -50 /usr/local/nagios/var/npcd.log
[01-03-2019 00:45:58] NPCD: WARN: MAX load reached: load 200.140000/80.000000 at i=1
[01-03-2019 00:46:10] NPCD: WARN: MAX load reached: load 177.340000/80.000000 at i=1
[01-03-2019 00:46:22] NPCD: WARN: MAX load reached: load 153.090000/80.000000 at i=1
[01-03-2019 00:46:34] NPCD: WARN: MAX load reached: load 133.030000/80.000000 at i=1
[01-03-2019 00:46:46] NPCD: WARN: MAX load reached: load 105.550000/80.000000 at i=1
[01-03-2019 00:46:58] NPCD: WARN: MAX load reached: load 91.100000/80.000000 at i=1
[01-03-2019 01:29:40] NPCD: WARN: MAX load reached: load 95.060000/80.000000 at i=0
[01-03-2019 01:29:52] NPCD: WARN: MAX load reached: load 125.580000/80.000000 at i=1
[01-03-2019 01:30:04] NPCD: WARN: MAX load reached: load 147.990000/80.000000 at i=1
[01-03-2019 01:30:16] NPCD: WARN: MAX load reached: load 162.670000/80.000000 at i=1
[01-03-2019 01:30:28] NPCD: WARN: MAX load reached: load 183.580000/80.000000 at i=1
[01-03-2019 01:30:40] NPCD: WARN: MAX load reached: load 192.930000/80.000000 at i=1
[01-03-2019 01:30:52] NPCD: WARN: MAX load reached: load 191.460000/80.000000 at i=1
[01-03-2019 01:31:04] NPCD: WARN: MAX load reached: load 204.210000/80.000000 at i=1
[01-03-2019 01:31:16] NPCD: WARN: MAX load reached: load 211.290000/80.000000 at i=1
[01-03-2019 01:31:28] NPCD: WARN: MAX load reached: load 197.990000/80.000000 at i=1
[01-03-2019 01:31:40] NPCD: WARN: MAX load reached: load 177.180000/80.000000 at i=1
[01-03-2019 01:31:52] NPCD: WARN: MAX load reached: load 143.340000/80.000000 at i=1
[01-03-2019 01:32:04] NPCD: WARN: MAX load reached: load 129.030000/80.000000 at i=1
[01-03-2019 01:32:16] NPCD: WARN: MAX load reached: load 110.880000/80.000000 at i=1
[01-03-2019 01:32:28] NPCD: WARN: MAX load reached: load 89.070000/80.000000 at i=1
[01-07-2019 09:19:47] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 09:22:20] NPCD: npcd Daemon (0.4.14) started with PID=40490
[01-07-2019 09:22:20] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 09:22:20] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 17:46:39] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 17:47:12] NPCD: npcd Daemon (0.4.14) started with PID=43472
[01-07-2019 17:47:12] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 17:47:12] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:05:16] NPCD: npcd Daemon (0.4.14) started with PID=2079
[01-07-2019 18:05:16] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:05:16] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:18:40] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 18:25:04] NPCD: npcd Daemon (0.4.14) started with PID=15756
[01-07-2019 18:25:04] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:25:04] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:38:37] NPCD: npcd Daemon (0.4.14) started with PID=36520
[01-07-2019 18:38:37] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:38:37] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:49:33] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 18:50:09] NPCD: npcd Daemon (0.4.14) started with PID=29586
[01-07-2019 18:50:09] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:50:09] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 19:14:07] NPCD: npcd Daemon (0.4.14) started with PID=36536
[01-07-2019 19:14:07] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 19:14:07] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-08-2019 09:20:37] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-08-2019 09:21:02] NPCD: npcd Daemon (0.4.14) started with PID=57193
[01-08-2019 09:21:02] NPCD: Please have a look at 'npcd -V' to get license information
[01-08-2019 09:21:02] NPCD: HINT: load_threshold is enabled - ('80.000000')
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/xidpe | wc -l
0
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/perfdata/ | wc -l
2
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/checkresults/ | wc -l
0
[root@att1-nag1 mlilek]# tail -50 /var/lib/pgsql/data/pg_log/postgresql-Tue.log
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No sLOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
[root@att1-nag1 mlilek]#
Re: Database Repair has no effect
Posted: Tue Jan 08, 2019 4:29 pm
by tgriep
Hey Matt, things have been going well, just busy as usual. How have you been?
Thanks for the output.
I think the ramdisk settings may have been changed somehow, so I will need to see these files from the Nagios server.
Code: Select all
/usr/local/nagios/etc/commands.cfg
/usr/local/nagios/etc/nagios.cfg
In this KB article
https://support.nagios.com/kb/article/n ... ce-25.html
Under this section
The postgresql service is not running or the database is not accepting commands
It talks about vacuuming the Postgres database. Do that on the server.
After that, wait 15 minutes, then check the /usr/local/nagiosxi/var/dbmaint.log file to see if the dbmaint process ran.
If it did not, run these commands and post the output here.
Code: Select all
free
tail -50 /usr/local/nagiosxi/var/dbmaint.log
echo "SELECT relname AS objectname, relkind AS objecttype, reltuples, pg_size_pretty(relpages::bigint*8*1024) AS size FROM pg_class WHERE relpages >= 8 ORDER BY relpages DESC;" | psql nagiosxi nagiosxi
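A note on reading that last query: relpages counts on-disk pages, and the 8*1024 factor assumes PostgreSQL's default 8 kB block size, so the "relpages >= 8" filter skips anything smaller than 64 kB. relpages is also only an estimate that is refreshed by VACUUM and ANALYZE, so the sizes can be stale until a vacuum has run. A minimal sketch of the arithmetic (the function name pages_to_bytes is made up for illustration):

```shell
# Hypothetical helper: convert a pg_class.relpages count to bytes,
# assuming the default 8 kB PostgreSQL block size used in the query above.
pages_to_bytes() {
    echo $(( $1 * 8 * 1024 ))
}

pages_to_bytes 8        # smallest relation the query reports
pages_to_bytes 131072   # a relation of 131072 pages
```

The two calls print 65536 (64 kB) and 1073741824 (1 GB).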
Re: Database Repair has no effect
Posted: Tue Jan 08, 2019 4:59 pm
by matt.lilek
Hey Tom,
Before I get into the RAM disk issue, the Database Backend, Database Maintenance, Event Manager and System Statistics are not running again. Please advise.
Code: Select all
tail -50 /var/lib/pgsql/data/pg_log/postgresql-Tue.logps -ef --cols=300
tail: option used in invalid context -- 5
[root@att1-nag1 mlilek]# tail -50 /usr/local/nagios/var/perfdata.log
2016-11-20 06:35:01 [8418] [0] *** Timeout while processing Host: "msa-meta.am.int.amec.com" Service: "Ping"
2016-11-20 06:35:01 [8418] [0] *** process_perfdata.pl terminated on signal ALRM
2017-01-13 18:26:27 [54467] [0] *** TIMEOUT: Timeout after 80 secs. ***
2017-01-13 18:26:28 [54467] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2017-01-13 18:26:28 [54467] [0] *** TIMEOUT: Please check your npcd.cfg
2017-01-13 18:26:28 [54467] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1484349842.perfdata.service-PID-54467 deleted
2017-01-13 18:26:28 [54467] [0] *** Timeout while processing Host: "npl-vpn-edge.global.amec.com" Service: "Ping"
2017-01-13 18:26:28 [54467] [0] *** process_perfdata.pl terminated on signal ALRM
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: Timeout after 80 secs. ***
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: Please check your npcd.cfg
2017-07-07 11:07:24 [48730] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1499439947.perfdata.service-PID-48730 deleted
2017-07-07 11:07:24 [48730] [0] *** Timeout while processing Host: "att1-vw-as5.am.int.amec.com" Service: "Ping"
2017-07-07 11:07:24 [48730] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:02:06 [43579] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150021.perfdata.host-PID-43579 deleted
2018-11-13 18:02:06 [43580] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150022.perfdata.service-PID-43580 deleted
2018-11-13 18:02:06 [43580] [0] *** Timeout while processing Host: "cin-cue.global.amec.com" Service: "Ping"
2018-11-13 18:02:06 [43579] [0] *** Timeout while processing Host: "mty-gdc0.global.amec.com" Service: "_HOST_"
2018-11-13 18:02:06 [43580] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:02:06 [43579] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:03:39 [44396] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150036.perfdata.host-PID-44396 deleted
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-13 18:03:39 [44396] [0] *** Timeout while processing Host: "edc01-ora10p1.global.amec.com" Service: "_HOST_"
2018-11-13 18:03:39 [44396] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-13 18:03:39 [44397] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542150037.perfdata.service-PID-44397 deleted
2018-11-13 18:03:39 [44397] [0] *** Timeout while processing Host: "edm-plt1.global.amec.com" Service: "Ping"
2018-11-13 18:03:39 [44397] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-15 08:16:47 [23960] [0] *** TIMEOUT: Timeout after 80 secs. ***
2018-11-15 08:16:47 [23960] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-11-15 08:16:48 [23960] [0] *** TIMEOUT: Please check your npcd.cfg
2018-11-15 08:16:48 [23960] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542287678.perfdata.host-PID-23960 deleted
2018-11-15 08:16:48 [23960] [0] *** Timeout while processing Host: "sas1-sw-core1.global.amec.com" Service: "_HOST_"
2018-11-15 08:16:48 [23960] [0] *** process_perfdata.pl terminated on signal ALRM
2018-11-15 08:16:47 [23961] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//1542287678.perfdata.service-PID-23961 deleted
2018-11-15 08:16:50 [23961] [0] *** Timeout while processing Host: "cal-sw-10a.global.amec.com" Service: "Ping"
2018-11-15 08:16:50 [23961] [0] *** process_perfdata.pl terminated on signal ALRM
[root@att1-nag1 mlilek]# tail -50 /usr/local/nagios/var/npcd.log
[01-03-2019 00:45:58] NPCD: WARN: MAX load reached: load 200.140000/80.000000 at i=1
[01-03-2019 00:46:10] NPCD: WARN: MAX load reached: load 177.340000/80.000000 at i=1
[01-03-2019 00:46:22] NPCD: WARN: MAX load reached: load 153.090000/80.000000 at i=1
[01-03-2019 00:46:34] NPCD: WARN: MAX load reached: load 133.030000/80.000000 at i=1
[01-03-2019 00:46:46] NPCD: WARN: MAX load reached: load 105.550000/80.000000 at i=1
[01-03-2019 00:46:58] NPCD: WARN: MAX load reached: load 91.100000/80.000000 at i=1
[01-03-2019 01:29:40] NPCD: WARN: MAX load reached: load 95.060000/80.000000 at i=0
[01-03-2019 01:29:52] NPCD: WARN: MAX load reached: load 125.580000/80.000000 at i=1
[01-03-2019 01:30:04] NPCD: WARN: MAX load reached: load 147.990000/80.000000 at i=1
[01-03-2019 01:30:16] NPCD: WARN: MAX load reached: load 162.670000/80.000000 at i=1
[01-03-2019 01:30:28] NPCD: WARN: MAX load reached: load 183.580000/80.000000 at i=1
[01-03-2019 01:30:40] NPCD: WARN: MAX load reached: load 192.930000/80.000000 at i=1
[01-03-2019 01:30:52] NPCD: WARN: MAX load reached: load 191.460000/80.000000 at i=1
[01-03-2019 01:31:04] NPCD: WARN: MAX load reached: load 204.210000/80.000000 at i=1
[01-03-2019 01:31:16] NPCD: WARN: MAX load reached: load 211.290000/80.000000 at i=1
[01-03-2019 01:31:28] NPCD: WARN: MAX load reached: load 197.990000/80.000000 at i=1
[01-03-2019 01:31:40] NPCD: WARN: MAX load reached: load 177.180000/80.000000 at i=1
[01-03-2019 01:31:52] NPCD: WARN: MAX load reached: load 143.340000/80.000000 at i=1
[01-03-2019 01:32:04] NPCD: WARN: MAX load reached: load 129.030000/80.000000 at i=1
[01-03-2019 01:32:16] NPCD: WARN: MAX load reached: load 110.880000/80.000000 at i=1
[01-03-2019 01:32:28] NPCD: WARN: MAX load reached: load 89.070000/80.000000 at i=1
[01-07-2019 09:19:47] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 09:22:20] NPCD: npcd Daemon (0.4.14) started with PID=40490
[01-07-2019 09:22:20] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 09:22:20] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 17:46:39] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 17:47:12] NPCD: npcd Daemon (0.4.14) started with PID=43472
[01-07-2019 17:47:12] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 17:47:12] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:05:16] NPCD: npcd Daemon (0.4.14) started with PID=2079
[01-07-2019 18:05:16] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:05:16] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:18:40] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 18:25:04] NPCD: npcd Daemon (0.4.14) started with PID=15756
[01-07-2019 18:25:04] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:25:04] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:38:37] NPCD: npcd Daemon (0.4.14) started with PID=36520
[01-07-2019 18:38:37] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:38:37] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 18:49:33] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-07-2019 18:50:09] NPCD: npcd Daemon (0.4.14) started with PID=29586
[01-07-2019 18:50:09] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 18:50:09] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-07-2019 19:14:07] NPCD: npcd Daemon (0.4.14) started with PID=36536
[01-07-2019 19:14:07] NPCD: Please have a look at 'npcd -V' to get license information
[01-07-2019 19:14:07] NPCD: HINT: load_threshold is enabled - ('80.000000')
[01-08-2019 09:20:37] NPCD: Caught Termination Signal - Hasta la vista... baby
[01-08-2019 09:21:02] NPCD: npcd Daemon (0.4.14) started with PID=57193
[01-08-2019 09:21:02] NPCD: Please have a look at 'npcd -V' to get license information
[01-08-2019 09:21:02] NPCD: HINT: load_threshold is enabled - ('80.000000')
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/xidpe | wc -l
0
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/perfdata/ | wc -l
2
[root@att1-nag1 mlilek]# ls /var/nagiosramdisk/spool/checkresults/ | wc -l
0
[root@att1-nag1 mlilek]# tail -50 /var/lib/pgsql/data/pg_log/postgresql-Tue.log
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No sLOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
LOG: could not write temporary statistics file "pg_stat_tmp/pgstat.tmp": No space left on device
[root@att1-nag1 mlilek]# tail -50 /var/lib/pgsql/pgstartup.log
could not write to log file: No space left on device
FATAL: could not write lock file "postmaster.pid": No space left on device
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 63382) running in data directory "/var/lib/pgsql/data"?
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 63382) running in data directory "/var/lib/pgsql/data"?
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 65438) running in data directory "/var/lib/pgsql/data"?
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 1683) running in data directory "/var/lib/pgsql/data"?
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 1683) running in data directory "/var/lib/pgsql/data"?
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 1678) running in data directory "/var/lib/pgsql/data"?
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 47220) running in data directory "/var/lib/pgsql/data"?
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 64328) running in data directory "/var/lib/pgsql/data"?
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 64328) running in data directory "/var/lib/pgsql/data"?
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to log file: No space left on device
could not write to lo[root@att1-nag1 mlilek]#
Re: Database Repair has no effect
Posted: Tue Jan 08, 2019 5:10 pm
by matt.lilek
Quickly noticed I was out of space all of a sudden, didn't expect that. I ran echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | psql nagiosxi nagiosxi and it cleared up 27GB. Got the backend started again and am waiting for the Database Maintenance to run. If there is something else I should be looking at, please let me know.
Re: Database Repair has no effect
Posted: Tue Jan 08, 2019 5:10 pm
by tgriep
Vacuum the Postgres database to clean it up, restart it, and then run the commands that I posted earlier.
The out of space message may be caused by tables in the Postgres database that need to be cleaned up, and vacuuming may help fix that.
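A quick way to confirm where the space stands before and after the cleanup is to check how full the filesystems holding the database and the ramdisk are. This is only a sketch: the helper name fs_used_pct is made up, and it assumes the standard POSIX "df -P" column layout.

```shell
# Hypothetical helper: print the percentage of space used on the
# filesystem that holds the given path (POSIX df output format).
fs_used_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Paths taken from this thread; fall back to / when a path does not exist.
for p in /var/lib/pgsql /var/nagiosramdisk; do
    [ -e "$p" ] || p=/
    echo "$p: $(fs_used_pct "$p")% used"
done
```

If either path is still close to 100% after truncating and vacuuming, something other than those tables is likely holding the space.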
Re: Database Repair has no effect
Posted: Tue Jan 08, 2019 5:16 pm
by matt.lilek
What is the command to vacuum again? It used to be in my email before I was terminated. Oh, and truncating has also brought memory usage down from 14GB to 4GB and subsequently fixed the ramdisk issue as well.
Re: Database Repair has no effect
Posted: Tue Jan 08, 2019 5:25 pm
by tgriep
The commands are in this KB article
https://support.nagios.com/kb/article/n ... ce-25.html
Under this section
The postgresql service is not running or the database is not accepting commands
Depending on the Postgres version, the commands are slightly different.
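To see which version applies, the psql client can report it. A small sketch, assuming the usual "psql (PostgreSQL) X.Y.Z" banner format; the helper name pg_major_minor is made up, and the sample banner in the comment is illustrative rather than taken from this server.

```shell
# Hypothetical helper: pull "major.minor" out of a version banner such as
# the one printed by `psql --version`, e.g. "psql (PostgreSQL) 8.4.20".
pg_major_minor() {
    echo "$1" | awk '{ print $3 }' | cut -d. -f1,2
}

# Only query psql if it is installed on this machine.
if command -v psql >/dev/null 2>&1; then
    pg_major_minor "$(psql --version | head -n 1)"
fi
```

Match the printed version against the variants listed in the KB article before running the vacuum commands.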
Re: Database Repair has no effect
Posted: Wed Jan 09, 2019 9:42 am
by matt.lilek
Hey Tom, sorry to have made you post that reply. I didn't even read what you wrote above until after I posted the question. I should have written back telling you I am able to read for myself. Thanks for the spoon feed though. Sometimes it is needed, just not this time (on this part). Anyway... Looks like I have been back up and running since shortly after your last post. I think you can wrap this up.