Can't acknowledge alerts again
- snapon_admin
- Posts: 952
- Joined: Mon Jun 10, 2013 10:39 am
- Location: Kenosha, WI
- Contact:
Can't acknowledge alerts again
I've had this issue before where no one can acknowledge alerts (https://support.nagios.com/forum/viewto ... cknowledge). I've tried the things I did in that thread to correct the issue, but this time they did not fix it. Anything else I can try? Also, is there a way to prevent this from happening? It seems like it happens fairly frequently to us.
Re: Can't acknowledge alerts again
Do you see any errors in the Apache error logs when you try to acknowledge an alert?
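A minimal sketch of one way to look for such errors, assuming the usual Apache error log location on RHEL/CentOS (/var/log/httpd/error_log). The path and the helper name recent_errors are assumptions, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: show lines mentioning "error" among the last N lines of a log.
# The default path is an assumption (typical RHEL/CentOS Apache layout); adjust as needed.
recent_errors() {
    log="${1:-/var/log/httpd/error_log}"
    lines="${2:-200}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # grep exits non-zero when nothing matches; report that case explicitly
    tail -n "$lines" "$log" | grep -i 'error' \
        || echo "no matching lines in the last $lines lines of $log"
}
```

Running `recent_errors /var/log/httpd/error_log 500` right after a failed acknowledgement keeps the output limited to recent entries.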
Can you run the following as root on the Nagios server and post the output?
Code: Select all
ps -ef --cols=300
ls -al /usr/local/nagios/var/
ls -al /usr/local/nagios/var/rw/
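Commands submitted from the web interface, acknowledgements included, are written to the external command file, which Nagios Core creates as a named pipe (FIFO). The last listing is presumably requested to check that file. The following is a minimal sketch of such a check, assuming the default path shown above; check_cmd_pipe is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report whether the external command file is a named pipe.
# The default path is the one used in this thread; pass another path to override it.
check_cmd_pipe() {
    cmd_file="${1:-/usr/local/nagios/var/rw/nagios.cmd}"
    if [ -p "$cmd_file" ]; then
        echo "OK: $cmd_file is a named pipe"
    elif [ -e "$cmd_file" ]; then
        echo "WARN: $cmd_file exists but is not a named pipe"
        return 1
    else
        echo "WARN: $cmd_file does not exist"
        return 2
    fi
}
```

In `ls -al` output, a named pipe has a mode string starting with `p` (for example `prw-rw----`), whereas a regular file starts with `-`.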
- snapon_admin
Re: Can't acknowledge alerts again
Code: Select all
[root@lisl-ngos-01-pv snmptt]# ps -ef --cols=300
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Nov09 ? 00:10:07 /sbin/init
root 2 0 0 Nov09 ? 00:00:00 [kthreadd]
root 3 2 0 Nov09 ? 00:11:16 [migration/0]
root 4 2 0 Nov09 ? 00:04:26 [ksoftirqd/0]
root 5 2 0 Nov09 ? 00:00:00 [stopper/0]
root 6 2 0 Nov09 ? 00:00:34 [watchdog/0]
root 7 2 0 Nov09 ? 00:05:44 [migration/1]
root 8 2 0 Nov09 ? 00:00:00 [stopper/1]
root 9 2 0 Nov09 ? 00:01:29 [ksoftirqd/1]
root 10 2 0 Nov09 ? 00:00:03 [watchdog/1]
root 11 2 0 Nov09 ? 00:05:17 [migration/2]
root 12 2 0 Nov09 ? 00:00:00 [stopper/2]
root 13 2 0 Nov09 ? 00:03:07 [ksoftirqd/2]
root 14 2 0 Nov09 ? 00:00:04 [watchdog/2]
root 15 2 0 Nov09 ? 00:05:01 [migration/3]
root 16 2 0 Nov09 ? 00:00:00 [stopper/3]
root 17 2 0 Nov09 ? 00:01:32 [ksoftirqd/3]
root 18 2 0 Nov09 ? 00:00:13 [watchdog/3]
root 19 2 0 Nov09 ? 00:05:05 [migration/4]
root 20 2 0 Nov09 ? 00:00:00 [stopper/4]
root 21 2 0 Nov09 ? 00:03:30 [ksoftirqd/4]
root 22 2 0 Nov09 ? 00:00:08 [watchdog/4]
root 23 2 0 Nov09 ? 00:05:46 [migration/5]
root 24 2 0 Nov09 ? 00:00:00 [stopper/5]
root 25 2 0 Nov09 ? 00:01:35 [ksoftirqd/5]
root 26 2 0 Nov09 ? 00:00:21 [watchdog/5]
root 27 2 0 Nov09 ? 00:05:23 [migration/6]
root 28 2 0 Nov09 ? 00:00:00 [stopper/6]
root 29 2 0 Nov09 ? 00:03:19 [ksoftirqd/6]
root 30 2 0 Nov09 ? 00:00:15 [watchdog/6]
root 31 2 0 Nov09 ? 00:06:01 [migration/7]
root 32 2 0 Nov09 ? 00:00:00 [stopper/7]
root 33 2 0 Nov09 ? 00:01:31 [ksoftirqd/7]
root 34 2 0 Nov09 ? 00:00:02 [watchdog/7]
root 35 2 0 Nov09 ? 00:04:21 [events/0]
root 36 2 0 Nov09 ? 00:01:36 [events/1]
root 37 2 0 Nov09 ? 00:01:44 [events/2]
root 38 2 0 Nov09 ? 00:01:29 [events/3]
root 39 2 0 Nov09 ? 00:02:00 [events/4]
root 40 2 0 Nov09 ? 00:01:46 [events/5]
root 41 2 0 Nov09 ? 00:02:39 [events/6]
root 42 2 0 Nov09 ? 00:04:00 [events/7]
root 43 2 0 Nov09 ? 00:00:00 [events/0]
root 44 2 0 Nov09 ? 00:00:00 [events/1]
root 45 2 0 Nov09 ? 00:00:00 [events/2]
root 46 2 0 Nov09 ? 00:00:00 [events/3]
root 47 2 0 Nov09 ? 00:00:00 [events/4]
root 48 2 0 Nov09 ? 00:00:00 [events/5]
root 49 2 0 Nov09 ? 00:00:00 [events/6]
root 50 2 0 Nov09 ? 00:00:00 [events/7]
root 51 2 0 Nov09 ? 00:00:00 [events_long/0]
root 52 2 0 Nov09 ? 00:00:00 [events_long/1]
root 53 2 0 Nov09 ? 00:00:00 [events_long/2]
root 54 2 0 Nov09 ? 00:00:00 [events_long/3]
root 55 2 0 Nov09 ? 00:00:00 [events_long/4]
root 56 2 0 Nov09 ? 00:00:00 [events_long/5]
root 57 2 0 Nov09 ? 00:00:00 [events_long/6]
root 58 2 0 Nov09 ? 00:00:00 [events_long/7]
root 59 2 0 Nov09 ? 00:00:00 [events_power_ef]
root 60 2 0 Nov09 ? 00:00:00 [events_power_ef]
root 61 2 0 Nov09 ? 00:00:00 [events_power_ef]
root 62 2 0 Nov09 ? 00:00:00 [events_power_ef]
root 63 2 0 Nov09 ? 00:00:00 [events_power_ef]
root 64 2 0 Nov09 ? 00:00:00 [events_power_ef]
root 65 2 0 Nov09 ? 00:00:00 [events_power_ef]
root 66 2 0 Nov09 ? 00:00:00 [events_power_ef]
root 67 2 0 Nov09 ? 00:00:00 [cgroup]
root 68 2 0 Nov09 ? 00:00:00 [khelper]
root 69 2 0 Nov09 ? 00:00:00 [netns]
root 70 2 0 Nov09 ? 00:00:00 [async/mgr]
root 71 2 0 Nov09 ? 00:00:00 [pm]
root 72 2 0 Nov09 ? 00:00:08 [sync_supers]
root 73 2 0 Nov09 ? 00:00:00 [bdi-default]
root 74 2 0 Nov09 ? 00:00:00 [kintegrityd/0]
root 75 2 0 Nov09 ? 00:00:00 [kintegrityd/1]
root 76 2 0 Nov09 ? 00:00:00 [kintegrityd/2]
root 77 2 0 Nov09 ? 00:00:00 [kintegrityd/3]
root 78 2 0 Nov09 ? 00:00:00 [kintegrityd/4]
root 79 2 0 Nov09 ? 00:00:00 [kintegrityd/5]
root 80 2 0 Nov09 ? 00:00:00 [kintegrityd/6]
root 81 2 0 Nov09 ? 00:00:00 [kintegrityd/7]
root 82 2 0 Nov09 ? 00:05:24 [kblockd/0]
root 83 2 0 Nov09 ? 00:00:18 [kblockd/1]
root 84 2 0 Nov09 ? 00:06:08 [kblockd/2]
root 85 2 0 Nov09 ? 00:00:18 [kblockd/3]
root 86 2 0 Nov09 ? 00:06:08 [kblockd/4]
root 87 2 0 Nov09 ? 00:00:16 [kblockd/5]
root 88 2 0 Nov09 ? 00:06:01 [kblockd/6]
root 89 2 0 Nov09 ? 00:00:14 [kblockd/7]
root 90 2 0 Nov09 ? 00:00:00 [kacpid]
root 91 2 0 Nov09 ? 00:00:00 [kacpi_notify]
root 92 2 0 Nov09 ? 00:00:00 [kacpi_hotplug]
root 93 2 0 Nov09 ? 00:00:00 [ata_aux]
root 94 2 0 Nov09 ? 00:00:00 [ata_sff/0]
root 95 2 0 Nov09 ? 00:00:00 [ata_sff/1]
root 96 2 0 Nov09 ? 00:00:00 [ata_sff/2]
root 97 2 0 Nov09 ? 00:00:00 [ata_sff/3]
root 98 2 0 Nov09 ? 00:00:00 [ata_sff/4]
root 99 2 0 Nov09 ? 00:00:00 [ata_sff/5]
root 100 2 0 Nov09 ? 00:00:00 [ata_sff/6]
root 101 2 0 Nov09 ? 00:00:00 [ata_sff/7]
root 102 2 0 Nov09 ? 00:00:00 [ksuspend_usbd]
root 103 2 0 Nov09 ? 00:00:00 [khubd]
root 104 2 0 Nov09 ? 00:00:00 [kseriod]
root 105 2 0 Nov09 ? 00:00:00 [md/0]
root 106 2 0 Nov09 ? 00:00:00 [md/1]
root 107 2 0 Nov09 ? 00:00:00 [md/2]
root 108 2 0 Nov09 ? 00:00:00 [md/3]
root 109 2 0 Nov09 ? 00:00:00 [md/4]
root 110 2 0 Nov09 ? 00:00:00 [md/5]
root 111 2 0 Nov09 ? 00:00:00 [md/6]
root 112 2 0 Nov09 ? 00:00:00 [md/7]
root 113 2 0 Nov09 ? 00:00:00 [md_misc/0]
root 114 2 0 Nov09 ? 00:00:00 [md_misc/1]
root 115 2 0 Nov09 ? 00:00:00 [md_misc/2]
root 116 2 0 Nov09 ? 00:00:00 [md_misc/3]
root 117 2 0 Nov09 ? 00:00:00 [md_misc/4]
root 118 2 0 Nov09 ? 00:00:00 [md_misc/5]
root 119 2 0 Nov09 ? 00:00:00 [md_misc/6]
root 120 2 0 Nov09 ? 00:00:00 [md_misc/7]
root 121 2 0 Nov09 ? 00:00:00 [linkwatch]
root 123 2 0 Nov09 ? 00:00:01 [khungtaskd]
root 124 2 0 Nov09 ? 00:10:01 [kswapd0]
root 125 2 0 Nov09 ? 00:00:00 [ksmd]
root 126 2 0 Nov09 ? 00:05:48 [khugepaged]
root 127 2 0 Nov09 ? 00:00:00 [aio/0]
root 128 2 0 Nov09 ? 00:00:00 [aio/1]
root 129 2 0 Nov09 ? 00:00:00 [aio/2]
root 130 2 0 Nov09 ? 00:00:00 [aio/3]
root 131 2 0 Nov09 ? 00:00:00 [aio/4]
root 132 2 0 Nov09 ? 00:00:00 [aio/5]
root 133 2 0 Nov09 ? 00:00:00 [aio/6]
root 134 2 0 Nov09 ? 00:00:00 [aio/7]
root 135 2 0 Nov09 ? 00:00:00 [crypto/0]
root 136 2 0 Nov09 ? 00:00:00 [crypto/1]
root 137 2 0 Nov09 ? 00:00:00 [crypto/2]
root 138 2 0 Nov09 ? 00:00:00 [crypto/3]
root 139 2 0 Nov09 ? 00:00:00 [crypto/4]
root 140 2 0 Nov09 ? 00:00:00 [crypto/5]
root 141 2 0 Nov09 ? 00:00:00 [crypto/6]
root 142 2 0 Nov09 ? 00:00:00 [crypto/7]
root 149 2 0 Nov09 ? 00:00:00 [kthrotld/0]
root 150 2 0 Nov09 ? 00:00:00 [kthrotld/1]
root 151 2 0 Nov09 ? 00:00:00 [kthrotld/2]
root 152 2 0 Nov09 ? 00:00:00 [kthrotld/3]
root 153 2 0 Nov09 ? 00:00:00 [kthrotld/4]
root 154 2 0 Nov09 ? 00:00:00 [kthrotld/5]
root 155 2 0 Nov09 ? 00:00:00 [kthrotld/6]
root 156 2 0 Nov09 ? 00:00:00 [kthrotld/7]
root 157 2 0 Nov09 ? 00:00:00 [pciehpd]
root 159 2 0 Nov09 ? 00:00:00 [kpsmoused]
root 160 2 0 Nov09 ? 00:00:00 [usbhid_resumer]
root 161 2 0 Nov09 ? 00:00:00 [deferwq]
root 194 2 0 Nov09 ? 00:00:00 [kdmremove]
root 195 2 0 Nov09 ? 00:00:00 [kstriped]
root 226 2 0 Nov09 ? 00:00:00 [ttm_swap]
postgres 315 2069 0 16:33 ? 00:00:04 postgres: nagiosxi nagiosxi ::1(51316) idle
root 407 2 0 Nov09 ? 00:00:00 [scsi_eh_0]
root 408 2 0 Nov09 ? 00:00:00 [scsi_eh_1]
apache 411 24325 8 16:33 ? 00:01:48 /usr/sbin/httpd
root 497 2 0 Nov09 ? 00:00:48 [mpt_poll_0]
root 498 2 0 Nov09 ? 00:00:00 [mpt/0]
root 499 2 0 Nov09 ? 00:00:00 [scsi_eh_2]
postgres 564 2069 0 16:33 ? 00:00:03 postgres: nagiosxi nagiosxi ::1(51572) idle
root 568 2 0 Nov09 ? 00:00:00 [kdmflush]
root 569 2 0 Nov09 ? 00:00:00 [kdmflush]
root 588 2 0 Nov09 ? 00:32:41 [jbd2/dm-1-8]
root 589 2 0 Nov09 ? 00:00:00 [ext4-dio-unwrit]
root 655 2219 0 16:55 ? 00:00:00 CROND
root 660 2219 0 16:55 ? 00:00:00 CROND
root 661 2219 0 16:55 ? 00:00:00 CROND
root 662 2219 0 16:55 ? 00:00:00 CROND
root 663 2219 0 16:55 ? 00:00:00 CROND
root 664 2219 0 16:55 ? 00:00:00 CROND
root 665 2219 0 16:55 ? 00:00:00 CROND
nagios 684 662 0 16:55 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/event_handler.php >> /usr/local/nagiosxi/var/event_handler.log 2>&1
nagios 688 660 0 16:55 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php >> /usr/local/nagiosxi/var/perfdataproc.log 2>&1
root 689 655 9 16:55 ? 00:00:01 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
root 691 1 0 Nov09 ? 00:00:00 /sbin/udevd -d
nagios 692 661 0 16:55 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php >> /usr/local/nagiosxi/var/feedproc.log 2>&1
nagios 694 663 0 16:55 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php >> /usr/local/nagiosxi/var/eventman.log 2>&1
nagios 695 665 0 16:55 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php >> /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 696 664 0 16:55 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php >> /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
root 924 2 0 Nov09 ? 00:00:44 [vmmemctl]
nagios 1063 29133 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.146.146.254 -w 3000.0 80 -c 5000.0 100 -p 5
root 1224 2 0 Nov09 ? 00:02:46 [kauditd]
nagios 1274 29134 1 16:55 ? 00:00:00 /usr/bin/perl /usr/local/nagios/libexec/check_openmanage.pl -s -t 90 -H 10.6.33.30 -C openviewm -b bp=0
nagios 1276 29129 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.245.128.140 -C ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_CPU_Info --server 10.245.128.130 --host lisl-vesx-06-pp.snaponglobal.com
nagios 1289 29131 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H kenprod05g.snapon.com -p 5668 -t 70 -c check_total_zpool_io -a -w 80 -c 90
nagios 1294 1276 0 16:55 ? 00:00:00 /usr/bin/ssh -l vi-admin 10.245.128.140 ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_CPU_Info --server 10.245.128.130 --host lisl-vesx-06-pp.snaponglobal.com
root 1390 2 0 Nov09 ? 02:08:36 [flush-253:1]
root 1407 1 0 Nov22 ? 00:00:00 python /usr/local/bin/snmptraphandling.py 10.73.19.2 SNMP Traps Normal 1511402203 enterprises.9.2.1.5.0 ():10.73.104.59 enterprises.9.9.412.1.1.1.0 ():1 enterprises.9.9.412.1.1.2.0 ():10.73.104.59 An authenticationFailure trap signifies that the SNMP 1
nagios 1457 29131 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.245.128.140 -C ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_Status --server 10.245.128.130 --host lisl-vesx-17-pp.snaponglobal.com
nagios 1458 1457 0 16:55 ? 00:00:00 /usr/bin/ssh -l vi-admin 10.245.128.140 ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_Status --server 10.245.128.130 --host lisl-vesx-17-pp.snaponglobal.com
root 1468 689 5 16:55 ? 00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
root 1469 689 0 16:55 ? 00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
root 1470 689 1 16:55 ? 00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
root 1471 689 12 16:55 ? 00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
smmsp 1473 655 0 16:55 ? 00:00:00 /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t -f root
nagios 1475 29129 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisdbms14p.snapon.com -u -p 5668 -t 100 -c check_zone_mem -a -n lisdbms14p -m 100 -w 80 -c 90
nagios 1498 29134 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lishadb26p.snapon.com -u -p 5668 -t 100 -c check_zone_cpu -a -n lishadb26p -w 80 -c 90
root 1501 1 0 Nov09 ? 00:02:53 auditd
root 1523 1 0 Nov09 ? 00:06:13 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
root 1559 1 0 Nov09 ? 00:01:43 irqbalance --pid=/var/run/irqbalance.pid
dbus 1578 1 0 Nov09 ? 00:00:01 dbus-daemon --system
root 1626 1 0 Nov09 ? 00:00:00 /usr/sbin/acpid
68 1638 1 0 Nov09 ? 00:00:11 hald
root 1639 1638 0 Nov09 ? 00:00:00 hald-runner
nagios 1643 29127 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisdbqy01p.snapon.com -u -p 5668 -t 100 -c check_zone_cpu -a -n lisdbqy01p -w 80 -c 90
root 1671 1639 0 Nov09 ? 00:00:00 hald-addon-input: Listening on /dev/input/event2 /dev/input/event0
68 1681 1639 0 Nov09 ? 00:00:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
nagios 1684 29136 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H kenprod05g.snapon.com -u -p 5668 -t 70 -c check_swap -a -w 50 -c 40
nagios 1693 694 5 16:55 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
postgres 1698 2069 0 16:55 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(51502) idle
nagios 1715 29128 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisprod14g.snapon.com -p 5668 -t 70 -c check_total_zpool_io -a -w 80 -c 90
nagios 1724 29136 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H kendbdg06p.snapon.com -u -p 5668 -t 100 -c check_zone_mem -a -n kendbdg06p -w 80 -c 90
nagios 1772 29126 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H kendbdg22p.snapon.com -u -p 5668 -t 100 -c check_zone_mem -a -n kendbdg22p -w 80 -c 90
root 1803 1 0 Nov09 ? 00:06:09 /usr/sbin/snmpd -LS0-6d -Lf /dev/null -p /var/run/snmpd.pid
nagios 1807 29130 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisachr03p.snapon.com -u -p 5668 -t 100 -c check_zone_mem -a -n lisachr03p -m 100 -w 80 -c 90
root 1814 1 0 Nov09 ? 00:29:42 /usr/sbin/snmptrapd -Lsd -On -p /var/run/snmptrapd.pid
root 1851 1 0 Nov09 ? 00:01:25 /usr/sbin/sshd
root 1862 1 0 Nov09 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
nagios 1892 29129 8 16:55 ? 00:00:00 /usr/bin/perl /usr/local/nagios/libexec/check_openmanage.pl -s -t 90 -H 10.9.129.113 -C openviewm -b bp=0
nagios 1893 29131 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisprod05g.snapon.com -u -p 5668 -t 70 -c check_cpu_stats -a -w :80 -c :90
nagios 1942 29127 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.0.18.111 -C ~/box293_check_vmware.pl --concurrent_checks 400 --username NagiosRO --check Host_Storage_Adapter_Info --server 10.0.18.128 --host keno-vesx-19-pp.snaponglobal.com
nagios 1943 1942 1 16:55 ? 00:00:00 /usr/bin/ssh -l vi-admin 10.0.18.111 ~/box293_check_vmware.pl --concurrent_checks 400 --username NagiosRO --check Host_Storage_Adapter_Info --server 10.0.18.128 --host keno-vesx-19-pp.snaponglobal.com
apache 1956 24325 5 14:29 ? 00:08:39 /usr/sbin/httpd
nagios 1962 688 9 16:55 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios 1965 696 9 16:55 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
nagios 1967 684 8 16:55 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/event_handler.php
nagios 1970 29125 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_snmp -H 10.0.18.231 -t 60 -C Sn4ponCudaH -o .1.3.6.1.4.1.20632.5.7
nagios 1971 695 19 16:55 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
nagios 1972 692 9 16:55 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php
nagios 1974 1970 0 16:55 ? 00:00:00 /usr/bin/snmpget -Le -t 60 -r 5 -m -v 1 -c 10.0.18.231:161 .1.3.6.1.4.1.20632.5.7
nagios 1975 29126 8 16:55 ? 00:00:00 /usr/bin/perl /usr/local/nagios/libexec/check_openmanage.pl -s -t 90 -H 10.9.129.45 -C openviewm -b bp=0
postgres 1988 2069 0 14:29 ? 00:00:20 postgres: nagiosxi nagiosxi ::1(38528) idle
nagios 1993 29128 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.0.18.111 -C ~/box293_check_vmware.pl --concurrent_checks 400 --username NagiosRO --check Host_Memory_Usage --server 10.0.18.128 --host keno-vesx-20-pp.snaponglobal.com --warning memory
nagios 1995 1993 0 16:55 ? 00:00:00 /usr/bin/ssh -l vi-admin 10.0.18.111 ~/box293_check_vmware.pl --concurrent_checks 400 --username NagiosRO --check Host_Memory_Usage --server 10.0.18.128 --host keno-vesx-20-pp.snaponglobal.com --warning memory_used:409 --critical memory_used:460
postgres 1996 2069 0 16:55 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(51646) idle
postgres 1999 2069 0 16:55 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(51650) idle
postgres 2005 2069 0 16:55 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(51654) idle
postgres 2016 2069 0 16:55 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(51656) idle
postgres 2018 2069 0 16:55 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(51658) idle
nagios 2039 29135 0 16:55 ? 00:00:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_nwc_health --hostname 10.6.84.1 --community Sn4p0nF1r3W4LL --mode cpu-load --units % --warning 80 --critical 90
nagios 2042 29136 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.19.129.160 -C ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_vNIC_Status --server 10.19.129.150 --host conw-vesx-03-pp.snaponglobal.com
nagios 2043 2042 0 16:55 ? 00:00:00 /usr/bin/ssh -l vi-admin 10.19.129.160 ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_vNIC_Status --server 10.19.129.150 --host conw-vesx-03-pp.snaponglobal.com
nagios 2056 29131 0 16:55 ? 00:00:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_nwc_health --hostname 10.160.19.2 --community Sn4p0nC0r3s --mode memory-usage --units % --warning 80 --critical 90
nagios 2065 29135 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H kendbdg24p.snapon.com -u -p 5668 -t 70 -c check_init_service -a svc:/network/sendmail-client:default
nagios 2066 29132 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisacic01p.snapon.com -p 5668 -t 70 -c check_cpu_latency -a -w 80 -c 90
nagios 2067 29130 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H kendbdg35p.snapon.com -u -p 5668 -t 70 -c check_init_service -a svc:/network/rpc/gss:default
postgres 2069 1 0 Nov09 ? 00:05:42 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
nagios 2070 29127 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lishadb24p.snapon.com -u -p 5668 -t 70 -c check_init_service -a svc:/network/sendmail-client:default
nagios 2071 29134 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisqpas01p.snapon.com -u -p 5668 -t 100 -c check_zone_mem -a -n lisqpas01p -m 100 -w 80 -c 90
nagios 2072 1971 0 16:55 ? 00:00:00 /bin/sh /etc/init.d/nagios status
nagios 2075 2232 0 16:55 ? 00:00:00 /usr/bin/perl /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1512514510.perfdata.service
nagios 2076 2232 0 16:55 ? 00:00:00 /usr/bin/perl /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1512514510.perfdata.host
nagios 2078 29129 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisebus02p.snapon.com -u -p 5668 -t 100 -c check_zone_cpu -a -n lisebus02p -w 80 -c 90
nagios 2086 29131 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisprod10g.snapon.com -u -p 5668 -t 70 -c check_services -a sendmail 2 approx
nagios 2087 29125 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisdbms11p.snapon.com -u -p 5668 -t 70 -c check_init_service -a svc:/system/filesystem/autofs:default
root 2094 2779 0 16:55 pts/0 00:00:00 ps -ef --cols=300
nagios 2095 29126 0 16:55 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisprod05g.snapon.com -p 5668 -t 70 -c check_cpu_latency -a -w 80 -c 90
root 2121 1 0 Nov09 ? 00:02:34 sendmail: accepting connections
smmsp 2138 1 0 Nov09 ? 00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root 2159 1 0 Nov09 ? 00:00:00 /usr/sbin/abrtd
root 2190 1 0 Nov09 ? 00:01:00 abrt-dump-oops -d /var/spool/abrt -rwx /var/log/messages
root 2219 1 0 Nov09 ? 00:04:46 crond
nagios 2232 1 0 Nov09 ? 00:02:17 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
root 2246 1 0 Nov09 ? 00:00:00 /usr/sbin/atd
postgres 2287 2069 0 Nov09 ? 00:00:53 postgres: logger process
postgres 2289 2069 0 Nov09 ? 00:06:53 postgres: writer process
postgres 2290 2069 0 Nov09 ? 00:04:24 postgres: wal writer process
postgres 2291 2069 0 Nov09 ? 00:01:44 postgres: autovacuum launcher process
postgres 2292 2069 0 Nov09 ? 00:11:37 postgres: stats collector process
ajaxterm 2293 1 0 Nov09 ? 00:09:21 python /usr/share/ajaxterm/ajaxterm.py --daemon --port=8022 --uid=ajaxterm
root 2374 1851 0 Nov09 ? 00:00:14 sshd: root@pts/0
root 2776 1 0 Nov09 tty1 00:00:00 /sbin/mingetty /dev/tty1
root 2778 1 0 Nov09 tty2 00:00:00 /sbin/mingetty /dev/tty2
root 2779 2374 0 Nov09 pts/0 00:00:02 -bash
root 2783 1 0 Nov09 tty3 00:00:00 /sbin/mingetty /dev/tty3
root 2787 1 0 Nov09 tty4 00:00:00 /sbin/mingetty /dev/tty4
root 2790 1 0 Nov09 tty5 00:00:00 /sbin/mingetty /dev/tty5
root 2792 1 0 Nov09 tty6 00:00:00 /sbin/mingetty /dev/tty6
ntp 2902 1 0 Nov09 ? 00:00:19 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
apache 2990 24325 7 16:12 ? 00:03:08 /usr/sbin/httpd
postgres 3002 2069 0 16:12 ? 00:00:07 postgres: nagiosxi nagiosxi ::1(55182) idle
apache 5178 24325 6 15:40 ? 00:05:03 /usr/sbin/httpd
postgres 5276 2069 0 15:40 ? 00:00:11 postgres: nagiosxi nagiosxi ::1(60134) idle
apache 12659 24325 8 16:35 ? 00:01:38 /usr/sbin/httpd
postgres 12679 2069 0 16:35 ? 00:00:03 postgres: nagiosxi nagiosxi ::1(33482) idle
apache 12781 24325 6 15:57 ? 00:03:59 /usr/sbin/httpd
postgres 12789 2069 0 15:57 ? 00:00:09 postgres: nagiosxi nagiosxi ::1(36776) idle
apache 13431 24325 6 15:36 ? 00:05:22 /usr/sbin/httpd
postgres 13458 2069 0 15:36 ? 00:00:12 postgres: nagiosxi nagiosxi ::1(39648) idle
apache 14080 24325 6 15:41 ? 00:05:02 /usr/sbin/httpd
postgres 14102 2069 0 15:41 ? 00:00:11 postgres: nagiosxi nagiosxi ::1(39668) idle
apache 15268 24325 7 16:14 ? 00:02:55 /usr/sbin/httpd
postgres 15323 2069 0 16:14 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(37496) SELECT
root 15448 1851 0 Nov27 ? 00:00:04 sshd: root@notty
root 17580 15448 0 Nov27 ? 00:00:00 /usr/libexec/openssh/sftp-server
apache 17720 24325 8 16:52 ? 00:00:14 /usr/sbin/httpd
root 17860 1 0 Dec01 ? 00:06:23 /usr/bin/perl /usr/sbin/snmptt --daemon
postgres 17877 2069 0 16:52 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(37480) idle
apache 18344 24325 7 16:14 ? 00:02:54 /usr/sbin/httpd
postgres 18365 2069 0 16:14 ? 00:00:06 postgres: nagiosxi nagiosxi ::1(40124) idle
root 18735 1 0 Nov10 pts/0 00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mysqld/mysqld.pid --basedir=/usr --user=mysql
mysql 18840 18735 22 Nov10 pts/0 5-16:40:07 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
nagios 18971 1 0 Nov10 ? 00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
apache 23378 24325 6 15:21 ? 00:06:03 /usr/sbin/httpd
postgres 23402 2069 0 15:21 ? 00:00:13 postgres: nagiosxi nagiosxi ::1(50274) idle
root 24325 1 0 Nov27 ? 00:00:34 /usr/sbin/httpd
apache 24756 24325 6 15:32 ? 00:05:31 /usr/sbin/httpd
postgres 24826 2069 0 15:32 ? 00:00:12 postgres: nagiosxi nagiosxi ::1(49858) idle
nagios 29122 1 2 08:28 ? 00:14:14 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 29125 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29126 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29127 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29128 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29129 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29130 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29131 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29132 29122 0 08:28 ? 00:00:37 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29133 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29134 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29135 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29136 29122 0 08:28 ? 00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 29144 18971 0 08:28 ? 00:02:08 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 29145 29144 12 08:28 ? 01:03:57 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 29243 29122 0 08:28 ? 00:00:00 [nagios] <defunct>
apache 29484 24325 6 15:22 ? 00:06:01 /usr/sbin/httpd
postgres 29549 2069 0 15:22 ? 00:00:13 postgres: nagiosxi nagiosxi ::1(55384) idle
apache 29820 24325 6 14:39 ? 00:08:08 /usr/sbin/httpd
postgres 29855 2069 0 14:39 ? 00:00:18 postgres: nagiosxi nagiosxi ::1(33082) idle
root 30171 1 0 08:28 ? 00:00:32 /usr/bin/perl /usr/sbin/snmptt --daemon
apache 31886 24325 6 15:39 ? 00:05:08 /usr/sbin/httpd
postgres 31906 2069 0 15:39 ? 00:00:11 postgres: nagiosxi nagiosxi ::1(54986) idle
root 32168 2 0 Nov28 ? 00:00:00 [bluetooth]
root 32181 691 0 Nov28 ? 00:00:00 /sbin/udevd -d
root 32182 691 0 Nov28 ? 00:00:00 /sbin/udevd -d
apache 32765 24325 8 16:33 ? 00:01:48 /usr/sbin/httpd
[root@lisl-ngos-01-pv snmptt]# ls -al /usr/local/nagios/var/
total 164392
drwxrwxr-x. 6 nagios nagios 4096 Dec 5 16:55 .
drwxr-xr-x. 10 root root 4096 May 13 2014 ..
drwxrwxr-x. 2 nagios nagios 77824 Dec 4 23:59 archives
-rw-r--r--. 1 apache apache 50013986 May 11 2015 graphapi.log
-rw-r--r--. 1 nagios nagios 7978 Apr 25 2017 host-perfdata
-rw-r--r--. 1 nagios nagios 22137 Dec 5 10:20 nagios.configtest
-rw-r--r--. 1 nagios nagios 6 Dec 5 08:28 nagios.lock
-rw-rw-r--. 1 nagios nagios 3086564 Dec 5 16:55 nagios.log
-rw-rw-r--. 1 nagios users 19380236 Jan 15 2015 nagios.tmp1QpCnR
-rw-------. 1 nagios nagios 1323008 Jan 15 2015 nagios.tmpcY0Kxp
-rw-r--r--. 1 nagios nagios 6 Nov 10 09:34 ndo2db.lock
-rw-r--r--. 1 nagios nagios 0 Dec 5 08:28 ndomod.tmp
srwxr-xr-x. 1 nagios nagios 0 Nov 10 09:34 ndo.sock
-rw-r--r--. 1 nagios nagios 9385374 Nov 10 08:43 npcd.log
-rw-r--r--. 1 nagios nagios 10485817 Oct 1 2013 npcd.log.old
-rw-r--r--. 1 nagios nagios 13069207 Apr 25 2017 objects.cache
-rw-r--r--. 1 nagios nagios 13624873 Dec 5 10:20 objects.precache
-rw-rw-rw-. 1 nagios nagios 5524925 Nov 10 07:29 perfdata.log
-rw-------. 1 nagios nagios 20776831 Dec 5 16:28 retention.dat
-rw-------. 1 root root 21174331 Nov 9 12:03 retention.dat.2017.11.9
drwxrwsr-x. 2 nagios nagcmd 4096 Dec 5 08:28 rw
-rw-r--r--. 1 nagios nagios 230597 Apr 25 2017 service-perfdata
drwxr-xr-x. 5 root root 4096 Nov 13 2013 spool
drwxr-xr-x. 2 nagios nagios 4096 Dec 5 16:55 stats
[root@lisl-ngos-01-pv snmptt]# ls -al /usr/local/nagios/var/rw/
total 100
drwxrwsr-x. 2 nagios nagcmd 4096 Dec 5 08:28 .
drwxrwxr-x. 6 nagios nagios 4096 Dec 5 16:55 ..
-rw-rw-r--. 1 root nagcmd 266 Dec 5 16:55 nagios.cmd
srw-rw----. 1 nagios nagcmd 0 Dec 5 08:28 nagios.qh
-rw-rw-r--. 1 nagios nagcmd 83082 Jul 27 19:12 nsca.dump
- kyang
Re: Can't acknowledge alerts again
Were there any errors in the apache logs? Or any other relevant log errors you've seen?
How often has this been happening? Did it start after a major upgrade, or does it just happen out of the blue?
- snapon_admin
Re: Can't acknowledge alerts again
No, I don't see anything in the Apache error logs, and I'm not sure what other logs I should be checking. This happens fairly often. It's hard to pin down a pattern, but it has been an ongoing issue for some time. Typically, killing Nagios and restarting it, or applying a new config, fixes the issue, but not this time. Even when that works, it is fairly annoying: I'm the only Nagios admin here, so if operations can't acknowledge alerts at 2 in the morning, I get a phone call because I'm the only one who can fix it. I don't know what causes it, but I'd really like to understand why this happens so often and how to prevent it. This was not after a major upgrade, no.
Re: Can't acknowledge alerts again
Hello,
It seems you are experiencing issues similar to the ones I sometimes see. Restarting the nagios service helps for a while. See thread https://support.nagios.com/forum/viewto ... 16&t=44467
Did you try the suggestions in https://support.nagios.com/kb/article/n ... d-139.html yet?
Grtz
Willem
Nagios XI 5.8.1
https://outsideit.net
- snapon_admin
Re: Can't acknowledge alerts again
I have not tried that yet, I'll take a look. Yeah, it definitely seems like we're experiencing the same issue here, and it's one of the most annoying issues I think I've experienced using Nagios. I'll take a look at that page you linked, thanks Willem.
- snapon_admin
Re: Can't acknowledge alerts again
I just looked and the values on those conf files are already higher on my server so that's not it.
Code: Select all
[root@lisl-ngos-01-pv httpd]# grep 'kernel.msgmnb' /etc/sysctl.conf
grep 'kernel.msgmax' /etc/sysctl.conf
kernel.msgmnb = 524288000
[root@lisl-ngos-01-pv httpd]# grep 'kernel.msgmax' /etc/sysctl.conf
kernel.msgmax = 524288000
[root@lisl-ngos-01-pv httpd]# grep 'kernel.msgmni' /etc/sysctl.conf
kernel.msgmni = 512000
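The greps above show what is written in /etc/sysctl.conf. Values in that file only take effect once they are loaded (at boot or with `sysctl -p`), so it may be worth confirming what the running kernel reports. The following is a minimal sketch that prints both side by side; show_msg_limits is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print configured vs. live values for the message queue limits checked above.
# show_msg_limits is a hypothetical helper name; the default file is /etc/sysctl.conf.
show_msg_limits() {
    conf="${1:-/etc/sysctl.conf}"
    for key in kernel.msgmnb kernel.msgmax kernel.msgmni; do
        # Last matching "key = value" line in the config file, whitespace stripped
        configured=$(awk -F= -v k="$key" '
            { name = $1; gsub(/[ \t]/, "", name) }
            name == k { val = $2; gsub(/[ \t]/, "", val); found = val }
            END { print found }' "$conf")
        # Value the running kernel reports; may be unavailable in some environments
        live=$(sysctl -n "$key" 2>/dev/null || echo "unavailable")
        echo "$key configured=${configured:-unset} live=$live"
    done
}
```

If a `live` value differs from the `configured` one, the file has been edited but not loaded.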
- dwhitfield
- Former Nagios Staff
- Posts: 4583
- Joined: Wed Sep 21, 2016 10:29 am
- Location: NoLo, Minneapolis, MN
- Contact:
Re: Can't acknowledge alerts again
Are you able to acknowledge in the core UI, or not at all?
Is the following still essentially accurate?
Total Hosts: 1219
Total Services: 13344
Are most of your check intervals still 5 minutes?
In your ndo2db.cfg, the old profile I have shows debug_level set to 0. Please set that to -1, restart ndo2db, let it run for a bit, and then send /usr/local/nagios/var/ndo2db.debug
Depending on what sort of log rotation and space you have, you may want to turn debugging off again afterwards. The debug file shouldn't get bigger than max_debug_file_size=1000000.
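The change described above can be sketched as a small shell helper. The config path in the example comes from the ndo2db command line in the ps output earlier in the thread (-c /usr/local/nagios/etc/ndo2db.cfg). The helper name set_debug_level and the restart command are assumptions.

```shell
#!/bin/sh
# Sketch: set debug_level in an ndo2db-style config file, keeping a backup copy.
# set_debug_level is a hypothetical helper name, not part of Nagios.
set_debug_level() {
    cfg="$1"
    level="$2"
    # Keep a copy of the original so the change is easy to revert
    cp -p "$cfg" "$cfg.bak" || return 1
    # Rewrite the existing file from the backup so ownership and mode are unchanged
    sed "s/^[[:space:]]*debug_level[[:space:]]*=.*/debug_level=$level/" "$cfg.bak" > "$cfg"
}

# Example (not run here):
#   set_debug_level /usr/local/nagios/etc/ndo2db.cfg -1
#   service ndo2db restart   # assumed init script name on this system
```

Calling the helper again with 0 turns debugging back off once the debug file has been collected.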
It might be useful to get a profile. You can download it by going to Admin > System Config > System Profile and clicking the Download Profile button towards the top. If for whatever reason you cannot download the profile, please put the output of View System Info (5.3.4+, Show Profile if older) in the thread (that will at least get us some info). This will give us access to many of the logs we would otherwise ask for individually. If security is a concern, you can unzip the profile, take out what you like, and then zip it up again. We may end up needing something you remove, but we can ask for that specifically.
You can also generate a profile manually using the script at /usr/local/nagiosxi/html/includes/components/profile/getprofile.sh
That should generate a profile in /usr/local/nagiosxi/var/components/ which you can get off the server with an application such as FileZilla.
After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.
If you get an error that PROFILE BUILD FAILED, please see https://support.nagios.com/kb/article.p ... ategory=44
- snapon_admin
Re: Can't acknowledge alerts again
Can't acknowledge at all. Yes, the numbers are pretty similar: 1225 hosts and 13370 services now. Where is that config file located? The profile is too big (2.5M) to attach to a PM.