Can't acknowledge alerts again

snapon_admin
Posts: 952
Joined: Mon Jun 10, 2013 10:39 am
Location: Kenosha, WI

Can't acknowledge alerts again

Post by snapon_admin »

I've had this issue before where no one can acknowledge alerts (https://support.nagios.com/forum/viewto ... cknowledge). I've tried the things I did in that thread to correct the issue, but this time they did not fix it. Anything else I can try? Also, is there a way to prevent this from happening? It seems like it happens fairly frequently to us.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Can't acknowledge alerts again

Post by tgriep »

Do you see any errors in the Apache error log when you try to acknowledge an alert?
Can you run the following as root on the Nagios server and post the output?

Code:

ps -ef --cols=300
ls -al /usr/local/nagios/var/
ls -al /usr/local/nagios/var/rw/
Thanks
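
For the Apache log question, a minimal sketch of pulling recent error lines is shown here. The log path `/var/log/httpd/error_log` is an assumption (the usual RHEL/CentOS default) and is not stated in this thread, and `recent_errors` is a hypothetical helper name; adjust both for your install.

```shell
#!/bin/sh
# Print the last N lines of a log that mention "error" (case-insensitive).
# recent_errors is a hypothetical helper, not part of Nagios XI or Apache.
recent_errors() {
    log="$1"
    count="${2:-20}"
    grep -i 'error' "$log" | tail -n "$count"
}

# Assumed default path; override with LOG=/path/to/error_log if it differs.
LOG="${LOG:-/var/log/httpd/error_log}"
if [ -r "$LOG" ]; then
    recent_errors "$LOG" 20
else
    echo "cannot read $LOG" >&2
fi
```

Running this right after a failed acknowledge attempt keeps the output short enough to post.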
snapon_admin
Posts: 952
Joined: Mon Jun 10, 2013 10:39 am
Location: Kenosha, WI

Re: Can't acknowledge alerts again

Post by snapon_admin »

Code:

[root@lisl-ngos-01-pv snmptt]# ps -ef --cols=300
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 Nov09 ?        00:10:07 /sbin/init
root         2     0  0 Nov09 ?        00:00:00 [kthreadd]
root         3     2  0 Nov09 ?        00:11:16 [migration/0]
root         4     2  0 Nov09 ?        00:04:26 [ksoftirqd/0]
root         5     2  0 Nov09 ?        00:00:00 [stopper/0]
root         6     2  0 Nov09 ?        00:00:34 [watchdog/0]
root         7     2  0 Nov09 ?        00:05:44 [migration/1]
root         8     2  0 Nov09 ?        00:00:00 [stopper/1]
root         9     2  0 Nov09 ?        00:01:29 [ksoftirqd/1]
root        10     2  0 Nov09 ?        00:00:03 [watchdog/1]
root        11     2  0 Nov09 ?        00:05:17 [migration/2]
root        12     2  0 Nov09 ?        00:00:00 [stopper/2]
root        13     2  0 Nov09 ?        00:03:07 [ksoftirqd/2]
root        14     2  0 Nov09 ?        00:00:04 [watchdog/2]
root        15     2  0 Nov09 ?        00:05:01 [migration/3]
root        16     2  0 Nov09 ?        00:00:00 [stopper/3]
root        17     2  0 Nov09 ?        00:01:32 [ksoftirqd/3]
root        18     2  0 Nov09 ?        00:00:13 [watchdog/3]
root        19     2  0 Nov09 ?        00:05:05 [migration/4]
root        20     2  0 Nov09 ?        00:00:00 [stopper/4]
root        21     2  0 Nov09 ?        00:03:30 [ksoftirqd/4]
root        22     2  0 Nov09 ?        00:00:08 [watchdog/4]
root        23     2  0 Nov09 ?        00:05:46 [migration/5]
root        24     2  0 Nov09 ?        00:00:00 [stopper/5]
root        25     2  0 Nov09 ?        00:01:35 [ksoftirqd/5]
root        26     2  0 Nov09 ?        00:00:21 [watchdog/5]
root        27     2  0 Nov09 ?        00:05:23 [migration/6]
root        28     2  0 Nov09 ?        00:00:00 [stopper/6]
root        29     2  0 Nov09 ?        00:03:19 [ksoftirqd/6]
root        30     2  0 Nov09 ?        00:00:15 [watchdog/6]
root        31     2  0 Nov09 ?        00:06:01 [migration/7]
root        32     2  0 Nov09 ?        00:00:00 [stopper/7]
root        33     2  0 Nov09 ?        00:01:31 [ksoftirqd/7]
root        34     2  0 Nov09 ?        00:00:02 [watchdog/7]
root        35     2  0 Nov09 ?        00:04:21 [events/0]
root        36     2  0 Nov09 ?        00:01:36 [events/1]
root        37     2  0 Nov09 ?        00:01:44 [events/2]
root        38     2  0 Nov09 ?        00:01:29 [events/3]
root        39     2  0 Nov09 ?        00:02:00 [events/4]
root        40     2  0 Nov09 ?        00:01:46 [events/5]
root        41     2  0 Nov09 ?        00:02:39 [events/6]
root        42     2  0 Nov09 ?        00:04:00 [events/7]
root        43     2  0 Nov09 ?        00:00:00 [events/0]
root        44     2  0 Nov09 ?        00:00:00 [events/1]
root        45     2  0 Nov09 ?        00:00:00 [events/2]
root        46     2  0 Nov09 ?        00:00:00 [events/3]
root        47     2  0 Nov09 ?        00:00:00 [events/4]
root        48     2  0 Nov09 ?        00:00:00 [events/5]
root        49     2  0 Nov09 ?        00:00:00 [events/6]
root        50     2  0 Nov09 ?        00:00:00 [events/7]
root        51     2  0 Nov09 ?        00:00:00 [events_long/0]
root        52     2  0 Nov09 ?        00:00:00 [events_long/1]
root        53     2  0 Nov09 ?        00:00:00 [events_long/2]
root        54     2  0 Nov09 ?        00:00:00 [events_long/3]
root        55     2  0 Nov09 ?        00:00:00 [events_long/4]
root        56     2  0 Nov09 ?        00:00:00 [events_long/5]
root        57     2  0 Nov09 ?        00:00:00 [events_long/6]
root        58     2  0 Nov09 ?        00:00:00 [events_long/7]
root        59     2  0 Nov09 ?        00:00:00 [events_power_ef]
root        60     2  0 Nov09 ?        00:00:00 [events_power_ef]
root        61     2  0 Nov09 ?        00:00:00 [events_power_ef]
root        62     2  0 Nov09 ?        00:00:00 [events_power_ef]
root        63     2  0 Nov09 ?        00:00:00 [events_power_ef]
root        64     2  0 Nov09 ?        00:00:00 [events_power_ef]
root        65     2  0 Nov09 ?        00:00:00 [events_power_ef]
root        66     2  0 Nov09 ?        00:00:00 [events_power_ef]
root        67     2  0 Nov09 ?        00:00:00 [cgroup]
root        68     2  0 Nov09 ?        00:00:00 [khelper]
root        69     2  0 Nov09 ?        00:00:00 [netns]
root        70     2  0 Nov09 ?        00:00:00 [async/mgr]
root        71     2  0 Nov09 ?        00:00:00 [pm]
root        72     2  0 Nov09 ?        00:00:08 [sync_supers]
root        73     2  0 Nov09 ?        00:00:00 [bdi-default]
root        74     2  0 Nov09 ?        00:00:00 [kintegrityd/0]
root        75     2  0 Nov09 ?        00:00:00 [kintegrityd/1]
root        76     2  0 Nov09 ?        00:00:00 [kintegrityd/2]
root        77     2  0 Nov09 ?        00:00:00 [kintegrityd/3]
root        78     2  0 Nov09 ?        00:00:00 [kintegrityd/4]
root        79     2  0 Nov09 ?        00:00:00 [kintegrityd/5]
root        80     2  0 Nov09 ?        00:00:00 [kintegrityd/6]
root        81     2  0 Nov09 ?        00:00:00 [kintegrityd/7]
root        82     2  0 Nov09 ?        00:05:24 [kblockd/0]
root        83     2  0 Nov09 ?        00:00:18 [kblockd/1]
root        84     2  0 Nov09 ?        00:06:08 [kblockd/2]
root        85     2  0 Nov09 ?        00:00:18 [kblockd/3]
root        86     2  0 Nov09 ?        00:06:08 [kblockd/4]
root        87     2  0 Nov09 ?        00:00:16 [kblockd/5]
root        88     2  0 Nov09 ?        00:06:01 [kblockd/6]
root        89     2  0 Nov09 ?        00:00:14 [kblockd/7]
root        90     2  0 Nov09 ?        00:00:00 [kacpid]
root        91     2  0 Nov09 ?        00:00:00 [kacpi_notify]
root        92     2  0 Nov09 ?        00:00:00 [kacpi_hotplug]
root        93     2  0 Nov09 ?        00:00:00 [ata_aux]
root        94     2  0 Nov09 ?        00:00:00 [ata_sff/0]
root        95     2  0 Nov09 ?        00:00:00 [ata_sff/1]
root        96     2  0 Nov09 ?        00:00:00 [ata_sff/2]
root        97     2  0 Nov09 ?        00:00:00 [ata_sff/3]
root        98     2  0 Nov09 ?        00:00:00 [ata_sff/4]
root        99     2  0 Nov09 ?        00:00:00 [ata_sff/5]
root       100     2  0 Nov09 ?        00:00:00 [ata_sff/6]
root       101     2  0 Nov09 ?        00:00:00 [ata_sff/7]
root       102     2  0 Nov09 ?        00:00:00 [ksuspend_usbd]
root       103     2  0 Nov09 ?        00:00:00 [khubd]
root       104     2  0 Nov09 ?        00:00:00 [kseriod]
root       105     2  0 Nov09 ?        00:00:00 [md/0]
root       106     2  0 Nov09 ?        00:00:00 [md/1]
root       107     2  0 Nov09 ?        00:00:00 [md/2]
root       108     2  0 Nov09 ?        00:00:00 [md/3]
root       109     2  0 Nov09 ?        00:00:00 [md/4]
root       110     2  0 Nov09 ?        00:00:00 [md/5]
root       111     2  0 Nov09 ?        00:00:00 [md/6]
root       112     2  0 Nov09 ?        00:00:00 [md/7]
root       113     2  0 Nov09 ?        00:00:00 [md_misc/0]
root       114     2  0 Nov09 ?        00:00:00 [md_misc/1]
root       115     2  0 Nov09 ?        00:00:00 [md_misc/2]
root       116     2  0 Nov09 ?        00:00:00 [md_misc/3]
root       117     2  0 Nov09 ?        00:00:00 [md_misc/4]
root       118     2  0 Nov09 ?        00:00:00 [md_misc/5]
root       119     2  0 Nov09 ?        00:00:00 [md_misc/6]
root       120     2  0 Nov09 ?        00:00:00 [md_misc/7]
root       121     2  0 Nov09 ?        00:00:00 [linkwatch]
root       123     2  0 Nov09 ?        00:00:01 [khungtaskd]
root       124     2  0 Nov09 ?        00:10:01 [kswapd0]
root       125     2  0 Nov09 ?        00:00:00 [ksmd]
root       126     2  0 Nov09 ?        00:05:48 [khugepaged]
root       127     2  0 Nov09 ?        00:00:00 [aio/0]
root       128     2  0 Nov09 ?        00:00:00 [aio/1]
root       129     2  0 Nov09 ?        00:00:00 [aio/2]
root       130     2  0 Nov09 ?        00:00:00 [aio/3]
root       131     2  0 Nov09 ?        00:00:00 [aio/4]
root       132     2  0 Nov09 ?        00:00:00 [aio/5]
root       133     2  0 Nov09 ?        00:00:00 [aio/6]
root       134     2  0 Nov09 ?        00:00:00 [aio/7]
root       135     2  0 Nov09 ?        00:00:00 [crypto/0]
root       136     2  0 Nov09 ?        00:00:00 [crypto/1]
root       137     2  0 Nov09 ?        00:00:00 [crypto/2]
root       138     2  0 Nov09 ?        00:00:00 [crypto/3]
root       139     2  0 Nov09 ?        00:00:00 [crypto/4]
root       140     2  0 Nov09 ?        00:00:00 [crypto/5]
root       141     2  0 Nov09 ?        00:00:00 [crypto/6]
root       142     2  0 Nov09 ?        00:00:00 [crypto/7]
root       149     2  0 Nov09 ?        00:00:00 [kthrotld/0]
root       150     2  0 Nov09 ?        00:00:00 [kthrotld/1]
root       151     2  0 Nov09 ?        00:00:00 [kthrotld/2]
root       152     2  0 Nov09 ?        00:00:00 [kthrotld/3]
root       153     2  0 Nov09 ?        00:00:00 [kthrotld/4]
root       154     2  0 Nov09 ?        00:00:00 [kthrotld/5]
root       155     2  0 Nov09 ?        00:00:00 [kthrotld/6]
root       156     2  0 Nov09 ?        00:00:00 [kthrotld/7]
root       157     2  0 Nov09 ?        00:00:00 [pciehpd]
root       159     2  0 Nov09 ?        00:00:00 [kpsmoused]
root       160     2  0 Nov09 ?        00:00:00 [usbhid_resumer]
root       161     2  0 Nov09 ?        00:00:00 [deferwq]
root       194     2  0 Nov09 ?        00:00:00 [kdmremove]
root       195     2  0 Nov09 ?        00:00:00 [kstriped]
root       226     2  0 Nov09 ?        00:00:00 [ttm_swap]
postgres   315  2069  0 16:33 ?        00:00:04 postgres: nagiosxi nagiosxi ::1(51316) idle       
root       407     2  0 Nov09 ?        00:00:00 [scsi_eh_0]
root       408     2  0 Nov09 ?        00:00:00 [scsi_eh_1]
apache     411 24325  8 16:33 ?        00:01:48 /usr/sbin/httpd
root       497     2  0 Nov09 ?        00:00:48 [mpt_poll_0]
root       498     2  0 Nov09 ?        00:00:00 [mpt/0]
root       499     2  0 Nov09 ?        00:00:00 [scsi_eh_2]
postgres   564  2069  0 16:33 ?        00:00:03 postgres: nagiosxi nagiosxi ::1(51572) idle       
root       568     2  0 Nov09 ?        00:00:00 [kdmflush]
root       569     2  0 Nov09 ?        00:00:00 [kdmflush]
root       588     2  0 Nov09 ?        00:32:41 [jbd2/dm-1-8]
root       589     2  0 Nov09 ?        00:00:00 [ext4-dio-unwrit]
root       655  2219  0 16:55 ?        00:00:00 CROND
root       660  2219  0 16:55 ?        00:00:00 CROND
root       661  2219  0 16:55 ?        00:00:00 CROND
root       662  2219  0 16:55 ?        00:00:00 CROND
root       663  2219  0 16:55 ?        00:00:00 CROND
root       664  2219  0 16:55 ?        00:00:00 CROND
root       665  2219  0 16:55 ?        00:00:00 CROND
nagios     684   662  0 16:55 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/event_handler.php >> /usr/local/nagiosxi/var/event_handler.log 2>&1
nagios     688   660  0 16:55 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php >> /usr/local/nagiosxi/var/perfdataproc.log 2>&1
root       689   655  9 16:55 ?        00:00:01 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
root       691     1  0 Nov09 ?        00:00:00 /sbin/udevd -d
nagios     692   661  0 16:55 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php >> /usr/local/nagiosxi/var/feedproc.log 2>&1
nagios     694   663  0 16:55 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php >> /usr/local/nagiosxi/var/eventman.log 2>&1
nagios     695   665  0 16:55 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php >> /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios     696   664  0 16:55 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php >> /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
root       924     2  0 Nov09 ?        00:00:44 [vmmemctl]
nagios    1063 29133  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_icmp -H 10.146.146.254 -w 3000.0 80  -c 5000.0 100  -p 5
root      1224     2  0 Nov09 ?        00:02:46 [kauditd]
nagios    1274 29134  1 16:55 ?        00:00:00 /usr/bin/perl /usr/local/nagios/libexec/check_openmanage.pl -s -t 90 -H 10.6.33.30 -C openviewm -b bp=0
nagios    1276 29129  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.245.128.140 -C ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_CPU_Info --server 10.245.128.130 --host lisl-vesx-06-pp.snaponglobal.com
nagios    1289 29131  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H kenprod05g.snapon.com -p 5668 -t 70 -c check_total_zpool_io -a -w 80 -c 90
nagios    1294  1276  0 16:55 ?        00:00:00 /usr/bin/ssh -l vi-admin 10.245.128.140 ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_CPU_Info --server 10.245.128.130 --host lisl-vesx-06-pp.snaponglobal.com
root      1390     2  0 Nov09 ?        02:08:36 [flush-253:1]
root      1407     1  0 Nov22 ?        00:00:00 python /usr/local/bin/snmptraphandling.py 10.73.19.2 SNMP Traps Normal 1511402203 enterprises.9.2.1.5.0 ():10.73.104.59 enterprises.9.9.412.1.1.1.0 ():1 enterprises.9.9.412.1.1.2.0 ():10.73.104.59 An authenticationFailure trap signifies that the SNMP 1
nagios    1457 29131  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.245.128.140 -C ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_Status --server 10.245.128.130 --host lisl-vesx-17-pp.snaponglobal.com
nagios    1458  1457  0 16:55 ?        00:00:00 /usr/bin/ssh -l vi-admin 10.245.128.140 ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_Status --server 10.245.128.130 --host lisl-vesx-17-pp.snaponglobal.com
root      1468   689  5 16:55 ?        00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
root      1469   689  0 16:55 ?        00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
root      1470   689  1 16:55 ?        00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
root      1471   689 12 16:55 ?        00:00:00 /usr/bin/perl -w /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lib/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
smmsp     1473   655  0 16:55 ?        00:00:00 /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t -f root
nagios    1475 29129  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisdbms14p.snapon.com -u -p 5668 -t 100 -c check_zone_mem -a -n lisdbms14p -m 100 -w 80 -c 90
nagios    1498 29134  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lishadb26p.snapon.com -u -p 5668 -t 100 -c check_zone_cpu -a -n lishadb26p -w 80 -c 90
root      1501     1  0 Nov09 ?        00:02:53 auditd
root      1523     1  0 Nov09 ?        00:06:13 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
root      1559     1  0 Nov09 ?        00:01:43 irqbalance --pid=/var/run/irqbalance.pid
dbus      1578     1  0 Nov09 ?        00:00:01 dbus-daemon --system
root      1626     1  0 Nov09 ?        00:00:00 /usr/sbin/acpid
68        1638     1  0 Nov09 ?        00:00:11 hald
root      1639  1638  0 Nov09 ?        00:00:00 hald-runner
nagios    1643 29127  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisdbqy01p.snapon.com -u -p 5668 -t 100 -c check_zone_cpu -a -n lisdbqy01p -w 80 -c 90
root      1671  1639  0 Nov09 ?        00:00:00 hald-addon-input: Listening on /dev/input/event2 /dev/input/event0
68        1681  1639  0 Nov09 ?        00:00:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
nagios    1684 29136  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H kenprod05g.snapon.com -u -p 5668 -t 70 -c check_swap -a -w 50 -c 40
nagios    1693   694  5 16:55 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
postgres  1698  2069  0 16:55 ?        00:00:00 postgres: nagiosxi nagiosxi ::1(51502) idle       
nagios    1715 29128  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisprod14g.snapon.com -p 5668 -t 70 -c check_total_zpool_io -a -w 80 -c 90
nagios    1724 29136  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H kendbdg06p.snapon.com -u -p 5668 -t 100 -c check_zone_mem -a -n kendbdg06p -w 80 -c 90
nagios    1772 29126  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H kendbdg22p.snapon.com -u -p 5668 -t 100 -c check_zone_mem -a -n kendbdg22p -w 80 -c 90
root      1803     1  0 Nov09 ?        00:06:09 /usr/sbin/snmpd -LS0-6d -Lf /dev/null -p /var/run/snmpd.pid
nagios    1807 29130  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisachr03p.snapon.com -u -p 5668 -t 100 -c check_zone_mem -a -n lisachr03p -m 100 -w 80 -c 90
root      1814     1  0 Nov09 ?        00:29:42 /usr/sbin/snmptrapd -Lsd -On -p /var/run/snmptrapd.pid
root      1851     1  0 Nov09 ?        00:01:25 /usr/sbin/sshd
root      1862     1  0 Nov09 ?        00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
nagios    1892 29129  8 16:55 ?        00:00:00 /usr/bin/perl /usr/local/nagios/libexec/check_openmanage.pl -s -t 90 -H 10.9.129.113 -C openviewm -b bp=0
nagios    1893 29131  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisprod05g.snapon.com -u -p 5668 -t 70 -c check_cpu_stats -a -w :80 -c :90
nagios    1942 29127  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.0.18.111 -C ~/box293_check_vmware.pl --concurrent_checks 400 --username NagiosRO --check Host_Storage_Adapter_Info --server 10.0.18.128 --host keno-vesx-19-pp.snaponglobal.com
nagios    1943  1942  1 16:55 ?        00:00:00 /usr/bin/ssh -l vi-admin 10.0.18.111 ~/box293_check_vmware.pl --concurrent_checks 400 --username NagiosRO --check Host_Storage_Adapter_Info --server 10.0.18.128 --host keno-vesx-19-pp.snaponglobal.com
apache    1956 24325  5 14:29 ?        00:08:39 /usr/sbin/httpd
nagios    1962   688  9 16:55 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios    1965   696  9 16:55 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
nagios    1967   684  8 16:55 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/event_handler.php
nagios    1970 29125  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_snmp -H 10.0.18.231 -t 60 -C Sn4ponCudaH -o .1.3.6.1.4.1.20632.5.7
nagios    1971   695 19 16:55 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
nagios    1972   692  9 16:55 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php
nagios    1974  1970  0 16:55 ?        00:00:00 /usr/bin/snmpget -Le -t 60 -r 5 -m  -v 1 -c             10.0.18.231:161 .1.3.6.1.4.1.20632.5.7
nagios    1975 29126  8 16:55 ?        00:00:00 /usr/bin/perl /usr/local/nagios/libexec/check_openmanage.pl -s -t 90 -H 10.9.129.45 -C openviewm -b bp=0
postgres  1988  2069  0 14:29 ?        00:00:20 postgres: nagiosxi nagiosxi ::1(38528) idle       
nagios    1993 29128  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.0.18.111 -C ~/box293_check_vmware.pl --concurrent_checks 400 --username NagiosRO --check Host_Memory_Usage --server 10.0.18.128 --host keno-vesx-20-pp.snaponglobal.com --warning memory
nagios    1995  1993  0 16:55 ?        00:00:00 /usr/bin/ssh -l vi-admin 10.0.18.111 ~/box293_check_vmware.pl --concurrent_checks 400 --username NagiosRO --check Host_Memory_Usage --server 10.0.18.128 --host keno-vesx-20-pp.snaponglobal.com --warning memory_used:409 --critical memory_used:460
postgres  1996  2069  0 16:55 ?        00:00:00 postgres: nagiosxi nagiosxi ::1(51646) idle       
postgres  1999  2069  0 16:55 ?        00:00:00 postgres: nagiosxi nagiosxi ::1(51650) idle       
postgres  2005  2069  0 16:55 ?        00:00:00 postgres: nagiosxi nagiosxi ::1(51654) idle       
postgres  2016  2069  0 16:55 ?        00:00:00 postgres: nagiosxi nagiosxi ::1(51656) idle       
postgres  2018  2069  0 16:55 ?        00:00:00 postgres: nagiosxi nagiosxi ::1(51658) idle       
nagios    2039 29135  0 16:55 ?        00:00:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_nwc_health --hostname 10.6.84.1 --community Sn4p0nF1r3W4LL --mode cpu-load --units % --warning 80 --critical 90
nagios    2042 29136  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_by_ssh -E 1 -t 90 -l vi-admin -H 10.19.129.160 -C ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_vNIC_Status --server 10.19.129.150 --host conw-vesx-03-pp.snaponglobal.com
nagios    2043  2042  0 16:55 ?        00:00:00 /usr/bin/ssh -l vi-admin 10.19.129.160 ~/box293_check_vmware.pl --concurrent_checks 200 --check Host_vNIC_Status --server 10.19.129.150 --host conw-vesx-03-pp.snaponglobal.com
nagios    2056 29131  0 16:55 ?        00:00:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_nwc_health --hostname 10.160.19.2 --community Sn4p0nC0r3s --mode memory-usage --units % --warning 80 --critical 90
nagios    2065 29135  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H kendbdg24p.snapon.com -u -p 5668 -t 70 -c check_init_service -a svc:/network/sendmail-client:default
nagios    2066 29132  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisacic01p.snapon.com -p 5668 -t 70 -c check_cpu_latency -a -w 80 -c 90
nagios    2067 29130  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H kendbdg35p.snapon.com -u -p 5668 -t 70 -c check_init_service -a svc:/network/rpc/gss:default
postgres  2069     1  0 Nov09 ?        00:05:42 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
nagios    2070 29127  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lishadb24p.snapon.com -u -p 5668 -t 70 -c check_init_service -a svc:/network/sendmail-client:default
nagios    2071 29134  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisqpas01p.snapon.com -u -p 5668 -t 100 -c check_zone_mem -a -n lisqpas01p -m 100 -w 80 -c 90
nagios    2072  1971  0 16:55 ?        00:00:00 /bin/sh /etc/init.d/nagios status
nagios    2075  2232  0 16:55 ?        00:00:00 /usr/bin/perl /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1512514510.perfdata.service
nagios    2076  2232  0 16:55 ?        00:00:00 /usr/bin/perl /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1512514510.perfdata.host
nagios    2078 29129  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisebus02p.snapon.com -u -p 5668 -t 100 -c check_zone_cpu -a -n lisebus02p -w 80 -c 90
nagios    2086 29131  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisprod10g.snapon.com -u -p 5668 -t 70 -c check_services -a sendmail 2 approx
nagios    2087 29125  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisdbms11p.snapon.com -u -p 5668 -t 70 -c check_init_service -a svc:/system/filesystem/autofs:default
root      2094  2779  0 16:55 pts/0    00:00:00 ps -ef --cols=300
nagios    2095 29126  0 16:55 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -H lisprod05g.snapon.com -p 5668 -t 70 -c check_cpu_latency -a -w 80 -c 90
root      2121     1  0 Nov09 ?        00:02:34 sendmail: accepting connections
smmsp     2138     1  0 Nov09 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root      2159     1  0 Nov09 ?        00:00:00 /usr/sbin/abrtd
root      2190     1  0 Nov09 ?        00:01:00 abrt-dump-oops -d /var/spool/abrt -rwx /var/log/messages
root      2219     1  0 Nov09 ?        00:04:46 crond
nagios    2232     1  0 Nov09 ?        00:02:17 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
root      2246     1  0 Nov09 ?        00:00:00 /usr/sbin/atd
postgres  2287  2069  0 Nov09 ?        00:00:53 postgres: logger process                          
postgres  2289  2069  0 Nov09 ?        00:06:53 postgres: writer process                          
postgres  2290  2069  0 Nov09 ?        00:04:24 postgres: wal writer process                      
postgres  2291  2069  0 Nov09 ?        00:01:44 postgres: autovacuum launcher process             
postgres  2292  2069  0 Nov09 ?        00:11:37 postgres: stats collector process                 
ajaxterm  2293     1  0 Nov09 ?        00:09:21 python /usr/share/ajaxterm/ajaxterm.py --daemon --port=8022 --uid=ajaxterm
root      2374  1851  0 Nov09 ?        00:00:14 sshd: root@pts/0 
root      2776     1  0 Nov09 tty1     00:00:00 /sbin/mingetty /dev/tty1
root      2778     1  0 Nov09 tty2     00:00:00 /sbin/mingetty /dev/tty2
root      2779  2374  0 Nov09 pts/0    00:00:02 -bash
root      2783     1  0 Nov09 tty3     00:00:00 /sbin/mingetty /dev/tty3
root      2787     1  0 Nov09 tty4     00:00:00 /sbin/mingetty /dev/tty4
root      2790     1  0 Nov09 tty5     00:00:00 /sbin/mingetty /dev/tty5
root      2792     1  0 Nov09 tty6     00:00:00 /sbin/mingetty /dev/tty6
ntp       2902     1  0 Nov09 ?        00:00:19 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
apache    2990 24325  7 16:12 ?        00:03:08 /usr/sbin/httpd
postgres  3002  2069  0 16:12 ?        00:00:07 postgres: nagiosxi nagiosxi ::1(55182) idle       
apache    5178 24325  6 15:40 ?        00:05:03 /usr/sbin/httpd
postgres  5276  2069  0 15:40 ?        00:00:11 postgres: nagiosxi nagiosxi ::1(60134) idle       
apache   12659 24325  8 16:35 ?        00:01:38 /usr/sbin/httpd
postgres 12679  2069  0 16:35 ?        00:00:03 postgres: nagiosxi nagiosxi ::1(33482) idle       
apache   12781 24325  6 15:57 ?        00:03:59 /usr/sbin/httpd
postgres 12789  2069  0 15:57 ?        00:00:09 postgres: nagiosxi nagiosxi ::1(36776) idle       
apache   13431 24325  6 15:36 ?        00:05:22 /usr/sbin/httpd
postgres 13458  2069  0 15:36 ?        00:00:12 postgres: nagiosxi nagiosxi ::1(39648) idle       
apache   14080 24325  6 15:41 ?        00:05:02 /usr/sbin/httpd
postgres 14102  2069  0 15:41 ?        00:00:11 postgres: nagiosxi nagiosxi ::1(39668) idle       
apache   15268 24325  7 16:14 ?        00:02:55 /usr/sbin/httpd
postgres 15323  2069  0 16:14 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(37496) SELECT     
root     15448  1851  0 Nov27 ?        00:00:04 sshd: root@notty 
root     17580 15448  0 Nov27 ?        00:00:00 /usr/libexec/openssh/sftp-server
apache   17720 24325  8 16:52 ?        00:00:14 /usr/sbin/httpd
root     17860     1  0 Dec01 ?        00:06:23 /usr/bin/perl /usr/sbin/snmptt --daemon
postgres 17877  2069  0 16:52 ?        00:00:00 postgres: nagiosxi nagiosxi ::1(37480) idle       
apache   18344 24325  7 16:14 ?        00:02:54 /usr/sbin/httpd
postgres 18365  2069  0 16:14 ?        00:00:06 postgres: nagiosxi nagiosxi ::1(40124) idle       
root     18735     1  0 Nov10 pts/0    00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mysqld/mysqld.pid --basedir=/usr --user=mysql
mysql    18840 18735 22 Nov10 pts/0    5-16:40:07 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
nagios   18971     1  0 Nov10 ?        00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
apache   23378 24325  6 15:21 ?        00:06:03 /usr/sbin/httpd
postgres 23402  2069  0 15:21 ?        00:00:13 postgres: nagiosxi nagiosxi ::1(50274) idle       
root     24325     1  0 Nov27 ?        00:00:34 /usr/sbin/httpd
apache   24756 24325  6 15:32 ?        00:05:31 /usr/sbin/httpd
postgres 24826  2069  0 15:32 ?        00:00:12 postgres: nagiosxi nagiosxi ::1(49858) idle       
nagios   29122     1  2 08:28 ?        00:14:14 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   29125 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29126 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29127 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29128 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29129 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29130 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29131 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29132 29122  0 08:28 ?        00:00:37 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29133 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29134 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29135 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29136 29122  0 08:28 ?        00:00:38 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   29144 18971  0 08:28 ?        00:02:08 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios   29145 29144 12 08:28 ?        01:03:57 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios   29243 29122  0 08:28 ?        00:00:00 [nagios] <defunct>
apache   29484 24325  6 15:22 ?        00:06:01 /usr/sbin/httpd
postgres 29549  2069  0 15:22 ?        00:00:13 postgres: nagiosxi nagiosxi ::1(55384) idle       
apache   29820 24325  6 14:39 ?        00:08:08 /usr/sbin/httpd
postgres 29855  2069  0 14:39 ?        00:00:18 postgres: nagiosxi nagiosxi ::1(33082) idle       
root     30171     1  0 08:28 ?        00:00:32 /usr/bin/perl /usr/sbin/snmptt --daemon
apache   31886 24325  6 15:39 ?        00:05:08 /usr/sbin/httpd
postgres 31906  2069  0 15:39 ?        00:00:11 postgres: nagiosxi nagiosxi ::1(54986) idle       
root     32168     2  0 Nov28 ?        00:00:00 [bluetooth]
root     32181   691  0 Nov28 ?        00:00:00 /sbin/udevd -d
root     32182   691  0 Nov28 ?        00:00:00 /sbin/udevd -d
apache   32765 24325  8 16:33 ?        00:01:48 /usr/sbin/httpd
[root@lisl-ngos-01-pv snmptt]# ls -al /usr/local/nagios/var/
total 164392
drwxrwxr-x.  6 nagios nagios     4096 Dec  5 16:55 .
drwxr-xr-x. 10 root   root       4096 May 13  2014 ..
drwxrwxr-x.  2 nagios nagios    77824 Dec  4 23:59 archives
-rw-r--r--.  1 apache apache 50013986 May 11  2015 graphapi.log
-rw-r--r--.  1 nagios nagios     7978 Apr 25  2017 host-perfdata
-rw-r--r--.  1 nagios nagios    22137 Dec  5 10:20 nagios.configtest
-rw-r--r--.  1 nagios nagios        6 Dec  5 08:28 nagios.lock
-rw-rw-r--.  1 nagios nagios  3086564 Dec  5 16:55 nagios.log
-rw-rw-r--.  1 nagios users  19380236 Jan 15  2015 nagios.tmp1QpCnR
-rw-------.  1 nagios nagios  1323008 Jan 15  2015 nagios.tmpcY0Kxp
-rw-r--r--.  1 nagios nagios        6 Nov 10 09:34 ndo2db.lock
-rw-r--r--.  1 nagios nagios        0 Dec  5 08:28 ndomod.tmp
srwxr-xr-x.  1 nagios nagios        0 Nov 10 09:34 ndo.sock
-rw-r--r--.  1 nagios nagios  9385374 Nov 10 08:43 npcd.log
-rw-r--r--.  1 nagios nagios 10485817 Oct  1  2013 npcd.log.old
-rw-r--r--.  1 nagios nagios 13069207 Apr 25  2017 objects.cache
-rw-r--r--.  1 nagios nagios 13624873 Dec  5 10:20 objects.precache
-rw-rw-rw-.  1 nagios nagios  5524925 Nov 10 07:29 perfdata.log
-rw-------.  1 nagios nagios 20776831 Dec  5 16:28 retention.dat
-rw-------.  1 root   root   21174331 Nov  9 12:03 retention.dat.2017.11.9
drwxrwsr-x.  2 nagios nagcmd     4096 Dec  5 08:28 rw
-rw-r--r--.  1 nagios nagios   230597 Apr 25  2017 service-perfdata
drwxr-xr-x.  5 root   root       4096 Nov 13  2013 spool
drwxr-xr-x.  2 nagios nagios     4096 Dec  5 16:55 stats
[root@lisl-ngos-01-pv snmptt]# ls -al /usr/local/nagios/var/rw/
total 100
drwxrwsr-x. 2 nagios nagcmd  4096 Dec  5 08:28 .
drwxrwxr-x. 6 nagios nagios  4096 Dec  5 16:55 ..
-rw-rw-r--. 1 root   nagcmd   266 Dec  5 16:55 nagios.cmd
srw-rw----. 1 nagios nagcmd     0 Dec  5 08:28 nagios.qh
-rw-rw-r--. 1 nagios nagcmd 83082 Jul 27 19:12 nsca.dump
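One detail in the listing above may be worth a closer look: nagios.cmd shows up as a regular file owned by root (-rw-rw-r--), whereas Nagios Core normally creates its external command file as a named pipe (the mode string would start with "p"). Below is a minimal sketch of a check, assuming the path from the listing. The function name check_cmd_pipe is made up for illustration and is not part of Nagios.

```shell
#!/bin/sh
# Hypothetical helper: report whether the Nagios external command file
# is a named pipe (FIFO). The default path is taken from the listing above.
check_cmd_pipe() {
    cmd_file="${1:-/usr/local/nagios/var/rw/nagios.cmd}"
    if [ -p "$cmd_file" ]; then
        echo "OK: $cmd_file is a named pipe"
        return 0
    elif [ -e "$cmd_file" ]; then
        echo "PROBLEM: $cmd_file exists but is not a named pipe"
        return 1
    else
        echo "PROBLEM: $cmd_file does not exist"
        return 2
    fi
}
```

If this reports a regular file, anything written there (such as an acknowledgement from the web UI) is probably just being appended to that file and never reaching the daemon. This is only a possible explanation and would need to be confirmed on the server.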
kyang

Re: Can't acknowledge alerts again

Post by kyang »

Were there any errors in the Apache logs? Or any other relevant log errors you've seen?

How often has this been happening? Did it start after a major upgrade, or does it just happen out of the blue?
snapon_admin
Posts: 952
Joined: Mon Jun 10, 2013 10:39 am
Location: Kenosha, WI

Re: Can't acknowledge alerts again

Post by snapon_admin »

No, I don't see anything in the Apache error logs, and I'm not sure what other logs I should be checking. This happens fairly often; it's hard to pin down a pattern, but it has been an ongoing issue for some time. Typically, killing Nagios and restarting it or applying a new config fixes the issue, but not this time. Even when that works, this is fairly annoying: I'm the only Nagios admin here, so if operations can't acknowledge alerts at 2 in the morning, I get a phone call because I'm the only one who can fix it. I don't know what causes it, but I'd really like to understand why it happens so often and how to prevent it. And no, this was not after a major upgrade.
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Re: Can't acknowledge alerts again

Post by WillemDH »

Hello,

It seems you're experiencing issues similar to the ones I sometimes see... Restarting the nagios service helps for some time. See thread https://support.nagios.com/forum/viewto ... 16&t=44467

Did you try the suggestions in https://support.nagios.com/kb/article/n ... d-139.html yet?

Grtz

Willem
Nagios XI 5.8.1
https://outsideit.net
snapon_admin
Posts: 952
Joined: Mon Jun 10, 2013 10:39 am
Location: Kenosha, WI

Re: Can't acknowledge alerts again

Post by snapon_admin »

I have not tried that yet, but I'll take a look at the page you linked. Yeah, it definitely seems like we're experiencing the same issue, and it's one of the most annoying ones I've run into using Nagios. Thanks, Willem.
snapon_admin
Posts: 952
Joined: Mon Jun 10, 2013 10:39 am
Location: Kenosha, WI

Re: Can't acknowledge alerts again

Post by snapon_admin »

I just looked, and the values in those conf files are already higher on my server, so that's not it.

Code: Select all

[root@lisl-ngos-01-pv httpd]# grep 'kernel.msgmnb' /etc/sysctl.conf
grep 'kernel.msgmax' /etc/sysctl.conf
kernel.msgmnb = 524288000
[root@lisl-ngos-01-pv httpd]# grep 'kernel.msgmax' /etc/sysctl.conf
kernel.msgmax = 524288000
[root@lisl-ngos-01-pv httpd]# grep 'kernel.msgmni' /etc/sysctl.conf
kernel.msgmni = 512000
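Note that the grep output above only shows what is configured in /etc/sysctl.conf. The running kernel can hold different values if the file was edited without being reloaded. Below is a small sketch for pulling a key's configured value so it can be compared against the live one. The function name conf_value is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the last value assigned to a key in a
# sysctl.conf-style file, ignoring commented-out lines.
conf_value() {
    key="$1"
    file="${2:-/etc/sysctl.conf}"
    awk -F= -v k="$key" '
        /^[ \t]*[#;]/ { next }              # skip comment lines
        {
            name = $1
            gsub(/[ \t]/, "", name)         # strip whitespace around the key
            if (name == k) {
                val = $2
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                last = val                  # a later assignment overrides an earlier one
            }
        }
        END { if (last != "") print last }
    ' "$file"
}
```

For example, the output of conf_value kernel.msgmnb can be compared with the output of sysctl -n kernel.msgmnb. If the two differ, the file has not been applied to the running system.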
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Can't acknowledge alerts again

Post by dwhitfield »

Are you able to acknowledge in the core UI, or not at all?

Is the following still essentially accurate?
Total Hosts: 1219
Total Services: 13344

Are most of your check intervals still 5 minutes?

In your ndo2db.cfg, the old profile I have shows debug_level set to 0. Please set that to -1, restart ndo2db, let it run for a bit, and then send /usr/local/nagios/var/ndo2db.debug

Depending on what sort of log rotation and disk space you have, you may want to turn debugging back off afterwards. The debug file shouldn't get bigger than max_debug_file_size=1000000.
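A minimal sketch of that change follows. It assumes the config lives at /usr/local/nagios/etc/ndo2db.cfg (the path passed to ndo2db with -c in the ps output earlier in the thread) and that the file already contains a debug_level= line. The function name set_debug_level is made up, and the restart command in the comments is an assumption about how the service is managed on this system.

```shell
#!/bin/sh
# Hypothetical helper: set debug_level in an ndo2db.cfg-style file,
# keeping a .bak copy of the original next to it.
set_debug_level() {
    level="$1"
    cfg="${2:-/usr/local/nagios/etc/ndo2db.cfg}"
    cp -p "$cfg" "$cfg.bak" || return 1
    # Rewrite only the line that starts with debug_level=
    sed "s/^debug_level=.*/debug_level=$level/" "$cfg.bak" > "$cfg"
}

# Example (run as root; the restart command is an assumption):
#   set_debug_level -1 && service ndo2db restart
#   ...reproduce the problem, collect ndo2db.debug, then:
#   set_debug_level 0 && service ndo2db restart
```

Writing the result back into the existing file, rather than replacing it, keeps its ownership and permissions as they were.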

It might be useful to get a profile. You can download it by going to Admin > System Config > System Profile and clicking the Download Profile button towards the top. If for whatever reason you cannot download the profile, please put the output of View System Info (5.3.4+, Show Profile if older) in the thread (that will at least get us some info). The profile gives us access to many of the logs we would otherwise ask for individually. If security is a concern, you can unzip the profile, take out what you like, and then zip it up again. We may end up needing something you remove, but we can ask for that specifically.

You can also generate a profile manually using the script at /usr/local/nagiosxi/html/includes/components/profile/getprofile.sh

That should generate a profile in /usr/local/nagiosxi/var/components/ which you can get off the server with an application such as FileZilla.

After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.

If you get an error that PROFILE BUILD FAILED, please see https://support.nagios.com/kb/article.p ... ategory=44
snapon_admin
Posts: 952
Joined: Mon Jun 10, 2013 10:39 am
Location: Kenosha, WI

Re: Can't acknowledge alerts again

Post by snapon_admin »

Can't acknowledge at all. Yeah, pretty similar: 1225 hosts and 13370 services now. Where is that config file located? The profile is too big (2.5M) to attach to a PM.