Can't acknowledge/comment

Posted: Tue Sep 23, 2014 1:57 pm
by snapon_admin
All of a sudden, no one can acknowledge alerts, disable notifications, submit passive check results, or add comments. I have run the repair_databases script and restarted the nagios service as well as httpd. Is there anything else I can look at or restart for this? Any ideas why it's not working?

Re: Can't acknowledge/comment

Posted: Tue Sep 23, 2014 2:57 pm
by lmiltchev
Try acknowledging an alert or disabling notifications, then check the Apache error log for errors:

Code:

tail -50 /var/log/httpd/error_log
Also, run the following commands and show us the output:

Code:

grep check_external /usr/local/nagios/etc/nagios.cfg
ll /usr/local/nagios/var/rw/
ll /usr/local/nagios/var/rw/nagios.cmd
grep nag /etc/group
chage -l nagios
chage -l apache
BTW, have you tried restarting the server?
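
For anyone who wants to script the command pipe part of those checks, here is a minimal sketch. The function name `check_cmd_pipe` is made up for illustration and is not part of Nagios; the default path is the one used in the commands above, and `stat -c` assumes GNU coreutils.

```shell
#!/bin/bash
# Sketch: sanity-check the Nagios command pipe.
# check_cmd_pipe is a hypothetical helper name, not a Nagios tool.
check_cmd_pipe() {
    # Default to the pipe path used elsewhere in this thread.
    local pipe="${1:-/usr/local/nagios/var/rw/nagios.cmd}"

    if [ ! -e "$pipe" ]; then
        echo "MISSING: $pipe does not exist"
        return 1
    fi
    if [ ! -p "$pipe" ]; then
        echo "NOT A PIPE: $pipe exists but is not a named pipe"
        return 2
    fi
    # Print mode, owner and group (GNU stat syntax).
    echo "OK: $pipe $(stat -c '%A %U %G' "$pipe")"
    return 0
}
```

Run `check_cmd_pipe` with no arguments to test the default path, then compare the reported mode, owner, and group with what `ll` shows. It only tells you whether the pipe exists and is a FIFO; it does not prove that Nagios is actually reading from it.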

Re: Can't acknowledge/comment

Posted: Tue Sep 23, 2014 3:12 pm
by snapon_admin
I applied a config in CCM (I had to modify a threshold on something), decided to test it just to see, and it's working again. I'm still curious about what caused the issue in the first place. No, I did not try restarting the server; I like to avoid that whenever possible. Here are those outputs. I ran the commands right after it started working, so I don't know whether they'll help.

Code:

[root@lisl-ngos-01-pv scripts]# tail -50 /var/log/httpd/error_log
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.2 -> 1000 Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.2 -> 154754540
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.3 -> 100 Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.3 -> 138283192
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.4 -> 100 Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.4 -> 60107152
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.5 -> 100 Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.5 -> 17466368
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.6 -> 100 Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.6 -> unknown
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.7 -> 100 Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.7 -> unknown
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.8 -> 100 Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.8 -> unknown
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.9 -> 100 Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.9 -> unknown
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.10 -> 100 Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.10 -> unknown
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.11 -> 1000 Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.11 -> 197912360
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.12 -> unknown Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.12 -> unknown
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.13 -> unknown Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.13 -> unknown
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.14 -> unknown Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.14 -> 80951572
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.15 -> unknown Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.15 -> 39756558
--base: check for HighspeedCounters failed ... Dropping back to V1
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHighSpeed.16 -> unknown Mb/s
--base: snmpget <REDACTED>@10.22.250.1:::::2:v4only for ifHCInOctets.16 -> unknown
--base: check for HighspeedCounters failed ... Dropping back to V1
Use of uninitialized value $dir in concatenation (.) or string at /usr/bin/mrtg line 2658.
Use of uninitialized value $dir in concatenation (.) or string at /usr/bin/mrtg line 2676.
Use of uninitialized value $dir in concatenation (.) or string at /usr/bin/mrtg line 2692.
ERROR: "WorkDir" not specified in mrtg config file
[Tue Sep 23 13:14:14 2014] [notice] caught SIGTERM, shutting down
[Tue Sep 23 13:14:16 2014] [notice] SELinux policy enabled; httpd running as context unconfined_u:system_r:httpd_t:s0
[Tue Sep 23 13:14:16 2014] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Tue Sep 23 13:14:17 2014] [notice] Digest: generating secret for digest authentication ...
[Tue Sep 23 13:14:17 2014] [notice] Digest: done
[Tue Sep 23 13:14:17 2014] [notice] Apache/2.2.15 (Unix) DAV/2 PHP/5.3.3 mod_ssl/2.2.15 OpenSSL/1.0.1e-fips mod_wsgi/3.2 Python/2.6.6 configured -- resuming normal operations

[root@lisl-ngos-01-pv scripts]# grep check_external /usr/local/nagios/etc/nagios.cfg
check_external_commands=1

[root@lisl-ngos-01-pv scripts]# ll /usr/local/nagios/var/rw/
total 0
prw-rw----. 1 nagios nagcmd 0 Sep 23 15:05 nagios.cmd
srw-rw----. 1 nagios nagcmd 0 Sep 23 14:56 nagios.qh
[root@lisl-ngos-01-pv scripts]# ll /usr/local/nagios/var/rw/nagios.cmd
prw-rw----. 1 nagios nagcmd 0 Sep 23 15:05 /usr/local/nagios/var/rw/nagios.cmd

[root@lisl-ngos-01-pv scripts]# grep nag /etc/group
nagios:x:500:nagios,apache
nagcmd:x:501:nagios,apache

[root@lisl-ngos-01-pv scripts]# chage -l nagios
Last password change                                    : Jun 06, 2012
Password expires                                        : never
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 99999
Number of days of warning before password expires       : 7

[root@lisl-ngos-01-pv scripts]# chage -l apache
Last password change                                    : Jun 06, 2012
Password expires                                        : never
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : -1
Maximum number of days between password change          : -1
Number of days of warning before password expires       : -1

Re: Can't acknowledge/comment

Posted: Tue Sep 23, 2014 3:46 pm
by abrist
All of the issues you mentioned relate to the command pipe. If a second nagios parent process fails to start, you may see your command pipe disappear. In these rare instances, stop any other nagios processes, try to kill them for good measure, and then restart:

Code:

ls -la /usr/local/nagios/var/rw/nagios.cmd
service nagios stop
killall nagios
ps -aef | grep nagios.cfg
service nagios start
ls -la /usr/local/nagios/var/rw/nagios.cmd
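
Once the pipe is back, one way to confirm it accepts commands without going through the web UI is to write an external command to it by hand. Below is a rough sketch that formats an `ADD_HOST_COMMENT` line. The helper name `build_host_comment` and the host name `myhost` are placeholders, not anything from this system.

```shell
#!/bin/bash
# Sketch: format a Nagios ADD_HOST_COMMENT external command line.
# build_host_comment is a hypothetical helper name.
build_host_comment() {
    local host="$1" author="$2" comment="$3"
    # Format: [epoch] ADD_HOST_COMMENT;<host_name>;<persistent>;<author>;<comment>
    printf '[%s] ADD_HOST_COMMENT;%s;1;%s;%s\n' \
        "$(date +%s)" "$host" "$author" "$comment"
}
```

To send it, redirect the output into the pipe, for example `build_host_comment myhost nagiosadmin "pipe test" > /usr/local/nagios/var/rw/nagios.cmd`. Note that the write will block if nothing is reading the pipe, which is itself a sign that nagios is not running properly.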

Re: Can't acknowledge/comment

Posted: Wed Sep 24, 2014 8:07 am
by snapon_admin
Noted, I will definitely try that when/if this happens again. Thanks!

Re: Can't acknowledge/comment

Posted: Wed Sep 24, 2014 10:29 am
by lmiltchev
I am glad the issue is resolved! If it reappears, start a new thread. Locking it down.