I recently compiled Nagios Core 4.1.1 and Nagios Plugins 2.1.1 on Solaris 11.3 SPARC. The compilation went fine, but when I start Nagios through the Service Management Facility (SMF), the service goes from "offline*" to "maintenance" after a while, and the following error appears in the log file:
Failed to connect to query socket '/opt/nagios/var/rw/nagios.qh': connect() failed: Connection refused
ls -l /opt/nagios/var/rw/:
total 2
-rw-r--r-- 1 nagios nagios 0 May 9 15:07 nagios.cmd
srw-rw---- 1 nagios nagios 0 May 9 15:09 nagios.qh
I'm able to access the web interface, but the output of "ps -ef | grep nagios" shows a lot of defunct processes.
svcs -xv nagios:
State: offline* transitioning to online since May 9, 2016 03:08:00 PM CEST
Reason: Start method is running.
See: http://support.oracle.com/msg/SMF-8000-C4
See: /var/svc/log/application-nagios:default.log
Impact: This service is not running.
Partial output of the log file:
[ May 9 13:09:00 Executing start method ("/opt/nagios/bin/nagios /opt/nagios/etc/nagios.cfg"). ]
nerd: Channel hostchecks registered successfully
nerd: Channel servicechecks registered successfully
nerd: Channel opathchecks registered successfully
nerd: Fully initialized and ready to rock!
wproc: Successfully registered manager as @wproc with query handler
Failed to connect to query socket '/opt/nagios/var/rw/nagios.qh': connect() failed: Connection refused
[...]
wproc: Registry request: name=Core Worker 22091;pid=22091
Error: Could not create external command file '/opt/nagios/var/rw/nagios.cmd' as named pipe: (17) -> File exists. If this file already exists and you are sure that another copy of Nagios is not running, you should delete this file.
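Regarding the "File exists" error: the directory listing shows nagios.cmd as a regular file (-rw-r--r--) rather than a named pipe (which would start with "p"), and that may be related. A minimal sketch for checking the file type, assuming a POSIX shell; cmd_file_type is a hypothetical helper name, not part of Nagios:

```shell
# Hypothetical helper (not part of Nagios): report what kind of file
# sits at the given path. Prints "fifo", "regular", "missing" or "other".
cmd_file_type() {
    f="$1"
    if [ -p "$f" ]; then
        echo "fifo"        # named pipe, which the error message says Nagios tries to create
    elif [ -f "$f" ]; then
        echo "regular"     # plain file, as in the ls output above
    elif [ ! -e "$f" ]; then
        echo "missing"     # nothing at this path
    else
        echo "other"       # socket, directory, device, ...
    fi
}

# Usage with the path from the error message:
# cmd_file_type /opt/nagios/var/rw/nagios.cmd
```

If this prints "regular" while no other copy of Nagios is running, that would match the hint in the error message about deleting the stale file.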
ps -ef | grep nagios:
nagios 22164 22134 0 - ? 0:00 <defunct>
nagios 22165 22134 0 15:10:02 ? 0:00 /opt/nagios/bin/nagios --worker /opt/nagios/var/rw/nagios.qh
ps -ef | grep nagios | grep "<defunct>" | wc -l:
28
Thank you very much,
Daniel