I am running Nagios XI 5.4.11 with Nagios Core 4.2.4 and seeing the messages below in the messages file and in nagios.log. After these messages appear, the Nagios worker processes die.
messages-20171210:Dec 9 18:04:24 xxxxxxxxx nagios: wproc: 'Core Worker 12494' seems to be choked. ret = -1; bufsize = 1680846: written = 730880; errno = 32 (Broken pipe)
messages-20171210:Dec 9 18:04:24 xxxxxxxxxx nagios: wproc: 'Core Worker 12484' seems to be choked. ret = -1; bufsize = 898389: written = 730880; errno = 32 (Broken pipe)
messages-20171210:Dec 9 18:09:22 xxxxxxx nagios: wproc: 'Core Worker 12489' seems to be choked. ret = -1; bufsize = 898390: written = 730880; errno = 32 (Broken pipe)
Dec 12 17:30:11 xxxxxxx nagios: wproc: 'Core Worker 21066' seems to be choked. ret = -1; bufsize = 898391: written = 730880; errno = 32 (Broken pipe)
Dec 12 17:30:11 xxxxxxxxxx nagios: Unable to run check for service 'VM Status for VMHost' on host 'xxxxxxxxxxxxx'
Dec 12 17:30:11 xxxxxxxxxxx nagios: wproc: iocache_read() from Core Worker 21066 returned -1: Connection reset by peer
Dec 12 17:30:11 xxxxxxxxxxx nagios: wproc: Socket to worker Core Worker 21066 broken, removing
[1513036836] wproc: SERVICE EVENTHANDLER job 1837 from worker Core Worker 30979 is a non-check helper but exited with return code 2
[1513036836] wproc: early_timeout=0; exited_ok=1; wait_status=512; error_code=0;
[1513036836] wproc: stderr line 01: execvp(/usr/local/nagios/libexec/eventhandlers/update-nagex, ...) failed. errno is 2: No such file or directory
[1513036850] wproc: 'Core Worker 30965' seems to be choked. ret = -1; bufsize = 898391: written = 730880; errno = 32 (Broken pipe)
[1513036850] Unable to run check for service 'VM Status for VMHost' on host 'xxxxxxxxxxxxxxx'
[1513036850] wproc: iocache_read() from Core Worker 30965 returned -1: Connection reset by peer
[1513036850] wproc: Socket to worker Core Worker 30965 broken, removing
[1513101005] wproc: 'Core Worker 21050' seems to be choked. ret = -1; bufsize = 898392: written = 730880; errno = 32 (Broken pipe)
[1513101005] Unable to run check for service 'VM Status for VMHost' on host 'xxxxxxx'
[1513101005] wproc: iocache_read() from Core Worker 21050 returned -1: Connection reset by peer
[1513101005] wproc: Socket to worker Core Worker 21050 broken, removing
[1513101028] Warning: The check of service 'Disk Usage:/usr /var /opt /home /boot /tmp /' on host 'xxxxxxxx' looks like it was orphaned (results never came back; last_check=1513099463; next_check=1513100025). I'm scheduling an immediate check of the service...
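To compare the "seems to be choked" entries side by side, a small parser along these lines may help. This is only a sketch: the function names (parse_choked, summarize) are made up for illustration, and the pattern is based solely on the line format shown in the logs above, so it may need adjusting for other Nagios Core versions.

```python
import re

# Pattern derived from the "seems to be choked" lines quoted above (an assumption
# about the format; other Nagios Core versions may word this differently).
CHOKED_RE = re.compile(
    r"'Core Worker (?P<pid>\d+)' seems to be choked\. "
    r"ret = (?P<ret>-?\d+); bufsize = (?P<bufsize>\d+): "
    r"written = (?P<written>\d+); errno = (?P<errno>\d+) \((?P<errmsg>[^)]*)\)"
)


def parse_choked(line):
    """Return a dict of fields from one 'choked' log line, or None if it does not match."""
    m = CHOKED_RE.search(line)
    if m is None:
        return None
    rec = {k: int(v) for k, v in m.groupdict().items() if k != "errmsg"}
    rec["errmsg"] = m.group("errmsg")
    # Bytes of the job payload that were never written to the worker socket.
    rec["unwritten"] = rec["bufsize"] - rec["written"]
    return rec


def summarize(lines):
    """Parse every matching line from an iterable of log lines (hypothetical helper)."""
    return [r for r in (parse_choked(l) for l in lines) if r is not None]
```

Passing the lines of nagios.log or the messages file to summarize() gives one record per choked worker, which makes it easy to see that the written value is the same (730880) in every entry above while bufsize varies.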
nagios 21042 1 1 17:29 ? 00:01:23 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 21044 21042 0 17:29 ? 00:00:00 [nagios] <defunct>
nagios 21045 21042 0 17:29 ? 00:00:02 [nagios] <defunct>
nagios 21046 21042 0 17:29 ? 00:00:01 [nagios] <defunct>
nagios 21047 21042 0 17:29 ? 00:00:01 [nagios] <defunct>
nagios 21048 21042 0 17:29 ? 00:00:01 [nagios] <defunct>
nagios 21050 21042 0 17:29 ? 00:00:00 [nagios] <defunct>
nagios 21051 21042 0 17:29 ? 00:00:00 [nagios] <defunct>
nagios 21052 21042 0 17:29 ? 00:00:00 [nagios] <defunct>
nagios 21053 21042 0 17:29 ? 00:00:00 [nagios] <defunct>
nagios 21054 21042 0 17:29 ? 00:00:01 [nagios] <defunct>
nagios 21055 21042 0 17:29 ? 00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 21056 21042 0 17:29 ? 00:00:01 [nagios] <defunct>
nagios 21057 21042 0 17:29 ? 00:00:02 [nagios] <defunct>
nagios 21058 21042 0 17:29 ? 00:00:03 [nagios] <defunct>
nagios 21059 21042 0 17:29 ? 00:00:01 [nagios] <defunct>
nagios 21060 21042 0 17:29 ? 00:00:02 [nagios] <defunct>
nagios 21061 21042 0 17:29 ? 00:00:02 [nagios] <defunct>
nagios 21062 21042 0 17:29 ? 00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 21063 21042 0 17:29 ? 00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 21064 21042 0 17:29 ? 00:00:00 [nagios] <defunct>
nagios 21065 21042 0 17:29 ? 00:00:00 [nagios] <defunct>
nagios 21066 21042 0 17:29 ? 00:00:00 [nagios] <defunct>
nagios 21068 21042 0 17:29 ? 00:00:00 [nagios] <defunct>
nagios 21069 21042 0 17:29 ? 00:00:00 [nagios] <defunct>
nagios 21382 21042 0 17:29 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
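For tracking how many workers are still alive over time, the process listing above can be tallied with a short script. This is a rough sketch that assumes the eight-column layout shown above (user, PID, PPID, C, STIME, TTY, TIME, CMD); classify_children is a hypothetical helper name, not part of Nagios.

```python
def classify_children(ps_lines, parent_pid):
    """Count defunct, live worker, and other children of parent_pid.

    Assumes each line has the eight columns shown in the listing above:
    user, PID, PPID, C, STIME, TTY, TIME, CMD.
    """
    counts = {"defunct": 0, "worker": 0, "other": 0}
    for line in ps_lines:
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # not a process line in the expected layout
        try:
            ppid = int(fields[2])
        except ValueError:
            continue  # header or malformed line
        if ppid != parent_pid:
            continue
        cmd = fields[7]
        if "<defunct>" in cmd:
            counts["defunct"] += 1
        elif "--worker" in cmd:
            counts["worker"] += 1
        else:
            counts["other"] += 1
    return counts
```

Called with the listing above and parent PID 21042, it separates the zombie entries from the workers that are still running, so the ratio can be logged at intervals.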
Nagios Worker Processes Die
Re: Nagios Worker Processes Die
I see that you opened a support ticket. Shall we close this post so we can continue with the ticket?
Re: Nagios Worker Processes Die
Yes, the post can be closed. Thanks!