nagios: wproc: Core Worker 35151: job 15496 (pid=35200) tim
Posted: Fri Aug 11, 2017 7:51 am
Nagios version 5.3.2
Red Hat 7.0
My Nagios instance hung and drove the host load to over 2000. Rebooting the host resolved the issue, but what caused the problem in the first place? The last service checks seem to have happened at 22:36; at 22:33 the following messages were written to the OS logs:
Aug 10 22:33:00 nagios: wproc: Core Worker 35151: job 15496 (pid=35200) timed out. Killing it
Aug 10 22:33:00 nagios: wproc: CHECK job 15496 from worker Core Worker 35151 timed out after 30.02s
Aug 10 22:33:00 nagios: wproc: host=<<hostname>>; service=(null);
Aug 10 22:33:00 nagios: wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
Aug 10 22:33:00 nagios: Warning: Check of host 'is-backup700' timed out after 30.02 seconds
Aug 10 22:33:00 nagios: wproc: Core Worker 35151: job 15496 (pid=35200): Dormant child reaped
Immediately after those messages, I get a steady flow (16 or more per minute, all starting in the first second of that minute) of these:
Aug 10 22:33:01 systemd: Starting Session 1203871 of user nagios.
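To put a number on that flow, a rough sketch like the one below could count the "Starting Session" messages per minute for the nagios user. It assumes the syslog format shown above (with an optional hostname field before `systemd`); the function name `sessions_per_minute` and the log path in the usage comment are my own placeholders, not anything Nagios or systemd provides.

```python
import re
from collections import Counter

# Matches lines such as:
#   Aug 10 22:33:01 systemd: Starting Session 1203871 of user nagios.
# An optional hostname and an optional "[pid]" after "systemd" are tolerated.
SESSION_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+"
    r"(?:\S+\s+)?systemd(?:\[\d+\])?: "
    r"Starting Session \d+ of user (?P<user>\S+?)\.?$"
)


def sessions_per_minute(lines, user="nagios"):
    """Count 'Starting Session' messages per minute for one user."""
    counts = Counter()
    for line in lines:
        match = SESSION_RE.match(line.strip())
        if match and match.group("user") == user:
            counts[match.group("minute")] += 1
    return counts


# Usage (the path is an assumption; adjust it to wherever syslog writes):
# with open("/var/log/messages") as log:
#     for minute, count in sorted(sessions_per_minute(log).items()):
#         print(minute, count)
```

Comparing the per-minute counts before and after 22:33 would show whether the session rate actually changed at the time of the hang or was already that high.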