Hi -
Looks like this error is coming back again (nagios: wproc: 'Core Work XXXX' seems to be choked).
Edit: http://support.nagios.com/forum/viewtop ... 12#p125712
The difference is that this time my CPU utilization is low (around 25% average, 40% max in top; previously it was running full tilt). The system has a 6-core CPU. I added 'check_workers=25' to nagios.cfg and that seemed to help.
Is this the right fix? I *think* it started after I updated to the latest version (R2.6). Was it because I added more checks?
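For anyone wanting to confirm what their own config is set to, here is a minimal sketch that reads a directive back out of nagios.cfg. The path is an assumption (the usual Nagios XI location; adjust for your install), and read_directive is a hypothetical helper, not a Nagios tool. If the directive appears more than once, this sketch simply reports the last occurrence in the file.

```python
from pathlib import Path

# Assumed location of the main config on a Nagios XI install; adjust as needed.
NAGIOS_CFG = Path("/usr/local/nagios/etc/nagios.cfg")


def read_directive(cfg_text, name):
    """Return the value of the last `name=value` line in the text, or None."""
    value = None
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == name:
            value = val.strip()
    return value


if __name__ == "__main__":
    if NAGIOS_CFG.exists():
        print("check_workers =", read_directive(NAGIOS_CFG.read_text(), "check_workers"))
    else:
        print(f"{NAGIOS_CFG} not found; point NAGIOS_CFG at your own nagios.cfg")
```

A result of None means the directive is not set explicitly in that file.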
nagios: wproc: 'Core Work XXXX' seems to be choke
-
tonyleatwork
- Posts: 91
- Joined: Mon Jul 07, 2014 8:55 am
Re: nagios: wproc: 'Core Work XXXX' seems to be choke
tonyleatwork wrote: Was it because I added more checks?
Possibly. How many checks (per 5 minutes) are currently being executed?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
-
tonyleatwork
- Posts: 91
- Joined: Mon Jul 07, 2014 8:55 am
Re: nagios: wproc: 'Core Work XXXX' seems to be choke
Upping the workers to 25 increased CPU load (it's spiking to 100% from time to time, user + system, but still manageable) but got rid of the WPROC issues.
Under Monitoring Engine Status:
Active Host Checks
1-min 71
5-min 357
15-min 484
Passive Host Checks
1-min 0
5-min 0
15-min 0
Active Service Checks
1-min 470
5-min 2,530
15-min 3,680
Passive Service Checks
1-min 0
5-min 0
15-min 0
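For a rough sense of scale, the 5-minute counts above can be turned into a rate. This is only back-of-the-envelope arithmetic, assuming checks are divided evenly among the workers (a simplification of how the scheduler actually hands out jobs); check_rate is a made-up helper, not part of Nagios.

```python
def check_rate(host_checks_5min, service_checks_5min, workers):
    """Convert 5-minute active check counts into simple rates."""
    total = host_checks_5min + service_checks_5min
    per_second = total / 300                 # 5 minutes = 300 seconds
    per_worker_per_minute = total / 5 / workers
    return total, per_second, per_worker_per_minute


# 5-min active host and service checks from the figures above, 25 workers.
total, per_sec, per_worker = check_rate(357, 2530, 25)
print(f"{total} checks / 5 min = {per_sec:.2f} checks/s, "
      f"about {per_worker:.1f} checks per worker per minute")
# prints: 2887 checks / 5 min = 9.62 checks/s, about 23.1 checks per worker per minute
```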
Re: nagios: wproc: 'Core Work XXXX' seems to be choke
Yes, increasing check_workers to 25 was the right idea. You should see a smoother line on the event queue graph (your checks are spread out evenly across the interval).
Look at the Monitoring Engine Status page: Admin --> System Information --> Monitoring Engine Status
and pay attention to the host and service check latency to see if your checks are waiting too long to run.