Nagios Worker 100 CPU - Nagios hangs

Support forum for Nagios Core, Nagios Plugins, NCPA, NRPE, NSCA, NDOUtils and more. Engage with the community of users including those using the open source solutions.
Mike-sbg
Posts: 37
Joined: Mon Jun 01, 2015 1:05 am

Post by Mike-sbg »

Hello!

I'm using Nagios 4.3.2 on CentOS 7 and I'm seeing strange behaviour which, from what I have read, should already be fixed:

One of the workers uses 100% CPU and the whole Nagios process stops working.
Only restarting Nagios fixes the problem, and only for a few hours.

The strange thing is that this server was in test mode (behind a Nagios 3 server, connected via OCSP) and everything worked fine for some months.

I experimented a little with the workers= setting in nagios.cfg and found that when I reduce the workers to 4, the crash appears every few minutes.
With workers=9 it happens 1-2 times a day.

At the moment I'm testing with more workers.

Can anybody help me?
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios Worker 100 CPU - Nagios hangs

Post by tgriep »

Can you post your nagios.cfg file so we can see the settings?
Also, when the system hangs, can you post the nagios.log file and, if you have syslog enabled in nagios.cfg, any related entries from /var/log/messages?
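The settings being asked about can be pulled out of nagios.cfg with a few lines of Python. This is only a sketch: it assumes the workers= setting mentioned earlier in the thread corresponds to the `check_workers` directive, and the sample text is a made-up excerpt, not the attached file.

```python
# Directives relevant to this thread; the selection is an assumption.
WANTED = ("check_workers", "use_syslog", "debug_level", "debug_file", "log_file")


def read_directives(text, wanted=WANTED):
    """Return {name: value} for the wanted key=value lines in nagios.cfg text."""
    found = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() in wanted:
            found[name.strip()] = value.strip()
    return found


# Hypothetical excerpt for illustration only.
sample = """\
# main log file
log_file=/var/log/nagios/nagios.log
use_syslog=1
check_workers=9
debug_level=0
"""

settings = read_directives(sample)
for name in WANTED:
    print(name, "=", settings.get(name, "(not set)"))
```

With `use_syslog=1`, related messages would be expected in the system log as well as in the file named by `log_file`.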
Be sure to check out our Knowledgebase for helpful articles and solutions!

Re: Nagios Worker 100 CPU - Nagios hangs

Post by Mike-sbg »

The nagios.log shows nothing special when the problem occurs; it just freezes.

I looked at the abrt log, and the only thing I saw was that check_dns produces the following:

Code:

Oct  5 07:52:55 atnagiossr03 abrt-hook-ccpp: Process 16150 (check_dns) of user 993 killed by SIGSEGV - dumping core
Oct  5 07:52:55 atnagiossr03 abrt-server: Executable '/usr/lib64/nagios/plugins/check_dns' doesn't belong to any package and ProcessUnpackaged is set to 'no'
Oct  5 07:52:55 atnagiossr03 abrt-server: 'post-create' on '/var/spool/abrt/ccpp-2017-10-05-07:52:55-16150' exited with 1
Oct  5 07:52:55 atnagiossr03 abrt-server: Deleting problem directory '/var/spool/abrt/ccpp-2017-10-05-07:52:55-16150'
I disabled the check for testing, because I don't know whether it is part of the problem.

The nagios.cfg is attached to this thread.
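To see whether check_dns is the only plugin crashing, the abrt lines in the system log can be filtered for processes killed by a signal. A minimal sketch, assuming the message layout shown in the log excerpt above; the function name and the regular expression are illustrative, not part of abrt or Nagios.

```python
import re

# Matches the abrt-hook-ccpp message format seen in the log excerpt above.
CRASH_RE = re.compile(
    r"abrt-hook-ccpp: Process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"of user (?P<uid>\d+) killed by (?P<signal>SIG\w+)"
)


def find_crashes(log_text):
    """Return (program, pid, signal) for every crash line in the log text."""
    crashes = []
    for line in log_text.splitlines():
        match = CRASH_RE.search(line)
        if match:
            crashes.append(
                (match.group("name"), int(match.group("pid")), match.group("signal"))
            )
    return crashes


log_text = """\
Oct  5 07:52:55 atnagiossr03 abrt-hook-ccpp: Process 16150 (check_dns) of user 993 killed by SIGSEGV - dumping core
Oct  5 07:52:55 atnagiossr03 abrt-server: Deleting problem directory '/var/spool/abrt/ccpp-2017-10-05-07:52:55-16150'
"""

crashes = find_crashes(log_text)
print(crashes)  # [('check_dns', 16150, 'SIGSEGV')]
```

On a real server the text would come from the file that holds these messages, for example /var/log/messages.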
Attachments
nagios.cfg (4.71 KiB)

Re: Nagios Worker 100 CPU - Nagios hangs

Post by tgriep »

The core dump from the check_dns plugin could be what is causing the hang: if the system cannot save the dump file, the worker may hang.
If disabling that check keeps the system from hanging, you may need to enable debugging in Nagios to get more details on the issue from the nagios.debug file.

Also, make sure the ulimit for core files on the server is set to unlimited so that the core dump file can be created if it happens again.
That setting is found in this file:

Code:

/etc/security/limits.conf
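To confirm that a changed limit actually reaches a running process, the effective core file limit can be read back. A minimal sketch, assuming a Linux host: `resource.getrlimit` reports the limit of the current Python process, and the helper parses text in the layout of `/proc/<pid>/limits`. The sample text is illustrative and not taken from the poster's server.

```python
import resource


def describe(limit):
    """Render an rlimit value the way ulimit does."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def core_limits_of_current_process():
    """Return the (soft, hard) core file size limits of this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return describe(soft), describe(hard)


def core_limits_from_proc(limits_text):
    """Parse the 'Max core file size' row from /proc/<pid>/limits text."""
    label = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(label):
            fields = line[len(label):].split()
            return fields[0], fields[1]  # soft limit, hard limit
    return None


# Illustrative text in the /proc/<pid>/limits layout (not real server data).
sample = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max core file size        0                    unlimited            bytes
"""

print("this process:", core_limits_of_current_process())
print("sample process:", core_limits_from_proc(sample))
```

For the Nagios daemon itself, the text would be read from `/proc/<pid>/limits` using the daemon's process ID. The thread does not show the exact lines added to limits.conf; an entry such as `nagios soft core unlimited` is one plausible form, assuming Nagios runs as a user named nagios.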

Re: Nagios Worker 100 CPU - Nagios hangs

Post by Mike-sbg »

I disabled check_dns completely and no more abrt events occur.

But Nagios still hung twice in the last 12 hours.

Thank you for the advice about limits.conf. I set the limits to unlimited and hope this fixes the problem.

I'll report again on Monday.

Re: Nagios Worker 100 CPU - Nagios hangs

Post by tgriep »

OK, let us know your findings.

Re: Nagios Worker 100 CPU - Nagios hangs

Post by Mike-sbg »

Nagios has now stayed stable for the whole weekend.

The advice about the ulimits fixed my problem.

I'm very happy!

Thank you for helping!

Re: Nagios Worker 100 CPU - Nagios hangs

Post by tgriep »

You're very welcome. I'll close the post and mark it as solved. If you have any issues in the future, feel free to open a new post.