Number of Nagios workers causing interruptions

reincarne
Posts: 146
Joined: Wed Jun 26, 2013 4:39 am

Number of Nagios workers causing interruptions

Post by reincarne »

Hi,
I've noticed that at least once a week my Nagios XI stops functioning, and the only way to resolve it is to kill the Nagios workers.

On a normal working day there are 12 workers. Over the course of the week, however, the number grows: each time, 12 new workers are created with new IDs.
What is the reason, and how can I solve it?

nagios 5787 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5788 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5789 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5791 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5792 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5793 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5794 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5795 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5796 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5797 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5798 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 5799 5785 0 08:46 ? 00:00:02 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
alexle 20679 3054 0 08:52 pts/2 00:00:00 grep worker
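
To check whether extra sets of workers are accumulating, one option is to group the worker lines by parent PID. The following is only a minimal sketch, not something from Nagios itself: the function name `count_workers_by_parent` is hypothetical, and it assumes the `ps -ef` column layout shown above (UID, PID, PPID, C, STIME, TTY, TIME, CMD).

```python
from collections import Counter


def count_workers_by_parent(ps_output: str) -> dict[int, int]:
    """Count Nagios worker processes per parent PID from `ps -ef` text."""
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # Assumed ps -ef layout: UID PID PPID C STIME TTY TIME CMD [ARGS...]
        if len(fields) < 9:
            continue  # header line or a command with no arguments
        command, args = fields[7], fields[8:]
        # Match only the nagios binary started with --worker,
        # which skips lines such as "grep worker".
        if command.endswith("/nagios") and "--worker" in args:
            counts[int(fields[2])] += 1
    return dict(counts)
```

Fed the output of `ps -ef`, this returns a mapping of parent PID to worker count. In the listing above all 12 workers share parent 5785; workers under more than one parent PID may suggest that an older set was left behind, although that is an assumption to confirm against the logs.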
avandemore
Posts: 1597
Joined: Tue Sep 27, 2016 4:57 pm

Re: Number of Nagios workers causing interruptions

Post by avandemore »

Killing the workers isn't the right approach; they are likely needed.

Can you describe in more detail what you mean by "my Nagios XI stops functioning"?

A profile generated while the system is in one of these suboptimal states may also be useful, if that is possible.

XI > Admin > System Profile > Download Profile

Please include the zip file in your response. You can also PM it to me or to other support personnel.
Previous Nagios employee
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Number of Nagios workers causing interruptions

Post by tmcdonald »

Just checking in since we have not heard from you in a while. Did @avandemore's post clear things up or has the issue otherwise been resolved?
Former Nagios employee
reincarne
Posts: 146
Joined: Wed Jun 26, 2013 4:39 am

Re: Number of Nagios workers causing interruptions

Post by reincarne »

Hi,
We are still experiencing this issue. To whom should I send the profile?
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Number of Nagios workers causing interruptions

Post by tmcdonald »

Please send it to me and make sure to reply back to this thread once you have done so.

Update: Profile received and shared with team.
Former Nagios employee
reincarne
Posts: 146
Joined: Wed Jun 26, 2013 4:39 am

Re: Number of Nagios workers causing interruptions

Post by reincarne »

tmcdonald wrote: Please send it to me and make sure to reply back to this thread once you have done so.
Hi,
I sent you a PM.
kyang

Re: Number of Nagios workers causing interruptions

Post by kyang »

Looking at your profile, the logs appear to date back to June 20th. Could you send us an updated profile?

From your top command, you are experiencing 100% CPU usage from mysqld.

How many hosts/services do you have?

Did you offload your database?

The best thing would be to send us an updated profile, so we can see how things are looking now.
reincarne
Posts: 146
Joined: Wed Jun 26, 2013 4:39 am

Re: Number of Nagios workers causing interruptions

Post by reincarne »

I sent you the profile file.
kyang

Re: Number of Nagios workers causing interruptions

Post by kyang »

How many hosts/services do you have?

Is this an offloaded DB?

UPDATE: Profile received!

Put in teamshare.
reincarne
Posts: 146
Joined: Wed Jun 26, 2013 4:39 am

Re: Number of Nagios workers causing interruptions

Post by reincarne »

We have about 1700 hosts and 31k service checks.
Offloading the DB caused more issues.