Nagios Support Forum

Posted: **Tue Dec 16, 2014 6:14 am**

Hello everybody,

I am running Nagios Log Server (nagioslog) in a cluster with two instances (64-bit OVA).

Every morning my dashboard is empty and I have to restart the servers to rebuild the index. On the web interface the services show as UP, but when I refresh the page they show as DOWN.

Over SSH I get this:
[root@nagioslog ~]# service logstash status
Log stash Daemon dead but pid file exists

As a temporary workaround, I created a cron job that restarts my servers every 5 hours. That seems to resolve the problem, but I don't think it is the best solution.

Thanks for your help

PS: For reference, there are currently 72 hosts and about 6,000,000 documents per day.

Posted: **Tue Dec 16, 2014 4:09 pm**

What sort of CPU, memory, and disk space are available on the nodes? Are you hitting limits or high usage on any of those metrics?

Posted: **Wed Dec 17, 2014 4:48 am**

Hello,

Both nagioslog VMs run on ESX (Intel Xeon 2 GHz). Each has 4 vCPUs, 4 GB of memory, and one 100 GB ".vmdk" disk (currently 50 GB used).

Previously each VM had 1 vCPU. Since I added 3 vCPUs, I no longer hit the limits.

Thanks for your help

Posted: **Wed Dec 17, 2014 10:47 am**

Are you saying you increased the limits and the problem went away? Or that you weren't hitting limits before and you still aren't now but the problem persists?

Posted: **Wed Dec 17, 2014 3:33 pm**

Sorry for my English (I'm French).

Each morning, when I saw that my dashboard was empty, the web interface was very slow and CPU usage was at 100%.
Then I added 3 vCPUs. Each morning I still see the same problem (empty dashboard, slow web interface), but CPU usage is now 157% out of 400%.

So the problem persists unless I restart the server.

Posted: **Wed Dec 17, 2014 4:02 pm**

How much memory does this server have?
Does it have fast disks installed (e.g. SSDs)?
How much data per day are you sending this server?
Is this just a single server cluster, or do you have multiple machines sharing the workload?

Posted: **Thu Dec 18, 2014 2:32 am**

How much memory does this server have?

4GB of memory

Does it have fast disks installed (e.g. SSDs)?

No fast disks, only 7200 RPM drives.

How much data per day are you sending this server?

We are sending between 6,100,000 and 6,800,000 documents per day.
In primary size, that is between 2.8 GB and 3.1 GB.

Is this just a single server cluster, or do you have multiple machines sharing the workload?

We have a cluster of 2 instances behind DNS round robin.
All instances have the same capacity (memory, disk, vCPU).

Posted: **Thu Dec 18, 2014 2:49 pm**

Make sure each server in the cluster is running the "elasticsearch" and "logstash" services:

service elasticsearch status
service logstash status

If either of those is not running, please start it and wait a few minutes.
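
For reference, the check above could be wrapped in a small shell helper along these lines. This is only a sketch: `ensure_running` is a hypothetical name, not something shipped with Nagios Log Server, and it assumes that `service NAME status` exits non-zero when the service is stopped or dead.

```shell
# Sketch only: ensure_running is a hypothetical helper name.
# Assumption: "service NAME status" exits non-zero for a stopped or dead service.
ensure_running() {
    local svc
    for svc in "$@"; do
        if service "$svc" status >/dev/null 2>&1; then
            echo "$svc: running"
        else
            echo "$svc: not running, starting"
            service "$svc" start
        fi
    done
}

# Usage (as root, on each node in the cluster):
#   ensure_running elasticsearch logstash
```

This only starts a service that reports as stopped. It does not explain why the service stopped, so the logs are still worth checking afterwards.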

Posted: **Wed Dec 24, 2014 6:27 am**

tmcdonald wrote: Make sure each server in the cluster is running the "elasticsearch" and "logstash" services:
service elasticsearch status
service logstash status
If either of those is not running, please start it and wait a few minutes.

When I run the commands, the status is OK.
I monitor these services with Nagios XI.

If I don't restart the nagioslog servers, the total number of processes keeps increasing (500 processes in 3 days), Java uses 100% of the 400% CPU, and the web interface becomes very slow. Even so, the elasticsearch and logstash services still report OK when I run service elasticsearch status and service logstash status.

I hope you understand my problem :/

Posted: **Fri Dec 26, 2014 9:32 am**

chris_Espoir wrote: the total number of processes keeps increasing (500 processes in 3 days),

Would it be possible for us to get a list of the processes running when this happens?

ps -ef
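
If the full listing is too long to read comfortably, a per-command count makes it easier to see which process is piling up. A minimal sketch, assuming the procps `ps` found on Linux; `count_by_command` is a hypothetical helper name:

```shell
# Sketch only: count_by_command is a hypothetical helper name.
# Reads one command name per line on stdin and prints "count name",
# with the most frequent command first.
count_by_command() {
    sort | uniq -c | sort -rn
}

# Usage (assumes procps ps; "comm=" prints command names without a header):
#   ps -eo comm= | count_by_command | head -n 10
```

Running it once right after a restart and again a day or two later would show which command accounts for the growth.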

Problem index log every day
