
My Nagios Log Server stop logging every days

Posted: Thu Mar 09, 2017 11:24 am
by juanmafer
Hi!
My Nagios Log Server stops logging, even though the logstash service is still running.
If I restart the logstash service, logging resumes.
Any ideas?

Thanks!

Here is our system status (the logstash log is attached):
[root@localhost ~]# free -h
             total       used       free     shared    buffers     cached
Mem:           15G        13G       1.8G       244K       314M       3.5G
-/+ buffers/cache:       9.9G       5.5G
Swap:         255M         0B       255M

top - 13:13:17 up 6 days, 1:25, 1 user, load average: 1.08, 1.79, 1.21
Tasks: 103 total, 1 running, 102 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.6%us, 1.8%sy, 4.0%ni, 85.6%id, 3.8%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 16181984k total, 14333824k used, 1848160k free, 322364k buffers
Swap: 262136k total, 0k used, 262136k free, 3622732k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27191 nagios 20 0 13.8g 8.3g 139m S 88.7 54.1 254:37.62 java
23634 root 39 19 1569m 305m 14m S 5.9 1.9 1:47.85 java
341 root 20 0 0 0 0 S 3.9 0.0 278:15.95 jbd2/sda1-8
1 root 20 0 19232 1520 1232 S 0.0 0.0 0:01.02 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 0:20.35 ksoftirqd/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
6 root RT 0 0 0 0 S 0.0 0.0 169:18.17 watchdog/0
7 root 20 0 0 0 0 S 0.0 0.0 5:15.32 events/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cgroup
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khelper
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 netns

Re: My Nagios Log Server stop logging every days

Posted: Thu Mar 09, 2017 3:07 pm
by mcapra
Can you run the following commands and share the resulting /tmp/42881_1.zip file:

Code:

zip -r /tmp/42881_1.zip /var/log/elasticsearch/*
zip -r /tmp/42881_1.zip /var/log/logstash/*
zip -r /tmp/42881_1.zip /usr/local/nagioslogserver/logstash/etc/conf.d/*

Re: My Nagios Log Server stop logging every days

Posted: Thu Mar 09, 2017 3:25 pm
by juanmafer
Attached!
The last outage was today at 2:09 am.

Re: My Nagios Log Server stop logging every days

Posted: Thu Mar 09, 2017 6:03 pm
by mcapra
Can you share the output of the following command executed from the CLI of any of your Nagios Log Server machines:

Code:

curl -XGET localhost:9200/_nodes/jvm?pretty

Re: My Nagios Log Server stop logging every days

Posted: Fri Mar 10, 2017 10:35 am
by juanmafer
*Before making this post I upgraded the platform to v1.4.4, with the same results.

[root@localhost ~]# curl -XGET localhost:9200/_nodes/jvm?pretty
{
  "cluster_name" : "9663e92a-f4e1-4499-a1a3-439b895b37d5",
  "nodes" : {
    "6jrUN3t_Toen5yIfoO-JBw" : {
      "name" : "b86d74f4-36c6-44a8-93a2-ea5256059b39",
      "transport_address" : "inet[/172.31.104.101:9300]",
      "host" : "localhost",
      "ip" : "127.0.0.1",
      "version" : "1.6.0",
      "build" : "cdd3ac4",
      "http_address" : "inet[localhost/127.0.0.1:9200]",
      "attributes" : {
        "max_local_storage_nodes" : "1"
      },
      "jvm" : {
        "pid" : 27191,
        "version" : "1.7.0_99",
        "vm_name" : "OpenJDK 64-Bit Server VM",
        "vm_version" : "24.95-b01",
        "vm_vendor" : "Oracle Corporation",
        "start_time_in_millis" : 1488928772734,
        "mem" : {
          "heap_init_in_bytes" : 8284798976,
          "heap_max_in_bytes" : 8277131264,
          "non_heap_init_in_bytes" : 24313856,
          "non_heap_max_in_bytes" : 224395264,
          "direct_max_in_bytes" : 8277131264
        },
        "gc_collectors" : [ "Copy", "ConcurrentMarkSweep" ],
        "memory_pools" : [ "Code Cache", "Eden Space", "Survivor Space", "CMS Old Gen", "CMS Perm Gen" ]
      }
    }
  }
}
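
For anyone reading these numbers later: the thread does not interpret them, but one thing they do show is how much of the machine's RAM the Elasticsearch heap is allowed to use. A minimal sketch of that arithmetic, assuming a POSIX shell with awk; the function name heap_pct is made up for illustration, and the two inputs are heap_max_in_bytes from the output above and the total memory in kB from the earlier top output:

```shell
#!/bin/sh
# heap_pct HEAP_BYTES TOTAL_KB
# Print the JVM max heap as a percentage of total RAM, one decimal place.
# (heap_pct is a hypothetical helper, not part of Nagios Log Server.)
heap_pct() {
    awk -v h="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * h / (t * 1024) }'
}

# heap_max_in_bytes from the _nodes/jvm output, "Mem: ... total" from top
heap_pct 8277131264 16181984
```

With the values posted in this thread, the max heap works out to roughly half of the total memory.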

Re: My Nagios Log Server stop logging every days

Posted: Fri Mar 10, 2017 3:23 pm
by mcapra
This definitely stands out:

Code:

[2017-03-07 20:19:39,971][WARN ][common.network           ] failed to resolve local host, fallback to loopback
java.net.UnknownHostException: localhost.localdomain: localhost.localdomain: Name or service not known

Is this host behind a proxy of some sort? Can you also share the output of:

Code:

cat /etc/hosts
cat /etc/resolv.conf

It looks like the local hostname (localhost.localdomain in the log above) isn't resolving at times, which is probably why Logstash isn't able to find the Elasticsearch cluster.
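
A quick way to check this theory is to see whether the name from the exception has an entry in /etc/hosts. A minimal sketch, assuming a POSIX shell with awk; the function name hosts_has_name is made up for illustration and only looks at the hosts file, not at DNS:

```shell
#!/bin/sh
# hosts_has_name FILE NAME
# Return 0 if FILE maps some address to NAME, 1 otherwise.
# (hosts_has_name is a hypothetical helper written for this example.)
hosts_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

name="localhost.localdomain"    # the name from the UnknownHostException
if hosts_has_name /etc/hosts "$name"; then
    echo "$name has an entry in /etc/hosts"
else
    echo "$name has no entry in /etc/hosts"
fi
```

If the name is missing from the hosts file, every lookup falls through to the resolvers listed in /etc/resolv.conf, which is why both files are worth looking at.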

Re: My Nagios Log Server stop logging every days

Posted: Thu Mar 16, 2017 10:10 am
by juanmafer
Hi!

We found the problem!
The server (a VM) had only one core.
We added several cores to the system and the server has been working fine since.
Thanks for your time!
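
For anyone who lands here with the same symptom, the core count is easy to check from the CLI. A minimal sketch, assuming a POSIX shell; the function name check_cores and the threshold of 2 are assumptions for illustration, since the thread only says that one core was not enough and that adding several fixed it:

```shell
#!/bin/sh
# check_cores N
# Print a warning when N is below 2, otherwise an OK line.
# (check_cores and its threshold are hypothetical, not a documented minimum.)
check_cores() {
    if [ "$1" -lt 2 ]; then
        echo "WARNING: only $1 core(s) online"
    else
        echo "OK: $1 cores online"
    fi
}

# nproc is part of GNU coreutils; getconf is the fallback elsewhere.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
check_cores "$cores"
```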

Re: My Nagios Log Server stop logging every days

Posted: Thu Mar 16, 2017 10:40 am
by cdienger
Glad you were able to find the solution! Are we okay closing the thread at this time?