Nagios suddenly stopped sending logs
Posted: Mon Mar 18, 2019 4:16 am
by tcsdi
Our Nagios Log Server suddenly stopped sending logs to our ELK SIEM. When we checked the status, the drop appeared to have started around March 8. I have also attached a screenshot of the history of the logs sent.
How can I determine if my Nagios is functioning properly? It is still sending logs, but far fewer than it used to.
Re: Nagios suddenly stopped sending logs
Posted: Mon Mar 18, 2019 2:52 pm
by npolovenko
Hello @tcsdi,
Please generate and send me a profile from each log server in the cluster. A profile can be generated under Admin > System > System Status or from the command line by running:
/usr/local/nagioslogserver/scripts/profile.sh
The profile can be found at /tmp/system-profile.tar.gz.
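For a quick first look while the profile is being collected, the Elasticsearch cluster health endpoint can be queried directly. The following is only a minimal sketch: it assumes Elasticsearch answers on localhost:9200 on the Log Server node, and `summarize_health` and `fetch_health` are hypothetical helper names, not part of Nagios Log Server.

```python
import json
from urllib.request import urlopen

# Assumption: Elasticsearch listens on localhost:9200 on the Log Server node.
HEALTH_URL = "http://localhost:9200/_cluster/health"


def summarize_health(health):
    """Turn a parsed _cluster/health response into a one-line verdict."""
    status = health.get("status", "unknown")
    unassigned = health.get("unassigned_shards", 0)
    nodes = health.get("number_of_nodes", 0)
    if status == "green":
        verdict = "OK"
    elif status == "yellow":
        verdict = "WARNING"
    else:
        # "red" or anything unexpected is treated as critical.
        verdict = "CRITICAL"
    return f"{verdict}: status={status}, nodes={nodes}, unassigned_shards={unassigned}"


def fetch_health(url=HEALTH_URL, timeout=10):
    """Query the cluster health endpoint and return the decoded JSON."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(summarize_health(fetch_health()))
```

A status other than green, or a non-zero unassigned shard count, would suggest looking at the indices before looking at the outputs.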
Re: Nagios suddenly stopped sending logs
Posted: Tue Mar 19, 2019 3:51 am
by tcsdi
Hi @npolovenko,
Thanks for the reply. Please see the attached logs.

Re: Nagios suddenly stopped sending logs
Posted: Tue Mar 19, 2019 3:54 am
by tcsdi
Hi @npolovenko,
Thanks for the reply. Here is the profile of the server.

Re: Nagios suddenly stopped sending logs
Posted: Tue Mar 19, 2019 4:27 pm
by npolovenko
@tcsdi, thanks! I'm seeing that the Log Server indices have been critical since January. It seems that insufficient RAM was the main issue and caused Elasticsearch to fail:
java.lang.OutOfMemoryError: Java heap space
My recommendation would be to increase the RAM to at least 8 GB on the production system, then delete the critical indices and restore them from a backup:
https://support.nagios.com/kb/article.php?id=90
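To see which indices are in a critical (red) state before deleting anything, the output of the Elasticsearch `_cat/indices?v` endpoint can be filtered. This is a rough sketch under stated assumptions: the text is assumed to come from something like `curl -s 'http://localhost:9200/_cat/indices?v'`, `red_indices` is a hypothetical helper, and the sample rows in the usage section are made up for illustration.

```python
def red_indices(cat_output):
    """Return index names whose health column is 'red' in `_cat/indices?v` output."""
    lines = [ln for ln in cat_output.splitlines() if ln.strip()]
    if not lines:
        return []
    # Locate columns from the header so extra columns do not break parsing.
    header = lines[0].split()
    health_col = header.index("health")
    index_col = header.index("index")
    found = []
    for line in lines[1:]:
        cols = line.split()
        if len(cols) > max(health_col, index_col) and cols[health_col] == "red":
            found.append(cols[index_col])
    return found


if __name__ == "__main__":
    # Made-up sample rows, only to show the expected input shape.
    sample = (
        "health status index pri rep docs.count docs.deleted store.size pri.store.size\n"
        "red open logstash-2019.01.15 5 1\n"
        "green open logstash-2019.03.19 5 1 100 0 1mb 1mb\n"
    )
    for name in red_indices(sample):
        print(name)
```

The resulting list is only a starting point for the delete-and-restore procedure in the article linked above.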
Re: Nagios suddenly stopped sending logs
Posted: Wed Mar 20, 2019 9:13 pm
by tcsdi
Hi @npolovenko,
We have upgraded the RAM to 8 GB, but the server still stops after a few hours. I have attached another profile for you to look at.
Re: Nagios suddenly stopped sending logs
Posted: Thu Mar 21, 2019 3:19 pm
by npolovenko
@tcsdi, I'm not seeing anything out of the ordinary in the profile so far. Can you take a new screenshot of the Data Source graph?
Could you also clarify which filters are shown on your graph? I see that the graph has many colors. Is it an all-inclusive graph for all outputs or just for one particular output?
Re: Nagios suddenly stopped sending logs
Posted: Fri Mar 22, 2019 2:31 am
by tcsdi
Hi @npolovenko, please see the attached log for the list of all sources.
Re: Nagios suddenly stopped sending logs
Posted: Fri Mar 22, 2019 2:13 pm
by npolovenko
@tcsdi, I noticed that you're sending some logs twice, to the same host and port but with two different sourcehost values. For example:
if [type] =~ /(dnslog)/ {
  syslog {
    host => "172.31.108.236"
    port => 1523
    sourcehost => "10.5.115.106"
  }
}
if [type] =~ /(dnslog)/ {
  syslog {
    host => "172.31.108.236"
    port => 1523
    sourcehost => "10.5.115.107"
  }
}
Or:
if [type] =~ /(eventlog)/ {
  syslog {
    host => "172.31.108.236"
    port => 1522
    sourcehost => "10.5.115.106"
    codec => json {
      charset => 'CP1252'
    }
  }
}
if [type] =~ /(eventlog)/ {
  syslog {
    host => "172.31.108.236"
    port => 1522
    sourcehost => "10.5.115.107"
    codec => json {
      charset => 'CP1252'
    }
  }
}
I wonder if that could be causing issues. Are these two sourcehosts clustered? Could you disable the duplicate outputs for a while to see if the performance improves?
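To find every repeated conditional in a longer output configuration, a small script can group the `sourcehost` values by the `if` condition they sit under. This is a rough sketch, not a real Logstash parser: it assumes each condition is on its own line ending in `{`, as in the excerpts above, and `duplicate_conditions` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches lines such as: if [type] =~ /(dnslog)/ {
COND_RE = re.compile(r"^\s*if\s+(.+?)\s*\{\s*$")
# Matches: sourcehost => "10.5.115.106" (with or without spaces around =>)
SOURCEHOST_RE = re.compile(r'sourcehost\s*=>\s*"([^"]+)"')


def duplicate_conditions(config_text):
    """Map each `if` condition seen with more than one sourcehost to those values."""
    hosts_by_cond = defaultdict(list)
    current = None
    for line in config_text.splitlines():
        cond = COND_RE.match(line)
        if cond:
            current = cond.group(1)
            continue
        host = SOURCEHOST_RE.search(line)
        if host and current is not None:
            hosts_by_cond[current].append(host.group(1))
    return {c: h for c, h in hosts_by_cond.items() if len(h) > 1}


if __name__ == "__main__":
    import sys

    # Usage: python find_duplicates.py <path to a saved copy of the output config>
    with open(sys.argv[1], encoding="utf-8") as fh:
        for cond, hosts in duplicate_conditions(fh.read()).items():
            print(f"{cond}: {', '.join(hosts)}")
```

Each condition it prints is a candidate for temporarily disabling one of its two outputs during the test.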
Re: Nagios suddenly stopped sending logs
Posted: Mon Mar 25, 2019 10:15 am
by tcsdi
Hi @npolovenko,
Yes, the two sourcehosts are clustered, but they have been configured that way since the beginning, including when the server was working fine. Is there any other configuration that could cause the issue?