Nagios log server not receiving any logs

Post by **thanigaivel.a** » Mon Jul 09, 2018 10:29 am

It looks like the Nagios Log Server has not received logs from any server since 02 Jun.

Instead, we get the error below:

The instance reports that it's local Logstash is not running. You will not be able to collect logs on this instance until you start Logstash.

At the same time, the service status on the server looks fine:

[root@usa0300lv6332 ~]# service logstash status
Logstash Daemon● logstash.service - LSB: Logstash
Loaded: loaded (/etc/rc.d/init.d/logstash; bad; vendor preset: disabled)
Active: active (running) since Mon 2018-07-09 11:12:53 EDT; 2s ago
Docs: man:systemd-sysv-generator(8)
Process: 11231 ExecStop=/etc/rc.d/init.d/logstash stop (code=exited, status=0/SUCCESS)
Process: 11293 ExecStart=/etc/rc.d/init.d/logstash start (code=exited, status=0/SUCCESS)
CGroup: /system.slice/logstash.service
├─11308 runuser -s /bin/sh -c exec /usr/local/nagioslogserver/logstash/bin/logstash agent -f /usr/local/nagioslogserver/logstash/etc/conf.d -l /var/log/logstash/logstash.log -w 4 root
└─11310 java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Djava.io.tmpdir=/usr/local/nagioslogse...

Jul 09 11:12:53 usa0300lv6332 systemd[1]: Starting LSB: Logstash...
Jul 09 11:12:53 usa0300lv6332 logstash[11293]: PHP Warning: Module 'SourceGuardian' already loaded in Unknown on line 0
Jul 09 11:12:53 usa0300lv6332 runuser[11308]: pam_unix(runuser:session): session opened for user root by (uid=0)
Jul 09 11:12:53 usa0300lv6332 logstash[11293]: Starting Logstash Daemon: [ OK ]
Jul 09 11:12:53 usa0300lv6332 systemd[1]: Started LSB: Logstash.

[root@usa0300lv6332 ~]# service elasticsearch status
● elasticsearch.service - LSB: This service manages the elasticsearch daemon
Loaded: loaded (/etc/rc.d/init.d/elasticsearch; bad; vendor preset: disabled)
Active: active (running) since Mon 2018-07-09 10:24:26 EDT; 1h 3min ago
Docs: man:systemd-sysv-generator(8)
Process: 1434 ExecStart=/etc/rc.d/init.d/elasticsearch start (code=exited, status=0/SUCCESS)
CGroup: /system.slice/elasticsearch.service
└─1768 java -Xms6926m -Xmx6926m -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicit...

Jul 09 10:24:26 usa0300lv6332 systemd[1]: Starting LSB: This service manages the elasticsearch daemon...
Jul 09 10:24:26 usa0300lv6332 elasticsearch[1434]: PHP Warning: Module 'SourceGuardian' already loaded in Unknown on line 0
Jul 09 10:24:26 usa0300lv6332 runuser[1739]: pam_unix(runuser:session): session opened for user nagios by (uid=0)
Jul 09 10:24:26 usa0300lv6332 elasticsearch[1434]: Starting elasticsearch: [ OK ]
Jul 09 10:24:26 usa0300lv6332 systemd[1]: Started LSB: This service manages the elasticsearch daemon.
[root@usa0300lv6332 ~]#

Post by **cdienger** » Mon Jul 09, 2018 1:17 pm

The output shows Logstash has only been running for 2 seconds. It's likely crashing frequently, which is usually due to a bad config. Looking at the input config, the thing that stands out immediately is that an additional syslog input has been added to listen on port 514. Port 514 is a privileged port, so some changes to the OS or Logstash config are required before it can be opened:

https://assets.nagios.com/downloads/nag ... Server.pdf
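To illustrate the privileged-port point: on Linux, ports below 1024 can by default only be bound by root or by a process holding the `cap_net_bind_service` capability. The sketch below classifies the two ports discussed in this thread. The helper name `is_privileged_port` is made up for illustration, and the 1024 boundary is the usual Linux default, not something read from this system.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the port is below 1024,
# the default privileged-port boundary on Linux (an assumption here).
is_privileged_port() {
    [ "$1" -lt 1024 ]
}

for port in 514 5544; do
    if is_privileged_port "$port"; then
        echo "$port: privileged (needs root or cap_net_bind_service)"
    else
        echo "$port: unprivileged"
    fi
done
```

An input on 5544 can be opened by an ordinary user, while one on 514 needs the extra setup described in the linked document.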

Post by **thanigaivel.a** » Mon Jul 09, 2018 3:37 pm

Yes, we opened port 514 so that some of our network devices can send logs. Also, utilization on this server is very high, which is causing it to hang.

How can we fix this issue?

[root@usa0300lv6332 rsyslog.d]# top
top - 16:32:35 up 6:08, 2 users, load average: 5.21, 4.18, 3.46
Tasks: 274 total, 1 running, 273 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.4 us, 2.1 sy, 0.0 ni, 48.8 id, 46.4 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 14185472 total, 162016 free, 8868040 used, 5155416 buff/cache
KiB Swap: 4095996 total, 4083636 free, 12360 used. 4723212 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
61026 nagios 20 0 19.429g 7.419g 144512 S 34.1 54.8 3:28.39 java

[root@usa0300lv6332 rsyslog.d]# telnet 0 514
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
^]
telnet> q
Connection closed.
[root@usa0300lv6332 rsyslog.d]# telnet 0 5544
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
^]
telnet> q
Connection closed.
[root@usa0300lv6332 rsyslog.d]#

Post by **cdienger** » Mon Jul 09, 2018 4:18 pm

/var/log/logstash/logstash.log may contain some clues. If that doesn't help you resolve the problem, please PM me a system profile (Admin > System > System Status > Download System Profile). If it is too large to send, please open it and split it into parts.
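One way to look for clues in that log is to keep only the entries that carry a warning or error level. This is a minimal sketch, assuming the Logstash 2.x style `:level=>:...` fields seen in the logstash.log output posted later in this thread. The helper name `ls_problems` is hypothetical, and the `error` and `fatal` level names are an assumption, since only `warn` appears in the posted output.

```shell
#!/bin/sh
# Hypothetical helper: read Logstash log lines on stdin and keep only
# those tagged with a warn, error, or fatal level.
ls_problems() {
    grep -E ':level=>:(warn|error|fatal)'
}

# Two sample lines copied from the logstash.log output in this thread;
# only the first one has a :level field, so only it is printed.
printf '%s\n' \
  '{:timestamp=>"2018-07-10T15:39:42.784000-0400", :message=>"SIGTERM received. Shutting down the agent.", :level=>:warn}' \
  '{:timestamp=>"2018-07-10T15:40:42.077000-0400", :message=>"Pipeline main started"}' \
  | ls_problems
```

On the server itself, something like `zcat -f /var/log/logstash/logstash.log* | ls_problems` should cover both the current file and the rotated .gz files, since `zcat -f` passes uncompressed input through unchanged.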

Post by **thanigaivel.a** » Tue Jul 10, 2018 7:16 am

I have already uploaded the profile.

The current Logstash log file is empty (see the listing below), so I am sending the last two available logs. Note that no Logstash log was generated between 28 Jun and 09 Jul.

logstash.log-20180629.gz
logstash.log-20180710.gz

[root@usa0300lv6332 logstash]# pwd
/var/log/logstash
[root@usa0300lv6332 logstash]# ls -rlth
total 12M
-rw-r----- 1 nagios nagios 3.8M May 31 03:19 logstash.log-20180531.gz
-rw-r----- 1 nagios nagios 4.1M Jun 1 03:12 logstash.log-20180601.gz
-rw-r----- 1 nagios nagios 2.3M Jun 2 03:26 logstash.log-20180602.gz
-rw-r----- 1 nagios nagios 984K Jun 3 03:42 logstash.log-20180603.gz
-rw-r----- 1 nagios nagios 337K Jun 3 17:11 logstash.log-20180604.gz
-rw-r----- 1 nagios nagios 3.1K Jun 28 08:10 logstash.log-20180629.gz
-rw-r----- 1 nagios nagios 3.8K Jul 9 16:28 logstash.log-20180710.gz
-rw-r----- 1 nagios nagios 0 Jul 10 03:23 logstash.log

Post by **thanigaivel.a** » Tue Jul 10, 2018 7:19 am

Also, here is the Logstash config file (/etc/sysconfig/logstash):

Code:

[root@usa0300lv6332 logstash]# cat /etc/sysconfig/logstash
###############################
# Default settings for logstash
###############################

# Override Java location
#JAVACMD=/usr/bin/java

# Set a home directory
APP_DIR=/usr/local/nagioslogserver
LS_HOME="$APP_DIR/logstash"

# set ES_CLUSTER
ES_CLUSTER=$(cat $APP_DIR/var/cluster_uuid)

# Arguments to pass to java
#LS_HEAP_SIZE="256m"
LS_JAVA_OPTS="-Djava.io.tmpdir=$APP_DIR/tmp"

# Logstash filter worker threads
#LS_WORKER_THREADS=1

# pidfiles aren't used for upstart; this is for sysv users.
#LS_PIDFILE=/var/run/logstash.pid

# user id to be invoked as; for upstart: edit /etc/init/logstash.conf
#LS_USER=nagios
LS_USER=root
#LS_GROUP=root
LS_GROUP=nagios

# logstash logging
#LS_LOG_FILE=/var/log/logstash/logstash.log
#LS_USE_GC_LOGGING="true"

# logstash configuration directory
LS_CONF_DIR="$LS_HOME/etc/conf.d"

# Open file limit; cannot be overridden in upstart
#LS_OPEN_FILES=2048

# Nice level
#LS_NICE=0

# Increase Filter workers to 4 threads
LS_OPTS=" -w 4"

if [ "x$1" == "xstart" -o "x$1" == "xrestart" -o "x$1" == "xreload" ];then
        GET_LOGSTASH_CONFIG_MESSAGE=$( php /usr/local/nagioslogserver/scripts/get_logstash_config.php )
        GET_LOGSTASH_CONFIG_RETURN=$?
        if [ "$GET_LOGSTASH_CONFIG_RETURN" != "0" ]; then
                echo $GET_LOGSTASH_CONFIG_MESSAGE
                exit 1
        fi
fi

setcap 'cap_net_bind_service=+ep' $(readlink -f $(which java))
[root@usa0300lv6332 logstash]#

Post by **cdienger** » Tue Jul 10, 2018 2:20 pm

The "setcap 'cap_net_bind_service=+ep' $(readlink -f $(which java))" line in /etc/sysconfig/logstash is not standard. Try deleting it and then restarting logstash:

service logstash restart

Post by **thanigaivel.a** » Tue Jul 10, 2018 2:46 pm

I removed the line and restarted the logstash service; however, the log server is still not receiving logs from any server.

Log after restart

[root@usa0300lv6332 logstash]# tail -f logstash.log
{:timestamp=>"2018-07-10T15:39:42.784000-0400", :message=>"SIGTERM received. Shutting down the agent.", :level=>:warn}
{:timestamp=>"2018-07-10T15:39:42.785000-0400", :message=>"stopping pipeline", :id=>"main"}
{:timestamp=>"2018-07-10T15:40:42.077000-0400", :message=>"Pipeline main started"}

Post by **scottwilkerson** » Tue Jul 10, 2018 3:34 pm

Actually, we apologize. I think Craig wasn't aware that this line is required to receive logs on privileged ports, per this doc:
https://assets.nagios.com/downloads/nag ... Server.pdf

Please add this back to /etc/sysconfig/logstash:

Code:

setcap 'cap_net_bind_service=+ep' $(readlink -f $(which java))

Then restart logstash:

Code:

service logstash restart

Now let's test the config:

Code:

/usr/local/nagioslogserver/logstash/bin/logstash --configtest -f /usr/local/nagioslogserver/logstash/etc/conf.d/
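The two steps can also be chained so that a restart only happens when the config test passes. This is a rough sketch, not an official Nagios script: the function names `check_then_restart` and `restart_logstash` and the `LOGSTASH_BIN` and `CONF_DIR` variables are invented for illustration, and it assumes `--configtest` exits with a non-zero status when the config is invalid.

```shell
#!/bin/sh
# Paths taken from the commands in this thread; override via environment.
LOGSTASH_BIN=${LOGSTASH_BIN:-/usr/local/nagioslogserver/logstash/bin/logstash}
CONF_DIR=${CONF_DIR:-/usr/local/nagioslogserver/logstash/etc/conf.d/}

# Hypothetical wrapper around the restart command used in this thread.
restart_logstash() {
    service logstash restart
}

# Hypothetical helper: restart only if the config test succeeds
# (assumes --configtest returns non-zero on a bad config).
check_then_restart() {
    if "$LOGSTASH_BIN" --configtest -f "$CONF_DIR"; then
        restart_logstash
    else
        echo "config test failed; not restarting logstash" >&2
        return 1
    fi
}

# Usage (as root on the log server):
#   check_then_restart
```

Running the test first avoids replacing a working Logstash process with one that exits immediately on a broken config.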

Post by **thanigaivel.a** » Wed Jul 11, 2018 12:29 pm

I added the line back to the config file and restarted the logstash service. Please find the output below.

[root@usa0300lv6332 ~]#
[root@usa0300lv6332 ~]# service logstash restart
Restarting logstash (via systemctl): [ OK ]
[root@usa0300lv6332 ~]# cd /var/log/logstash/
[root@usa0300lv6332 logstash]# ls -rlth
total 7.7M
-rw-r----- 1 nagios nagios 4.1M Jun 1 03:12 logstash.log-20180601.gz
-rw-r----- 1 nagios nagios 2.3M Jun 2 03:26 logstash.log-20180602.gz
-rw-r----- 1 nagios nagios 984K Jun 3 03:42 logstash.log-20180603.gz
-rw-r----- 1 nagios nagios 337K Jun 3 17:11 logstash.log-20180604.gz
-rw-r----- 1 nagios nagios 3.1K Jun 28 08:10 logstash.log-20180629.gz
-rw-r----- 1 nagios nagios 3.8K Jul 9 16:28 logstash.log-20180710.gz
-rw-r----- 1 nagios nagios 2.7K Jul 10 16:05 logstash.log-20180711.gz
-rw-r----- 1 nagios nagios 211 Jul 11 13:26 logstash.log
[root@usa0300lv6332 logstash]# less logstash.log
[root@usa0300lv6332 logstash]#
[root@usa0300lv6332 logstash]#
[root@usa0300lv6332 logstash]# /usr/local/nagioslogserver/logstash/bin/logstash --configtest -f /usr/local/nagioslogserver/logstash/etc/conf.d/
Configuration OK
[root@usa0300lv6332 logstash]# cat logstash.log
{:timestamp=>"2018-07-11T13:26:14.690000-0400", :message=>"SIGTERM received. Shutting down the agent.", :level=>:warn}
{:timestamp=>"2018-07-11T13:26:14.742000-0400", :message=>"stopping pipeline", :id=>"main"}
{:timestamp=>"2018-07-11T13:27:09.656000-0400", :message=>"Pipeline main started"}
[root@usa0300lv6332 logstash]#

Code:

[root@usa0300lv6332 logstash]# cat /etc/sysconfig/logstash
###############################
# Default settings for logstash
###############################

# Override Java location
#JAVACMD=/usr/bin/java

# Set a home directory
APP_DIR=/usr/local/nagioslogserver
LS_HOME="$APP_DIR/logstash"

# set ES_CLUSTER
ES_CLUSTER=$(cat $APP_DIR/var/cluster_uuid)

# Arguments to pass to java
#LS_HEAP_SIZE="256m"
LS_JAVA_OPTS="-Djava.io.tmpdir=$APP_DIR/tmp"

# Logstash filter worker threads
#LS_WORKER_THREADS=1

# pidfiles aren't used for upstart; this is for sysv users.
#LS_PIDFILE=/var/run/logstash.pid

# user id to be invoked as; for upstart: edit /etc/init/logstash.conf
#LS_USER=nagios
LS_USER=root
#LS_GROUP=root
LS_GROUP=nagios

# logstash logging
#LS_LOG_FILE=/var/log/logstash/logstash.log
#LS_USE_GC_LOGGING="true"

# logstash configuration directory
LS_CONF_DIR="$LS_HOME/etc/conf.d"

# Open file limit; cannot be overridden in upstart
#LS_OPEN_FILES=2048

# Nice level
#LS_NICE=0

# Increase Filter workers to 4 threads
LS_OPTS=" -w 4"

if [ "x$1" == "xstart" -o "x$1" == "xrestart" -o "x$1" == "xreload" ];then
        GET_LOGSTASH_CONFIG_MESSAGE=$( php /usr/local/nagioslogserver/scripts/get_logstash_config.php )
        GET_LOGSTASH_CONFIG_RETURN=$?
        if [ "$GET_LOGSTASH_CONFIG_RETURN" != "0" ]; then
                echo $GET_LOGSTASH_CONFIG_MESSAGE
                exit 1
        fi
fi
setcap 'cap_net_bind_service=+ep' $(readlink -f $(which java))
[root@usa0300lv6332 logstash]#
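Since the `setcap` line is back in the file, one possible follow-up check is whether the java binary actually carries the capability. The sketch below is only an illustration: `has_bind_cap` is a made-up helper name, and it matches on the capability name alone because the exact layout of `getcap` output differs between libcap versions.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the getcap output passed as $1
# mentions cap_net_bind_service.
has_bind_cap() {
    case "$1" in
        *cap_net_bind_service*) return 0 ;;
        *) return 1 ;;
    esac
}

# Inspect the real java binary only when both tools are installed.
if command -v getcap >/dev/null 2>&1 && command -v java >/dev/null 2>&1; then
    java_bin=$(readlink -f "$(command -v java)")
    if has_bind_cap "$(getcap "$java_bin")"; then
        echo "$java_bin can bind privileged ports"
    else
        echo "$java_bin lacks cap_net_bind_service"
    fi
fi
```

The path is resolved the same way as in the `setcap` line above, so the check looks at the same file that line modifies.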

Nagios Support Forum
