Nagios Log Server/Logstash Collect stops processing logs

vconnected
Posts: 7
Joined: Tue May 19, 2015 8:18 am

Nagios Log Server/Logstash Collect stops processing logs

Post by vconnected »

Our Nagios Log Server stops processing log files every now and then, meaning it no longer stores any incoming logs.
When we restart the Logstash Collector via the GUI it works again.
See attached screenshot.

Since we have an active service contract, can you help us troubleshoot this and fix the issue?
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Nagios Log Server/Logstash Collect stops processing logs

Post by pbroste »

Hello @vconnected

Thanks for reaching out so we can help dial in this issue with you. Please verify that the date, time, and timezone have not drifted out of sync, and then send over the System Profile so we can see what is going on.

Code: Select all

date
ls -l /etc/localtime
php -r 'echo date("D M j G:i:s T Y")."\n";'
grep "date.timezone =" /etc/php.ini
grep date.timezone /etc/php.ini
To generate the System Profile, run:

Code: Select all

/usr/local/nagioslogserver/scripts/profile.sh
This will create /tmp/system-profile.tar.gz.

Note that this file can be very large and may not be able to be uploaded through the ticketing system. You can split the file into smaller files with the split command on the NLS (or another Linux machine) command line:

Code: Select all

split -b 45000000 /tmp/system-profile.tar.gz system-profile- -d
The above command will split system-profile.tar.gz into 45MB segments and save them to files with the naming convention system-profile-nn. Please send each segment in a separate Private Message (PM).
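For whoever receives the segments, a minimal sketch of joining them back together could look like the following. The function name reassemble is hypothetical, and it assumes all segments sit in the current directory with the numeric suffixes that split -d produces.

```shell
# Join numbered segments (prefix-00, prefix-01, ...) back into one file.
# $1 = segment prefix, e.g. system-profile-
# $2 = output file,    e.g. system-profile.tar.gz
reassemble() {
    # The glob expands in sorted order, which matches the order split -d wrote.
    cat "$1"[0-9]* > "$2"
}

# Example, run in the directory that holds the segments:
#   reassemble system-profile- system-profile.tar.gz
#   gzip -t system-profile.tar.gz && echo "archive OK"
```

Running gzip -t afterwards is a cheap way to confirm that no segment went missing in transit.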

I'd also like to get a copy of the current settings index. This can be gathered by running:

Code: Select all

curl -XPOST 'http://localhost:9200/nagioslogserver/_export?path=/tmp/nagioslogserver.tar.gz'
The file it creates and that we'd like to see is /tmp/nagioslogserver.tar.gz.

Please send the following:
  • /tmp/nagioslogserver.tar.gz
  • /tmp/system-profile.tar.gz or the splits depending on size
Thanks,
Perry
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Nagios Log Server/Logstash Collect stops processing logs

Post by pbroste »

Hello @vconnected

Thanks for following up and sending over the info in the Private Message.

Please set the system timezone to match the Apache/PHP timezone. It appears that you are set to US Central (Chicago); if it is supposed to be something else, please make sure that all of the settings match.

Code: Select all

ln -sf /usr/share/zoneinfo/US/Central /etc/localtime
To verify:

Code: Select all

ls -l /etc/localtime
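As a rough sketch of comparing the two settings in one step, the helpers below print the zone behind /etc/localtime and the zone PHP reports. The function names system_zone and php_zone are hypothetical. The sketch assumes /etc/localtime is a symlink into a zoneinfo directory, and that the PHP CLI reads the same php.ini as Apache, which is not guaranteed on every distro.

```shell
# Print the zone name a localtime symlink resolves to, e.g. Europe/Amsterdam.
# $1 = path to the symlink (defaults to /etc/localtime)
# If the path is a plain file rather than a symlink, the path itself is printed.
system_zone() {
    local target
    target=$(readlink -f "${1:-/etc/localtime}") || return 1
    # Strip everything up to and including the first ".../zoneinfo/".
    printf '%s\n' "${target#*/zoneinfo/}"
}

# Print the zone PHP is configured with (empty if date.timezone is unset).
php_zone() {
    php -r 'echo ini_get("date.timezone"), "\n";'
}

# Usage:
#   [ "$(system_zone)" = "$(php_zone)" ] && echo "match" || echo "MISMATCH"
```

Alias zones can resolve to a different name than the one you linked (for example US/Central may resolve to America/Chicago), so treat a mismatch as a prompt to look closer rather than as proof of a problem.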
Restart the services by bouncing:

Code: Select all

systemctl restart elasticsearch logstash httpd   #apache2 depending on distro
We do not have a large file transfer method. Others have used web transfer services such as WeTransfer. You can also reduce the size of the compressed archive by removing the extra Logstash logs.

Code: Select all

cd /tmp/

Code: Select all

gzip -d system-profile.tar.gz -c | tar --delete --wildcards 'system-profile/logstashlogs/logstash.log-*.gz' | gzip - > /tmp/tmp.$$.tar.gz && mv /tmp/tmp.$$.tar.gz system-profile.tar.gz
Thanks,
Perry
vconnected
Posts: 7
Joined: Tue May 19, 2015 8:18 am

Re: Nagios Log Server/Logstash Collect stops processing logs

Post by vconnected »

Everything is now set to Europe/Amsterdam.
Attached the much smaller system-profile.tar.gz to a PM.

Hope it can tell the root cause!
vconnected

Re: Nagios Log Server/Logstash Collect stops processing logs

Post by vconnected »

Are you sure the ln command is correct? The web server no longer comes online at all.
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Nagios Log Server/Logstash Collect stops processing logs

Post by pbroste »

Hello @vconnected

Thanks for following up. I checked the inbox for the profile, but the attachment did not come through. It appears that we will need to split it into smaller pieces.

The ln command creates the symbolic link that sets the system timezone. The -s option makes a symbolic (soft) link and -f forces replacement of the existing link, which is needed because /etc/localtime is already a symlink.

From the screenshot it appears that a 'java' process is taking over the system. To narrow that down, look at the output of the 'top' command and/or list the process:

Code: Select all

ps aux | grep -Ei 'java'
or

Code: Select all

ps aux | grep -Ei 'elasticsearch'
Elasticsearch runs as a Java process, so you will see it in the output of both commands.

Thanks,
Perry
ScottMc
Posts: 28
Joined: Mon Aug 06, 2018 9:35 am

Re: Nagios Log Server/Logstash Collect stops processing logs

Post by ScottMc »

I have this same problem. It occurs at seemingly random times and will cause Logstash to stop ingesting data, but the service is still running. I have my 10 nodes sitting behind an HAProxy and sometimes I'll see over half just failing a health check. This started happening out of nowhere. I ended up creating a cron job that runs every 5 minutes that tests to see if Logstash is responding on the ingestion ports and if not, restarts the service, but I would love for this to work like it's supposed to.
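For anyone wanting a similar stopgap, a minimal sketch of such a watchdog might look like this. It is not the exact script used above: the port 5544, the HOST and PORTS variables, the --run switch, and the function names are all assumptions to adapt to your own Logstash inputs.

```shell
#!/usr/bin/env bash
# Watchdog sketch: restart Logstash when an ingestion port stops accepting
# TCP connections. HOST and PORTS are assumed values; set them to your inputs.
HOST="${HOST:-127.0.0.1}"
PORTS="${PORTS:-5544}"   # space-separated list; 5544 is only an example

# Return 0 if a TCP connection to host ($1) and port ($2) opens within 3 seconds.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Probe every port; on the first failure, log it and restart the service.
check_and_restart() {
    local port
    for port in $PORTS; do
        if ! port_open "$HOST" "$port"; then
            logger -t logstash-watchdog "port $port not responding, restarting logstash"
            systemctl restart logstash
            return 1
        fi
    done
    return 0
}

# Only act when called with --run, so the functions can be sourced safely.
if [ "${1:-}" = "--run" ]; then
    check_and_restart
fi
```

Saved under a path of your choosing and invoked with --run from a */5 cron entry, this gives the five-minute check described above. It only tests that the port accepts a connection, so a Logstash instance that accepts connections but no longer forwards events would go unnoticed.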
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Nagios Log Server/Logstash Collect stops processing logs

Post by pbroste »

Hello @ScottMc

Please provide a profile from the system so we can take a closer look at what is going on. It can be gathered under Admin > System > System Status > Download System Profile or from the command line with:

Code: Select all

/usr/local/nagioslogserver/scripts/profile.sh
This will create /tmp/system-profile.tar.gz.

You can slim it down by removing the Logstash archives with the following:

Code: Select all

cd /tmp && gzip -d system-profile.tar.gz -c | tar --delete --wildcards 'system-profile/logstashlogs/log*.*.gz' | gzip - > /tmp/tmp.$$.tar.gz && mv /tmp/tmp.$$.tar.gz system-profile.tar.gz
Note: if the file is still large, it can be split into smaller chunks with:

Code: Select all

split -b 45000000 /tmp/system-profile.tar.gz system-profile- -d
The above command will split system-profile.tar.gz into 45MB segments and save them to files with the naming convention system-profile-nn. Please send each segment via Private Message.

Thanks,
Perry
vconnected
Posts: 7
Joined: Tue May 19, 2015 8:18 am

Re: Nagios Log Server/Logstash Collect stops processing logs

Post by vconnected »

I ended up redeploying the Nagios Log Server OVA once again. (I have always used the VMware OVA from the Nagios download website.)
That helped.
This time I didn't update the NIC from E1000 to VMXNET3; I'm not sure whether that was the root cause of the issues, though.
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Nagios Log Server/Logstash Collect stops processing logs

Post by pbroste »

Thanks @vconnected, sounds like you were able to figure out a workaround. Let us know if you hit further bumps.

Perry