Nagios Log Server/Logstash Collect stops processing logs
Our Nagios Log Server stops processing log files every now and then, meaning it no longer stores any incoming logs.
When we restart the Logstash Collector via the GUI it works again.
See attached screenshot.
Since we have an active service contract, can you help us troubleshoot this and fix the issue?
Re: Nagios Log Server/Logstash Collect stops processing logs
Hello @vconnected
Thanks for reaching out so we can help dial in this issue with you. First, please verify that the date, time, and timezone have not drifted, and then send over the System Profile so we can see what is going on.
To check the date/time/timezone settings:
Code: Select all
date
ls -l /etc/localtime
php -r 'echo date("D M j G:i:s T Y")."\n";'
grep "date.timezone =" /etc/php.ini
grep date.timezone /etc/php.ini
To generate the System Profile, run:
Code: Select all
/usr/local/nagioslogserver/scripts/profile.sh
This will create /tmp/system-profile.tar.gz.
Note that this file can be very large and may not be able to be uploaded through the ticketing system. You can split the file into smaller files with the split command on the NLS (or other Linux machine) command line:
Code: Select all
split -b 45000000 /tmp/system-profile.tar.gz system-profile- -d
The above command will split system-profile.tar.gz into 45 MB segments and save them to files with the naming convention system-profile-nn. Please send each split in a separate Private Message (PM).
I'd also like to get a copy of the current settings index. This can be gathered by running:
Code: Select all
curl -XPOST http://localhost:9200/nagioslogserver/_export?path=/tmp/nagioslogserver.tar.gz
The file it creates and that we'd like to see is /tmp/nagioslogserver.tar.gz.
Please send the following:
- /tmp/nagioslogserver.tar.gz
- /tmp/system-profile.tar.gz or the splits depending on size
Perry
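As a rough illustration of the split step above, the sketch below splits a throwaway file, rejoins the pieces with cat, and checks the result with cmp. The file names (demo-profile.tar.gz, rejoined.tar.gz) are hypothetical stand-ins for the real archive, the piece size is shrunk so the demo runs quickly, and GNU coreutils is assumed.

```shell
#!/usr/bin/env bash
# Sketch only: split an archive into numbered pieces, rejoin them, and
# confirm the rejoined copy is byte-identical to the original.
# All file names here are hypothetical stand-ins.
set -euo pipefail

workdir="$(mktemp -d)"
cd "$workdir"

# Stand-in for /tmp/system-profile.tar.gz (100000 random bytes).
head -c 100000 /dev/urandom > demo-profile.tar.gz

# Same options as the split command above (-d gives numeric suffixes),
# with a much smaller piece size for the demo.
split -d -b 45000 demo-profile.tar.gz demo-profile-

# The shell expands the glob in sorted order, so the pieces -00, -01, -02
# are concatenated in the right sequence.
cat demo-profile-[0-9][0-9] > rejoined.tar.gz

# cmp exits non-zero if the two files differ.
cmp demo-profile.tar.gz rejoined.tar.gz && echo "rejoined copy matches"
```

On the receiving side the same cat step would apply to the real pieces, for example cat system-profile-[0-9][0-9] > system-profile.tar.gz.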
Re: Nagios Log Server/Logstash Collect stops processing logs
Hello @vconnected
Thanks for following up and sending over the info in the Private Message.
Please go ahead and set the system timezone to match Apache/PHP. It appears that you are set for US Central (Chicago); if it is supposed to be something else, please make sure that all of the settings match.
Code: Select all
ln -sf /usr/share/zoneinfo/US/Central /etc/localtime
To verify:
Code: Select all
ls -l /etc/localtime
Restart the services by bouncing:
Code: Select all
systemctl restart elasticsearch logstash httpd #apache2 depending on distro
We do not have a large file transfer method. I know that others have used various web transfer services such as 'wetransfer'. You can also reduce the size of the compressed archive by removing the extra Logstash index logs:
Code: Select all
cd /tmp/
gzip -d system-profile.tar.gz -c | tar --delete --wildcards system-profile/logstashlogs/logstash.log-*.gz | gzip - > /tmp/tmp.$$.tar.gz && mv /tmp/tmp.$$.tar.gz system-profile.tar.gz
Thanks,
Perry
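To make the "all match" check a little more concrete, here is a minimal sketch that prints the zone name behind /etc/localtime next to the zone PHP reports. zone_from_path is a hypothetical helper introduced only for illustration, and the sketch assumes /etc/localtime is a symlink into a zoneinfo directory, which is what the ls -l check is meant to show.

```shell
#!/usr/bin/env bash
# Sketch only: show the system zone and the PHP zone side by side.
# zone_from_path is a hypothetical helper, not part of Nagios Log Server.

# Strip everything up to and including "zoneinfo/" from a path,
# e.g. /usr/share/zoneinfo/US/Central -> US/Central.
zone_from_path() {
    local path="$1"
    printf '%s\n' "${path##*zoneinfo/}"
}

# readlink prints the link target without resolving it any further,
# so the zone is reported under the name it was linked with.
if [ -L /etc/localtime ]; then
    echo "system zone: $(zone_from_path "$(readlink /etc/localtime)")"
fi

# Only query PHP when the php binary is actually installed.
if command -v php >/dev/null 2>&1; then
    echo "php zone:    $(php -r 'echo date_default_timezone_get();')"
fi
```

If the two lines name different zones, that is the mismatch to correct before restarting the services.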
Re: Nagios Log Server/Logstash Collect stops processing logs
Everything is now set to Europe/Amsterdam.
Attached the much smaller system-profile.tar.gz to a PM.
Hope it can tell the root cause!
Re: Nagios Log Server/Logstash Collect stops processing logs
Are you sure the ln command is correct? The web server now doesn't come online at all.
Re: Nagios Log Server/Logstash Collect stops processing logs
Hello @vconnected
Thanks for following up. I checked the inbox for the profile and the attachment did not make the trip. It appears that we will need to split it a bit more.
The ln command sets up a symbolic link for the timezone. The -s option creates a soft (symbolic) link, and -f forces it, replacing any existing link. We use -f because /etc/localtime is already a soft link.
From the screenshot it appears that a 'java' process is taking over, so we will want to dial that in by looking at the 'top' command and/or the process list:
Code: Select all
ps -aux | grep -Ei 'java'
or
Code: Select all
ps -aux | grep -Ei 'elasticsearch'
Elasticsearch runs as a java process, so you will see it in the output of both commands.
Thanks,
Perry
Re: Nagios Log Server/Logstash Collect stops processing logs
I have this same problem. It occurs at seemingly random times and will cause Logstash to stop ingesting data, but the service is still running. I have my 10 nodes sitting behind an HAProxy and sometimes I'll see over half just failing a health check. This started happening out of nowhere. I ended up creating a cron job that runs every 5 minutes that tests to see if Logstash is responding on the ingestion ports and if not, restarts the service, but I would love for this to work like it's supposed to.
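A minimal sketch of the kind of cron-driven check described above, under stated assumptions: HOST and PORT are placeholders (5544 is only an assumed example; use the port of an input you actually run), the helper names port_open and check_and_restart are hypothetical, and the script path in the crontab comment is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: probe a TCP port and restart Logstash when it does not answer.
# HOST and PORT are assumptions; override them via the environment.
# Hypothetical crontab entry (every 5 minutes):
#   */5 * * * * /usr/local/bin/logstash-watchdog.sh --run

HOST="${HOST:-127.0.0.1}"
PORT="${PORT:-5544}"

# Return 0 if a TCP connection to host:port succeeds within 5 seconds.
# Uses bash's /dev/tcp redirection plus coreutils `timeout`.
port_open() {
    local host="$1" port="$2"
    timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Restart only when the probe fails, and leave a trace in syslog.
check_and_restart() {
    if ! port_open "$HOST" "$PORT"; then
        logger -t logstash-watchdog "port ${PORT} not answering, restarting logstash"
        systemctl restart logstash
    fi
}

# Nothing is restarted unless the script is called with --run.
if [ "${1:-}" = "--run" ]; then
    check_and_restart
fi
```

A connect test only shows that the port accepts connections, not that events are being indexed, so it is a workaround rather than a fix for the underlying stall.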
Re: Nagios Log Server/Logstash Collect stops processing logs
Hello @ScottMc
Please provide a profile from the system so we can take a closer look at what is going on. It can be gathered under Admin > System > System Status > Download System Profile or from the command line with:
Code: Select all
/usr/local/nagioslogserver/scripts/profile.sh
This will create /tmp/system-profile.tar.gz.
You can slim it down by removing the Logstash archives with the following:
Code: Select all
cd /tmp && gzip -d system-profile.tar.gz -c | tar --delete --wildcards system-profile/logstashlogs/log*.*.gz | gzip - > /tmp/tmp.$$.tar.gz && mv /tmp/tmp.$$.tar.gz system-profile.tar.gz
Note: if the file is still large, it can be split into smaller chunks with:
Code: Select all
split -b 45000000 /tmp/system-profile.tar.gz system-profile- -d
The above command will split system-profile.tar.gz into 45 MB segments and save them to files with the naming convention system-profile-nn. Please send each split via Private Message.
Thanks,
Perry
Re: Nagios Log Server/Logstash Collect stops processing logs
I ended up redeploying the Nagios Log Server OVA once again. (I have always used the VMware OVA from the Nagios download website.)
That helped.
This time I didn't change the NIC from E1000 to VMXNET3. I'm not sure whether that was the root cause of the issues, though.
Re: Nagios Log Server/Logstash Collect stops processing logs
Thanks @vconnected, it sounds like you were able to figure out a workaround. Let us know if you hit further bumps.
Perry