Trying to figure out why logstash changed to active (exited)
Posted: Tue Nov 12, 2019 1:56 pm
by rferebee
Good morning,
The logstash service on one of my Log Server nodes randomly changed status from active (running) to active (exited) this morning. Looking at the logstash log file, I don't see any event that explains why this occurred. We had an issue on 11/9 that caused the snapshot for that evening to stall out and never complete, but I got that resolved this morning and everything seemed fine after. Then at about 10:15 AM the logstash service status flipped.
Can I send someone the log files to take a look at? I would love to know why this happened.
Thank you.
Re: Trying to figure out why logstash changed to active (exi
Posted: Tue Nov 12, 2019 4:14 pm
by mbellerue
Certainly, go ahead and send the system profiles to me, and I will get them to the team for review.
Re: Trying to figure out why logstash changed to active (exi
Posted: Thu Nov 14, 2019 10:32 am
by rferebee
This same behavior occurred this morning. Any ideas as to what might be causing it?
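Since the flip keeps recurring, one way to pin down exactly when it happens is to record the unit state on a schedule. The following is only a rough sketch, assuming the unit is named logstash and that systemctl is available on the node; the function names are made up for illustration. It reads the ActiveState and SubState properties, which together form the "active (exited)" string that systemctl status displays.

```python
import subprocess
import time


def parse_unit_state(show_output):
    """Turn 'Key=Value' lines from `systemctl show` into a dict."""
    state = {}
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            state[key.strip()] = value.strip()
    return state


def describe(state):
    """Format the state the way `systemctl status` shows it, e.g. 'active (exited)'."""
    return "{} ({})".format(
        state.get("ActiveState", "unknown"), state.get("SubState", "unknown")
    )


def query_unit(unit="logstash"):
    """Ask systemd for the two state properties of the given unit (name assumed)."""
    result = subprocess.run(
        ["systemctl", "show", unit, "--property=ActiveState,SubState"],
        capture_output=True,
        text=True,
    )
    return parse_unit_state(result.stdout)


if __name__ == "__main__":
    try:
        # One timestamped line per run; append it to a file from a scheduler.
        print(time.strftime("%Y-%m-%d %H:%M:%S"), describe(query_unit()))
    except OSError:
        print("systemctl is not available on this host")
```

Run from a scheduler every minute or so, the first line that reads "active (exited)" gives a timestamp to compare against the logstash log and dmesg output.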
Re: Trying to figure out why logstash changed to active (exi
Posted: Thu Nov 14, 2019 3:11 pm
by mbellerue
Nothing is jumping out at me immediately. On the server that should be primary, could you run dmesg and send me the output? I'm wondering if something is getting logged there that could give us a clue.
Re: Trying to figure out why logstash changed to active (exi
Posted: Thu Nov 14, 2019 4:35 pm
by rferebee
I had to PM you because the output was too many characters.
Re: Trying to figure out why logstash changed to active (exi
Posted: Thu Nov 14, 2019 5:05 pm
by mbellerue
There are a lot of messages for CIFS in there. Are you mounting a file system via Samba? If so, what's it for, and does the other Log Server instance have the same mount? I'm not sure that's the problem, but it definitely stands out.
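For anyone wanting to pull just the CIFS messages out of a long dmesg dump, here is a small sketch. It assumes the relevant kernel lines contain the string "CIFS" (matched case-insensitively) and that dmesg can be run by the current user; the helper names are hypothetical.

```python
import subprocess


def cifs_lines(dmesg_text):
    """Return only the lines that mention CIFS, in their original order."""
    return [line for line in dmesg_text.splitlines() if "cifs" in line.lower()]


def read_dmesg():
    """Capture the kernel ring buffer as text by running plain `dmesg`."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    try:
        matches = cifs_lines(read_dmesg())
    except OSError:
        matches = []
        print("dmesg is not available on this host")
    print("{} CIFS-related lines".format(len(matches)))
    for line in matches:
        print(line)
```

Running the same filter on both nodes would show whether the CIFS messages appear only on the node where logstash exits.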
Re: Trying to figure out why logstash changed to active (exi
Posted: Thu Nov 14, 2019 5:19 pm
by rferebee
Yes. We mount our Log Repository via that CIFS share. Each node has the same corresponding mount command in its fstab config file.
Our Log Repository is a dedicated NAS device, so it must be mounted as a network drive would be.
This is the mount command we use:
Code:
# CIFS Mount
//10.128.xxx.xxx/NLSREPCC /nlsrepcc cifs rw,username=********,password=********,uid=996,gid=994,file_mode=0770,dir_mode=0770 0 0
The only differences between the nodes are the UID and GID values.
Re: Trying to figure out why logstash changed to active (exi
Posted: Fri Nov 15, 2019 10:03 am
by rferebee
This happened again this morning. I'm curious to know why it's only happening on one node. It can't be someone overloading the system with a search that's too large; otherwise, the whole environment would be down. It's just the logstash service on one node.
Re: Trying to figure out why logstash changed to active (exi
Posted: Fri Nov 15, 2019 1:24 pm
by mbellerue
Have you changed any settings in logstash recently?
Can you set up a check on the failing Log Server node to monitor the number of TCP connections it has? Ideally, the check would run once per minute.
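As a rough illustration of what such a check could look like, the sketch below counts TCP sockets by state from /proc/net/tcp and /proc/net/tcp6. This is an assumption about how one might gather the numbers on a Linux node, not a built-in Log Server check, and the function names are made up for the example.

```python
import time
from collections import Counter

# Kernel TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per state from the text of /proc/net/tcp (header line skipped)."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def read_tcp_counts(paths=("/proc/net/tcp", "/proc/net/tcp6")):
    """Sum the per-state counts over the IPv4 and IPv6 socket tables."""
    total = Counter()
    for path in paths:
        try:
            with open(path) as handle:
                total += count_tcp_states(handle.read())
        except FileNotFoundError:
            pass  # table not present on this host
    return total


if __name__ == "__main__":
    counts = read_tcp_counts()
    # One timestamped line per run: total sockets, then the per-state breakdown.
    print(time.strftime("%Y-%m-%d %H:%M:%S"), sum(counts.values()), dict(counts))
```

Scheduled once per minute (for example from cron, which is an assumption here) with the output appended to a file, this gives a timeline of connection counts to line up against the moment the logstash status flips.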
Re: Trying to figure out why logstash changed to active (exi
Posted: Fri Nov 15, 2019 1:29 pm
by rferebee
We have not made any changes to logstash.
Would you like me to monitor the TCP connections on Log Server as a whole or on each individual node?