Too many files open error
Posted: Mon Mar 20, 2017 3:01 pm
by CameronWP
Hello:
When I start my Log server, the elasticsearch log is filled with these errors:
[netty.channel.socket.nio.AbstractNioSelector] Failed to accept a connection.
java.io.IOException: Too many open files
I have tried the pool and connection file hotfix, raised the system file limit, and raised the logstash file limit, all to no avail. Is there anything else I can try?
Thanks!
Re: Too many files open error
Posted: Mon Mar 20, 2017 3:06 pm
by mcapra
Can you try raising the MAX_OPEN_FILES setting in /etc/sysconfig/elasticsearch? Try doubling it, restarting the elasticsearch service, and see if that error is still produced. You'll need to do this for each instance if you have multiples.
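For reference, a minimal sketch of that edit (the 65535 starting value and the sed approach are illustrative; the script works on a scratch copy so you can see the effect, then point it at /etc/sysconfig/elasticsearch on the real system):

```shell
# Double MAX_OPEN_FILES in a scratch copy of the sysconfig file.
# On a real node, set cfg=/etc/sysconfig/elasticsearch instead.
cfg=$(mktemp)
echo 'MAX_OPEN_FILES=65535' > "$cfg"   # stand-in for the current value
cur=$(grep '^MAX_OPEN_FILES=' "$cfg" | cut -d= -f2)
sed -i "s/^MAX_OPEN_FILES=.*/MAX_OPEN_FILES=$((cur * 2))/" "$cfg"
grep '^MAX_OPEN_FILES=' "$cfg"   # MAX_OPEN_FILES=131070
# Then restart the service: sudo systemctl restart elasticsearch
```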
Re: Too many files open error
Posted: Tue Mar 21, 2017 7:33 am
by CameronWP
I made that change and restarted, but I am still seeing the same errors:
[2017-03-21 08:29:04,213][WARN ][netty.channel.socket.nio.AbstractNioSelector] Failed to accept a connection.
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
at org.elasticsearch.common.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.elasticsearch.common.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Another error:
[2017-03-21 08:28:55,202][WARN ][indices.cluster ] [330efcd2-34fc-4f7f-9cba-df89a1374eee] [[logstash-2017.01.23][1]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [logstash-2017.01.23][1] failed recovery
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:162)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [logstash-2017.01.23][1] failed to open reader on writer
at org.elasticsearch.index.engine.InternalEngine.createSearcherManager(InternalEngine.java:211)
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:156)
at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:32)
at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1351)
at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1346)
at org.elasticsearch.index.shard.IndexShard.prepareForTranslogRecovery(IndexShard.java:866)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:233)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:112)
Another:
[2017-03-21 08:28:55,208][WARN ][indices.cluster ] [330efcd2-34fc-4f7f-9cba-df89a1374eee] [[logstash-2017.02.12][1]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [logstash-2017.02.12][1] failed recovery
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:162)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [logstash-2017.02.12][1] failed to open reader on writer
at org.elasticsearch.index.engine.InternalEngine.createSearcherManager(InternalEngine.java:211)
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:156)
at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:32)
at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1351)
at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1346)
at org.elasticsearch.index.shard.IndexShard.prepareForTranslogRecovery(IndexShard.java:866)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:233)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:112)
... 3 more
Caused by: java.nio.file.FileSystemException: /NagLogs/data/46670a84-8052-4f7e-8810-e4bbd8dfdacf/nodes/0/indices/logstash-2017.02.12/1/index/_pdg.cfs: Too many open files
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
at java.nio.channels.FileChannel.open(FileChannel.java:287)
at java.nio.channels.FileChannel.open(FileChannel.java:334)
at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:81)
at org.apache.lucene.store.FileSwitchDirectory.openInput(FileSwitchDirectory.java:172)
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:80)
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:80)
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:80)
at org.elasticsearch.index.store.Store$StoreDirectory.openInput(Store.java:733)
at org.apache.lucene.store.CompoundFileDirectory.<init>(CompoundFileDirectory.java:104)
at org.apache.lucene.index.SegmentReader.readFieldInfos(SegmentReader.java:274)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:107)
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:145)
at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:239)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:109)
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:421)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:112)
at org.apache.lucene.search.SearcherManager.<init>(SearcherManager.java:89)
at org.elasticsearch.index.engine.InternalEngine.createSearcherManager(InternalEngine.java:196)
... 10 more
Re: Too many files open error
Posted: Tue Mar 21, 2017 11:39 am
by mcapra
Can you share the outputs of:
Code:
#yum install lsof if needed
lsof | grep logstash | wc -l
netstat -an | wc -l
ulimit -Sn
su nagios
ulimit -Sn
exit
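If lsof isn't available, a rough alternative is to read descriptor counts straight from /proc (Linux-only; without root you will only see processes you own):

```shell
# Top open-descriptor counts per process, read directly from /proc.
for p in /proc/[0-9]*; do
  n=$(ls "$p/fd" 2>/dev/null | wc -l)
  [ "$n" -gt 0 ] && echo "$n $(cat "$p/comm" 2>/dev/null)"
done | sort -rn | head
```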
Re: Too many files open error
Posted: Tue Mar 21, 2017 12:05 pm
by CameronWP
Hi:
Here are the results:
[Attachment: ulimit.JPG — screenshot of the command outputs]
Thanks!
Re: Too many files open error
Posted: Tue Mar 21, 2017 3:30 pm
by mcapra
Hmm, how about the output of:
Code:
ulimit -Hn
I think we may still be going over this system's hard limit for open files. I just want to rule that out.
Re: Too many files open error
Posted: Wed Mar 22, 2017 7:14 am
by CameronWP
Hi, it is 4096.
Thanks!
Re: Too many files open error
Posted: Wed Mar 22, 2017 2:09 pm
by mcapra
If the hard nofile limit is 4096, I would suggest increasing it substantially, since it appears there are over 4,000,000 files open by Elasticsearch currently. That doesn't necessarily mean 4,000,000 unique descriptors, but I would wager the hard limit of 4096 is prohibitive in this case.
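One caveat on that lsof count (a general observation, not specific to your box): lsof prints one row per task, so a heavily threaded process like Elasticsearch can be counted once per thread per descriptor, which inflates the total. You can compare lsof's row count against the kernel's actual descriptor count for a single pid (the current shell's pid is used below purely as a stand-in):

```shell
pid=$$   # illustrative; substitute the elasticsearch pid
# The kernel's actual descriptor count for that process:
ls /proc/$pid/fd | wc -l
# Versus lsof's row count for the same process (one row per task x descriptor):
lsof -p "$pid" 2>/dev/null | wc -l
```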
Re: Too many files open error
Posted: Wed Mar 22, 2017 2:42 pm
by CameronWP
mcapra wrote:If the hard nofile limit is 4096, I would suggest increasing it substantially, since it appears there are over 4,000,000 files open by Elasticsearch currently. That doesn't necessarily mean 4,000,000 unique descriptors, but I would wager the hard limit of 4096 is prohibitive in this case.
I am worried about this bug:
https://access.redhat.com/solutions/43926
Is there a sane value you have used before?
Thanks!
Re: Too many files open error
Posted: Wed Mar 22, 2017 4:25 pm
by mcapra
You should be fine as long as you follow that article's solution. You can register an account for free to view the solution. Otherwise:
Code:
Login denied after setting nofile limits. Secure log shows the error "error: PAM: pam_open_session(): Permission denied"
Solution Verified - Updated June 30 2016 at 11:43 AM - English
Environment
Red Hat Enterprise Linux 5
Red Hat Enterprise Linux 6
Issue
Setting a hard limit in ulimit on nofile to anything higher than 1048576 (including unlimited) fails, and prevents logins from working.
After setting the following in /etc/security/limits.conf
* soft nofile 5000000
* hard nofile 5000000
no one can login, and /var/log/secure file shows
sshd[3889]: error: PAM: pam_open_session(): Permission denied
Resolution
Raise the value of the sysctl setting fs.nr_open (/proc/sys/fs/nr_open) to something greater than or equal to the value being set in ulimit.
# echo "5000000" > /proc/sys/fs/nr_open
(temporarily)
OR
# echo "fs.nr_open = 5000000" >> /etc/sysctl.conf
# sysctl -p
(persistently)
Now ulimit should allow a nofile value less than or equal to the fs.nr_open value.
Note
If all login sessions are terminated, you will need to fix the /etc/security/limits.conf file from rescue mode.
Root Cause
setrlimit(2) does not allow for the nofile limit to be set to more than fs.nr_open.
Product(s) Red Hat Enterprise Linux
Component pam
Tags rhel_5
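The key point in that article is ordering: the kernel-wide ceiling fs.nr_open has to be raised before a larger nofile hard limit can take effect, because setrlimit(2) rejects anything above it. A quick sanity check (the 5000000 value is the article's example; the sysctl write itself requires root):

```shell
# Raise the kernel ceiling first (root required):
#   sysctl -w fs.nr_open=5000000
# Then confirm the ceiling and the current hard limit:
cat /proc/sys/fs/nr_open
ulimit -Hn   # must stay <= fs.nr_open, or logins break as the article describes
```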