every day the nagios syslog server stops collecting 8pm

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Locked
fasanchez
Posts: 23
Joined: Wed Jan 10, 2018 9:07 am

every day the nagios syslog server stops collecting 8pm

Post by fasanchez »

Syslog server has stop collecting at 8pm. Have to restart the elasticseach service and everything starts collecting again.

Is there a log I can check to see why elasticseach has stopped?

Nagios Syslog Server v2.0.2
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: every day the nagios syslog server stops collecting 8pm

Post by scottwilkerson »

Yes, you would want to look at

Code: Select all

/var/log/elasticsearch/<CLUSTER_ID>.log
Also, being it is always at 8pm you may want to see if that corresponds to the time a daily job run in Admin -> Command Subsystem

How much memory does this system have? Free disk space?

Is it a single instance or cluster?
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
fasanchez
Posts: 23
Joined: Wed Jan 10, 2018 9:07 am

Re: every day the nagios syslog server stops collecting 8pm

Post by fasanchez »

Cluster: each node has 8gb ram and 1tb storage.

339 Shards.

This is from the cluster log.

[2018-09-08 20:00:01,489][DEBUG][action.search.type ] [cd245d02-c856-4b06-9f47-6362077bf0fe] All shards failed for phase: [query]
org.elasticsearch.action.NoShardAvailableActionException: [logstash-2018.09.08][4] null
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:160)
at org.elasticsearch.action.search.type.TransportSearchCountAction.doExecute(TransportSearchCountAction.java:55)
at org.elasticsearch.action.search.type.TransportSearchCountAction.doExecute(TransportSearchCountAction.java:45)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:98)
at org.elasticsearch.client.FilterClient.execute(FilterClient.java:66)
at org.elasticsearch.rest.BaseRestHandler$HeadersAndContextCopyClient.execute(BaseRestHandler.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:338)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:84)
at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:53)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:225)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:170)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:327)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:63)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.http.netty.pipelining.HttpPipeliningHandler.messageReceived(HttpPipeliningHandler.java:60)
at org.elasticsearch.common.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpContentDecoder.messageReceived(HttpContentDecoder.java:108)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
[2018-09-08 20:00:26,744][DEBUG][action.search.type ] [cd245d02-c856-4b06-9f47-6362077bf0fe] All shards failed for phase: [query]
org.elasticsearch.action.NoShardAvailableActionException: [logstash-2018.09.08][4] null
User avatar
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: every day the nagios syslog server stops collecting 8pm

Post by cdienger »

Is there anything before this message? You'll also want to check the same log on all the other machines. How many machines are in the cluster? Feel free to PM me a copy of the full logs the next time this happens.

Is the elasticsearch service stopped when it get isn't this state? Run 'service elasticseach status' the next time this occurs before starting it again.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
fasanchez
Posts: 23
Joined: Wed Jan 10, 2018 9:07 am

Re: every day the nagios syslog server stops collecting 8pm

Post by fasanchez »

Seems like the issue was the / file system was at 91%. I setup a repository and did maintenance for any logs older than 120 and it got down to 79% and it seems well now.

Thanks for your help.
User avatar
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: every day the nagios syslog server stops collecting 8pm

Post by cdienger »

Thanks for the update! :)
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Locked