Could not authenticate. Invalid token given.

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
User avatar
mike4vr
Posts: 89
Joined: Wed Feb 04, 2015 2:23 pm

Could not authenticate. Invalid token given.

Post by mike4vr »

We have two NLS nodes. I'm having an issue with elasticsearch on one of them, I believe. We noticed that the System Status page showed both nodes online. However, in the Instance Status page, it showed node00 with elasticsearch and logstash offline. I proceeded to restart the services in the System Status page. Clicked stop on logstash, stopped. Clicked stopped on elasticsearch, and got an error.

Image

The only way to recover from this is to bounce the server. Even then, I'm back to the same problem with the services showing down on this node in the Instance Status page.

Here are some errors in the elasticsearch log:

Code: Select all

[2015-05-08 09:18:00,482][WARN ][cluster.action.shard     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] [logstash-2015.04.25][2] sending failed shard for [logstash-2015.04.25][2], node[B9U8CoVRRiCRRD7xNfty-g], [R], s[INITIALIZING], indexUUID [q4htz5ElSSyN937VFB030Q], reason [Failed to start shard, message [CorruptIndexException[[logstash-2015.04.25][2] Corrupted index [corrupted_imC91jWVR76Sr4UMRiGl7g] caused by: CorruptIndexException[codec footer mismatch: actual footer=541675078 vs expected footer=-1071082520 (resource: MMapIndexInput(path="/usr/local/nagioslogserver/elasticsearch/data/ae793673-21b1-46f1-ac7f-04fdd6440a60/nodes/0/indices/logstash-2015.04.25/2/index/_c2b_es090_0.tim"))]]]]
[2015-05-08 09:18:00,482][WARN ][cluster.action.shard     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] [logstash-2015.04.25][2] received shard failed for [logstash-2015.04.25][2], node[B9U8CoVRRiCRRD7xNfty-g], [R], s[INITIALIZING], indexUUID [q4htz5ElSSyN937VFB030Q], reason [Failed to start shard, message [CorruptIndexException[[logstash-2015.04.25][2] Corrupted index [corrupted_imC91jWVR76Sr4UMRiGl7g] caused by: CorruptIndexException[codec footer mismatch: actual footer=541675078 vs expected footer=-1071082520 (resource: MMapIndexInput(path="/usr/local/nagioslogserver/elasticsearch/data/ae793673-21b1-46f1-ac7f-04fdd6440a60/nodes/0/indices/logstash-2015.04.25/2/index/_c2b_es090_0.tim"))]]]]
[2015-05-08 09:18:00,482][WARN ][cluster.action.shard     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] [logstash-2015.04.25][2] sending failed shard for [logstash-2015.04.25][2], node[B9U8CoVRRiCRRD7xNfty-g], [R], s[INITIALIZING], indexUUID [q4htz5ElSSyN937VFB030Q], reason [engine failure, message [corrupted preexisting index][CorruptIndexException[[logstash-2015.04.25][2] Corrupted index [corrupted_imC91jWVR76Sr4UMRiGl7g] caused by: CorruptIndexException[codec footer mismatch: actual footer=541675078 vs expected footer=-1071082520 (resource: MMapIndexInput(path="/usr/local/nagioslogserver/elasticsearch/data/ae793673-21b1-46f1-ac7f-04fdd6440a60/nodes/0/indices/logstash-2015.04.25/2/index/_c2b_es090_0.tim"))]]]]
[2015-05-08 09:18:00,483][WARN ][cluster.action.shard     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] [logstash-2015.04.25][2] received shard failed for [logstash-2015.04.25][2], node[B9U8CoVRRiCRRD7xNfty-g], [R], s[INITIALIZING], indexUUID [q4htz5ElSSyN937VFB030Q], reason [engine failure, message [corrupted preexisting index][CorruptIndexException[[logstash-2015.04.25][2] Corrupted index [corrupted_imC91jWVR76Sr4UMRiGl7g] caused by: CorruptIndexException[codec footer mismatch: actual footer=541675078 vs expected footer=-1071082520 (resource: MMapIndexInput(path="/usr/local/nagioslogserver/elasticsearch/data/ae793673-21b1-46f1-ac7f-04fdd6440a60/nodes/0/indices/logstash-2015.04.25/2/index/_c2b_es090_0.tim"))]]]]
[2015-05-08 09:32:20,432][INFO ][monitor.jvm              ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] [gc][young][1033][223] duration [982ms], collections [1]/[2s], total [982ms]/[6s], memory [439.2mb]->[319.2mb]/[921.3mb], all_pools {[young] [118.6mb]->[834.3kb]/[133.1mb]}{[survivor] [5.6mb]->[3.3mb]/[16.6mb]}{[old] [314.9mb]->[315mb]/[771.6mb]}
[2015-05-08 09:41:56,534][INFO ][cluster.metadata         ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] [logstash-2015.05.08] update_mapping [apache_error] (dynamic)
[2015-05-08 09:44:46,912][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] stopping ...
[2015-05-08 09:44:50,163][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] stopped
[2015-05-08 09:44:50,164][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] closing ...
[2015-05-08 09:44:50,324][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] closed
And logstash:

Code: Select all

log4j, [2015-05-08T09:37:15.109]  WARN: org.elasticsearch.transport: [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] Transport response handler not found of id [158]
{:timestamp=>"2015-05-08T09:37:20.144000-0700", :message=>"Failed parsing date from field", :field=>"timestamp", :value=>"May  8 08:33:11", :exception=>java.lang.IllegalArgumentException: Invalid format: "May  8 08:33:11", :level=>:warn}
{:timestamp=>"2015-05-08T09:37:20.143000-0700", :message=>"Failed parsing date from field", :field=>"timestamp", :value=>"May  8 09:37:05", :exception=>java.lang.IllegalArgumentException: Invalid format: "May  8 09:37:05", :level=>:warn}
{:timestamp=>"2015-05-08T09:37:20.842000-0700", :message=>"Failed parsing date from field", :field=>"timestamp", :value=>"May  8 09:35:56", :exception=>java.lang.IllegalArgumentException: Invalid format: "May  8 09:35:56", :level=>:warn}
Any help is appreciated. I'm ready to provide more info, if needed.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Could not authenticate. Invalid token given.

Post by jolson »

Looks like a corrupt index to me. Would it be alright if we attempted to delete the index on the problem server? You will need to start elasticsearch on the bad node and run the following command:

Code: Select all

curl -XDELETE 'http://localhost:9200/logstash-2015.04.25/'
Keep in mind that using the above command will remove all log information for that day. You will be able to restore the information from a backup if you have it set up.

After deletion, try restarting elasticsearch - let's see if that helps us resolve this problem.
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
User avatar
mike4vr
Posts: 89
Joined: Wed Feb 04, 2015 2:23 pm

Re: Could not authenticate. Invalid token given.

Post by mike4vr »

I had the same result after clicking "Stop" on elasticsearch. Here's the latest elasticsearch log:

Code: Select all

[2015-05-08 10:15:39,024][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] version[1.3.2], pid[4816], build[dee175d/2014-08-13T14:29:30Z]
[2015-05-08 10:15:39,024][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] initializing ...
[2015-05-08 10:15:39,156][INFO ][plugins                  ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] loaded [knapsack-1.3.2.0-d5501ef], sites []
[2015-05-08 10:15:59,736][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] initialized
[2015-05-08 10:15:59,736][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] starting ...
[2015-05-08 10:15:59,961][INFO ][transport                ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.115.67:9300]}
[2015-05-08 10:16:00,001][INFO ][discovery                ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] ae793673-21b1-46f1-ac7f-04fdd6440a60/8pZ-dsZpRHGJ1iy89KX37g
[2015-05-08 10:16:00,906][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] stopping ...
[2015-05-08 10:16:00,929][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] stopped
[2015-05-08 10:16:00,930][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] closing ...
[2015-05-08 10:16:00,936][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] closed
[2015-05-08 10:16:29,067][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] version[1.3.2], pid[1553], build[dee175d/2014-08-13T14:29:30Z]
[2015-05-08 10:16:29,086][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] initializing ...
[2015-05-08 10:16:29,118][INFO ][plugins                  ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] loaded [knapsack-1.3.2.0-d5501ef], sites []
[2015-05-08 10:16:35,891][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] initialized
[2015-05-08 10:16:35,892][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] starting ...
[2015-05-08 10:16:36,011][INFO ][transport                ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.115.67:9300]}
[2015-05-08 10:16:36,017][INFO ][discovery                ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] ae793673-21b1-46f1-ac7f-04fdd6440a60/m-lEe1lqQ2264DHt2KBcMQ
[2015-05-08 10:16:39,257][INFO ][cluster.service          ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] detected_master [4197e29a-3751-434e-921b-f8f3ee07edd4][BzXofoK-R9-F8PdDZ6pyyw][lsp-001.la3.4over.com][inet[/192.168.115.163:9300]]{max_local_storage_nodes=1}, added {[4197e29a-3751-434e-921b-f8f3ee07edd4][BzXofoK-R9-F8PdDZ6pyyw][lsp-001.la3.4over.com][inet[/192.168.115.163:9300]]{max_local_storage_nodes=1},}, reason: zen-disco-receive(from master [[4197e29a-3751-434e-921b-f8f3ee07edd4][BzXofoK-R9-F8PdDZ6pyyw][lsp-001.la3.4over.com][inet[/192.168.115.163:9300]]{max_local_storage_nodes=1}])
[2015-05-08 10:16:39,477][INFO ][http                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] bound_address {inet[/127.0.0.1:9200]}, publish_address {inet[localhost/127.0.0.1:9200]}
[2015-05-08 10:16:39,478][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] started
[2015-05-08 10:17:05,724][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] stopping ...
[2015-05-08 10:17:05,934][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] stopped
[2015-05-08 10:17:05,935][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] closing ...
[2015-05-08 10:17:06,681][WARN ][netty.channel.DefaultChannelPipeline] An exception was thrown by an exception handler.
java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
	at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
	at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
	at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
	at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
	at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
	at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.execute(DefaultChannelPipeline.java:636)
	at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
	at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:577)
	at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:704)
	at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:671)
	at org.elasticsearch.common.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
	at org.elasticsearch.transport.netty.NettyTransportChannel.sendResponse(NettyTransportChannel.java:97)
	at org.elasticsearch.transport.netty.NettyTransportChannel.sendResponse(NettyTransportChannel.java:68)
	at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:277)
	at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:268)
	at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
[2015-05-08 10:17:06,686][INFO ][node                     ] [c26bcd1f-4fb8-4e5d-a947-ecae0509fff0] closed
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Could not authenticate. Invalid token given.

Post by jolson »

Please try pressing "Reset All Jobs" from the command subsystem page. After resetting your jobs, attempt to start elasticsearch from the web GUI. SSH into the node and run:

Code: Select all

service elasticsearch status
To see whether or not it's online. If it does not turn on, try starting it manually and watching the elasticsearch log:

Code: Select all

cat /var/log/elasticsearch/*.log
Any good information from the elasticsearch log? How long does elasticsearch run before it dies - or does it not even start?

I think the reason we're getting the error message is because elasticsearch is having difficulties running in the first place. Can you please check on the above and get back to us with your results?

I haven't seen an issue like this one before, so any additional information that you think may be useful is much appreciated.
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
User avatar
mike4vr
Posts: 89
Joined: Wed Feb 04, 2015 2:23 pm

Re: Could not authenticate. Invalid token given.

Post by mike4vr »

Thank you. The suggestions, unfortunately, yield the same results. When I initially went to the Command Subsystem page, elasticsearch was down. I started the service from the command-line and was able to reset the jobs. Unfortunately, I still get the error "Could not authenticate. Invalid token given." I captured the output from *.log in /var/log/elasticsearch into a file for review. It is attached.
You do not have the required permissions to view the files attached to this post.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Could not authenticate. Invalid token given.

Post by jolson »

Thanks for the great information.

How much RAM is on the problem node? Any chance you could double the amount to see if that helps out?

I'd like you to check /var/log/messages for any instance of the kernel killing elasticsearch due to memory limitations:

Code: Select all

cat /var/log/messages
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
User avatar
mike4vr
Posts: 89
Joined: Wed Feb 04, 2015 2:23 pm

Re: Could not authenticate. Invalid token given.

Post by mike4vr »

Initially, both nodes had 2GB each. I just upgraded them to 8GB and still see the same problem. /var/log/messages don't show much in terms of the elasticsearch / java being killed. Here's the memory utilization:

Code: Select all

# free -m
             total       used       free     shared    buffers     cached
Mem:          7872       5632       2240          0         46        627
-/+ buffers/cache:       4958       2914
Swap:         4031          0       4031
User avatar
mike4vr
Posts: 89
Joined: Wed Feb 04, 2015 2:23 pm

Re: Could not authenticate. Invalid token given.

Post by mike4vr »

Okay, here's an update.

While tailing the elasticsearch logs on both nodes, I stopped the service on each. Then, I started node00 (faulty one) and it declared itself as the new master. Great. Started node01 and it detected node00 as the master and joined the fun. It all seems like things are working until I click "Stop" on elasticsearch in the web UI. That gets me back to where I started, which is the "Could not authenticate. Invalid token given." error.

One thing to note, this problem started as soon as I upgraded to 2015R1.4. Is there any way I can roll-back to 1.3 without it going haywire?
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Could not authenticate. Invalid token given.

Post by jolson »

One thing to note, this problem started as soon as I upgraded to 2015R1.4. Is there any way I can roll-back to 1.3 without it going haywire?
A rollback has the possibility of working - but I would recommend sticking to your current version if at all possible. The updates that were part of R1.4 should not have affected this component of NLS, leading me to believe that your issue could persist after rolling back.

Could I have you check your http logs please?

Code: Select all

cat /var/log/httpd/error_log
cat /var/log/httpd/access_log
Is your disk space usage fine?

Code: Select all

df -h
How about the age of the 'nagios' account?

Code: Select all

chage -l nagios
Let us know - if you wanted a remote soon we could easily move this to a ticket as well. Thanks!
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
User avatar
mike4vr
Posts: 89
Joined: Wed Feb 04, 2015 2:23 pm

Re: Could not authenticate. Invalid token given.

Post by mike4vr »

Yes, remote session sounds great. Please let me know how you'd like to proceed.
Locked