Nagios Log Server - Waiting For Database Startup

Problem Description

After rebooting one of your Nagios Log Server (NLS) instances, you receive the following message on the NLS web page:

Waiting for Database Startup
It looks like your local elasticsearch service is starting.

Why am I getting this error?
Elasticsearch can take a while to start up because of it's indexing. This may take a few seconds.

This page will refresh automatically after 5 seconds ...

However, after waiting a few minutes, the message persists.

When you look at the /var/log/logstash/logstash.log file, you see messages similar to:

[2016-08-31 16:23:18,420][INFO ][discovery.zen            ] [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] failed to send join request to master
[[636f559a-1fd5-4158-9dee-92e1a8403a1e][GkzwBa-QTHCOKiuoWgmR8Q][localhost][inet[/127.0.0.1:9300]]{max_local_storage_nodes=1}],
reason [RemoteTransportException[[6a7ce4ea-e1b9-47a1-af18-1c4d47243d20][inet[/127.0.0.1:9300]][internal:discovery/zen/join]];
nested: ElasticsearchIllegalStateException[Node [[6a7ce4ea-e1b9-47a1-af18-1c4d47243d20][0XjhXD2VRp6K3k8Eq4UDlQ][sa585]
[inet[/172.27.164.109:9300]]{max_local_storage_nodes=1}] not master for join request from [[6a7ce4ea-e1b9-47a1-af18-1c4d47243d20]
[0XjhXD2VRp6K3k8Eq4UDlQ][sa585][inet[/172.27.164.109:9300]]{max_local_storage_nodes=1}]]; ], tried [3] times
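
To check quickly whether a log file contains this kind of entry, the search can be scripted. The following is a minimal sketch in Python, not part of Nagios Log Server. The function names find_join_failures and read_join_failures are hypothetical, and the only thing it matches on is the phrase "failed to send join request to master" taken from the excerpt above.

```python
from typing import Iterable, List

# Phrase copied from the log excerpt in this article.
JOIN_FAILURE_MARKER = "failed to send join request to master"


def find_join_failures(lines: Iterable[str]) -> List[str]:
    """Return the lines that report a failed join request to the master."""
    return [line.rstrip("\n") for line in lines if JOIN_FAILURE_MARKER in line]


def read_join_failures(log_path: str) -> List[str]:
    """Scan a log file on disk (hypothetical helper; the path is supplied by the caller)."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(log_path, "r", errors="replace") as handle:
        return find_join_failures(handle)
```

For example, calling read_join_failures with the path of the log file named above returns a list of matching lines. An empty list suggests this particular join failure is not what is keeping the instance on the startup page.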

Resolution

When this problem occurs, the solution is to reboot ALL of the NLS instances in the cluster. This means rebooting the entire operating system on each instance, not just restarting individual services on the server.

After rebooting ALL of the NLS instances in the cluster the problem will be resolved.
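
One way to confirm that the cluster has recovered after the reboots is to query the Elasticsearch cluster health API on one of the instances. The sketch below is illustrative only. It assumes Elasticsearch answers HTTP requests on localhost port 9200 (the Elasticsearch default, which may differ on your system), and the function names fetch_cluster_health and cluster_is_ready are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumption: Elasticsearch listens for HTTP on its default port on this host.
HEALTH_URL = "http://localhost:9200/_cluster/health"


def fetch_cluster_health(url: str = HEALTH_URL, timeout: float = 5.0) -> Dict[str, Any]:
    """Fetch the cluster health document and decode it from JSON.

    Raises urllib.error.URLError (or its subclass HTTPError) if the service
    is unreachable or answers with an error status.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def cluster_is_ready(health: Dict[str, Any]) -> bool:
    """Treat "green" and "yellow" as usable; "red" or a missing status is not."""
    return health.get("status") in ("green", "yellow")
```

Run on an NLS instance, cluster_is_ready(fetch_cluster_health()) returning True indicates that the node has joined a cluster and all primary shards are allocated. An exception or a False result means the cluster is not yet usable.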

Final Thoughts

For any support-related questions, please visit the Nagios Support Forums at:

http://support.nagios.com/forum/