"Waiting for Database Startup" for 14 hours

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
waskinbas
Posts: 13
Joined: Wed Jun 02, 2021 1:59 pm

"Waiting for Database Startup" for 14 hours

Post by waskinbas »

Hello! Yesterday I powered off my Nagios LS server (the basic OVF I've used for months) and moved it between datastores to get ready for a full CentOS install. When the OVF powered up, it gave me the "Waiting for Database Startup" page and has remained there for about 14 hours. I read a bunch of forum posts and followed several suggestions that have helped others. I have tried increasing the https.max, I have restored a configuration backup using the built-in script, and I have stopped and restarted the elasticsearch, logstash, and http services several times. I have also tried rebooting several times. I'd like to avoid restoring the entire server, so any suggestions would be helpful. I see similar but abandoned threads on this issue. I'm hoping this isn't a critical issue.
kg2857
Posts: 68
Joined: Wed Apr 12, 2023 5:48 pm

Re: "Waiting for Database Startup" for 14 hours

Post by kg2857 »

That means the elasticsearch service is failing to start.
I've seen cases where, when this happens, the cluster_hosts file (I think) needs to be edited to contain the IP or hostname of each cluster host. Since it sounds like you only have one NLS host, this may not be an issue.
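
One way to confirm that Elasticsearch is the piece that is down is to ask its HTTP API directly. Below is a minimal Python sketch, assuming Elasticsearch listens on its usual http://localhost:9200 (an assumption, adjust it if your install differs) and using the `_cluster/health` endpoint. The helper names `fetch_health` and `describe` are made up for illustration.

```python
import json
import urllib.error
import urllib.request

# Assumption: Elasticsearch listens on its usual HTTP port on this host.
ES_URL = "http://localhost:9200"


def fetch_health(base_url=ES_URL, timeout=5):
    """Return the parsed _cluster/health response, or None if ES is unreachable."""
    try:
        with urllib.request.urlopen(base_url + "/_cluster/health", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not JSON.
        return None


def describe(health):
    """Turn a health response (or None) into a one-line summary."""
    if health is None:
        return "Elasticsearch is not answering; check the service and its log"
    return "cluster status: {}, nodes: {}".format(
        health.get("status", "unknown"), health.get("number_of_nodes", "?")
    )


if __name__ == "__main__":
    print(describe(fetch_health()))
```

If this reports that Elasticsearch is not answering while the service claims to be running, the Elasticsearch log is the next place to look.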
waskinbas
Posts: 13
Joined: Wed Jun 02, 2021 1:59 pm

Re: "Waiting for Database Startup" for 14 hours

Post by waskinbas »

It is a standalone server. I had a space crunch a while back and had to remove the second server from the cluster in a hurry. I did try commenting out the removed servers in cluster_hosts, but it didn't help Elasticsearch start up.

I've decided to build a new CentOS Stream 9 server and copy my backups and log data to the new machine. However... I can't locate my snapshot repository. I tried using curl -XGET but it won't work either. Does anyone know what the default snapshot repository path is?
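
For reference, once Elasticsearch is answering, its snapshot API (`GET /_snapshot/_all`) lists the registered repositories, and for filesystem repositories the path sits in `settings.location`. The API gives no response while Elasticsearch is down, which may be why curl failed here. A rough Python sketch, assuming http://localhost:9200; the function names are hypothetical:

```python
import json
import urllib.request

# Assumption: Elasticsearch listens on its usual HTTP port on this host.
ES_URL = "http://localhost:9200"


def repository_locations(repos):
    """Map repository name -> path for filesystem ("fs") repositories.

    `repos` is the parsed JSON body of GET /_snapshot/_all.
    """
    return {
        name: info.get("settings", {}).get("location")
        for name, info in repos.items()
        if info.get("type") == "fs"
    }


def fetch_repositories(base_url=ES_URL, timeout=5):
    """Fetch all registered snapshot repositories (raises OSError if ES is down)."""
    with urllib.request.urlopen(base_url + "/_snapshot/_all", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        print(repository_locations(fetch_repositories()))
    except OSError as exc:
        print("Could not reach Elasticsearch:", exc)
```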
kg2857
Posts: 68
Joined: Wed Apr 12, 2023 5:48 pm

Re: "Waiting for Database Startup" for 14 hours

Post by kg2857 »

Might want to look at the ES log file and delete any lines in cluster_hosts other than the one host.
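
Both suggestions can be sketched in a few lines of Python. This is only an illustration: the log and cluster_hosts paths vary by install, so they are passed in rather than hard-coded, cluster_hosts is assumed to hold one host per line, and the function names and marker strings are my own choices.

```python
def problem_lines(lines, markers=("ERROR", "Exception", "Caused by")):
    """Return the lines that contain any of the marker strings."""
    return [line.rstrip("\n") for line in lines if any(m in line for m in markers)]


def scan_log(path):
    """Read a log file and return only the lines that look like errors."""
    with open(path, errors="replace") as fh:
        return problem_lines(fh)


def keep_only(lines, host):
    """Reduce cluster_hosts content to the single entry `host`.

    Assumes one host per line; every other line is dropped.
    """
    return [line for line in lines if line.strip() == host]
```

Back up cluster_hosts before writing the filtered lines back, and read the tail of the scan_log output first, since the most recent startup attempt is at the end of the log.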
waskinbas
Posts: 13
Joined: Wed Jun 02, 2021 1:59 pm

Re: "Waiting for Database Startup" for 14 hours

Post by waskinbas »

For anyone looking at this in the future, looks like the default datastore is /usr/local/nagioslogserver/elasticsearch/data/

It appears that the metadata files that my Elasticsearch needed to open the indexes were gone. Not sure why moving them between data stores wiped out those files, but they were gone. I ended up building a CentOS Stream 9 server and restoring a system backup from the OVF there.
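
If anyone wants to check for the same symptom, here is a small Python sketch. It assumes the older Elasticsearch on-disk layout in which each index directory holds a `_state` subdirectory with its metadata. That layout depends on the Elasticsearch version, so verify it by hand before trusting the result. The function name is hypothetical, and the directory to pass in is the `indices` directory somewhere under the data path above.

```python
import os


def indices_missing_state(indices_dir):
    """List index directories under `indices_dir` that have no `_state` subdirectory.

    Assumption: each index directory normally contains `_state`.
    """
    missing = []
    for name in sorted(os.listdir(indices_dir)):
        index_path = os.path.join(indices_dir, name)
        if os.path.isdir(index_path) and not os.path.isdir(os.path.join(index_path, "_state")):
            missing.append(name)
    return missing
```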
ElaineOsborne
Posts: 2
Joined: Wed May 10, 2023 9:12 pm

Re: "Waiting for Database Startup" for 14 hours

Post by ElaineOsborne »

waskinbas wrote: Thu May 04, 2023 8:54 am Hello! Yesterday I powered off my Nagios LS server (the basic OVF I've used for months) and moved it between datastores to get ready for a full CentOS install. When the OVF powered up, it gave me the "Waiting for Database Startup" page and has remained there for about 14 hours. I read a bunch of forum posts and followed several suggestions that have helped others. I have tried increasing the https.max, I have restored a configuration backup using the built-in script, and I have stopped and restarted the elasticsearch, logstash, and http services several times. I have also tried rebooting several times. I'd like to avoid restoring the entire server, so any suggestions would be helpful. I see similar but abandoned threads on this issue. I'm hoping this isn't a critical issue.
Here are a few steps you can try to troubleshoot the problem:

Verify the connectivity: Ensure that the network connectivity between the Nagios LS server and the database server is intact. Check if you can ping the database server from the Nagios LS server and vice versa.

Check database status: Make sure that the database service is running properly. You mentioned you restarted the Elasticsearch service, but also verify the status of the database used by Nagios LS. Check the logs for any relevant error messages related to the database startup.

Verify database configuration: Double-check the configuration files for the database and ensure that they are correctly set up. Pay attention to any specific settings related to database startup or connection parameters.

Clear caches: Sometimes, cached data can cause issues during startup. Try clearing the caches of Nagios LS and the database. This can typically be done by stopping the relevant services, deleting any cache directories, and then starting the services again.

Monitor system resources: Check if the Nagios LS server is experiencing high resource usage during startup. Monitor CPU, memory, and disk usage to see if any bottlenecks are causing delays. You can use tools like top or htop to monitor resource usage.

Check Nagios LS logs: Look for any error or warning messages in the Nagios LS logs that could provide more insight into the issue. The logs are often located in the /var/log directory or a subdirectory specific to Nagios LS.

Consult the Nagios LS community: If the issue persists and you haven't found a solution yet, consider reaching out to the Nagios LS community. Forums, mailing lists, or official support channels can be valuable resources to seek assistance from experienced users and developers.
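
As a small illustration of the resource check, disk space on the volume that holds the Elasticsearch data can be read with Python's standard library. This is only a sketch: the function name is made up, and the path in the example is a placeholder to replace with your own data path.

```python
import shutil


def percent_used(path):
    """Return how full the filesystem containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    # Placeholder path; point this at the volume holding the Elasticsearch data.
    print("{:.1f}% used".format(percent_used("/")))
```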
waskinbas
Posts: 13
Joined: Wed Jun 02, 2021 1:59 pm

Re: "Waiting for Database Startup" for 14 hours

Post by waskinbas »

Very nice summary, thank you! I ended up installing CentOS Stream 9 and restoring my backups to the new host. So far so good. I will keep your post bookmarked, though!
kg2857
Posts: 68
Joined: Wed Apr 12, 2023 5:48 pm

Re: "Waiting for Database Startup" for 14 hours

Post by kg2857 »

This means ES isn't starting. Fix that.