We're preparing to migrate from Nagios Log Server 2024R1 to 2024R2, so I'm setting up a VMware environment to test the process. I was able to create a 2024R1 cluster with one instance without a problem. I was also able to create a 2024R2 NLS server without a problem. However, while building a new 2024R2 instance to add to the cluster, the new instance's web page doesn't launch properly. I get a continuous "Waiting for Database Startup" message.
When I install NLS on the new instance, I use the (./fullinstall -a) option. It says it completed successfully. If I instead run the plain (./fullinstall) option on the new instance, the website launches properly. However, I haven't found a way to add an existing NLS instance to a cluster. It appears you have to make that choice when installing NLS on a server.
- The Nagios KB article for "Waiting for Database" suggests it could be a server resource problem. So I doubled the resources on the new instance VM (16 CPUs and 128 GB RAM), with negative results.
- The install.log doesn't have any related errors. It does indicate "Nagios Log Server Installation Success!"
- The opensearch logs have the entry "CertificateException: No subject alternative DNS name matching <The Cluster Primary Server FQDN> found." However, the new instance can successfully perform an nslookup of, and connect to, the Primary Server. Afterwards, I rebuilt the new instance and tried configuring it using the Primary Server IP address instead of the FQDN. I received the same "Waiting for Database" results. The API key used to set up the new instance was taken from the Primary Server.
- I stopped both the opensearch and logstash processes. I started the opensearch process, allowed the database a few minutes to spin up, and then started the logstash process. I received the same "Waiting for Database" message when I navigated to the new instance's web GUI.
- There's no indication the Primary Server recognizes the new instance. The Primary Server cluster only shows 1 instance. Moreover, the new instance doesn't appear anywhere (e.g. in the logs) on the Primary Server's NLS web GUI.
We're running Nagios Log Server 2024R2.0.3 on Ubuntu 24.04. I'd greatly appreciate any suggestions to resolve the issue. Many thanks!
Nagios Log Server - Waiting For Database Startup When Adding a 2024R2 Instance to a Cluster
- jmichaelson
- Posts: 375
- Joined: Wed Aug 23, 2023 1:02 pm
Re: Nagios Log Server - Waiting For Database Startup When Adding a 2024R2 Instance to a Cluster
Hi @hebarl,
The first thing I'd like to see is how many instances your initial node shows when you log in to it. I'm going to guess only one. If that's the case, feel free to remove Nagios Log Server from the second node:
systemctl stop logstash
systemctl stop opensearch
rm -rf /usr/local/nagioslogserver
rm -rf /var/www/html/nagioslogserver
Then untar the tarball again and rerun ./fullinstall -a. Make sure you provide the correct URL to add a node (http://ip.of.node.1/nagioslogserver) and the API key.
If you still have a problem, attaching the install.log from the new node would be helpful.
Please let us know if you have any other questions or concerns.
-Jason
Re: Nagios Log Server - Waiting For Database Startup When Adding a 2024R2 Instance to a Cluster
Hi Jason,
I greatly appreciate your feedback. My Primary NLS server only shows the 1 instance. I've rebuilt the server a number of times to perform a clean NLS install, and I get the same results each time. I'll attach the install.log shortly. I did locate a test I can run: logstash --config.test_and_exit. I get an ERROR when running it. Here are the results:
=====================================================
ERROR: Pipelines YAML file is empty. Location: /usr/local/nagioslogserver/logstash/config/pipelines.yml
usage:
bin/logstash -f CONFIG_PATH [-t] [-r] [] [-w COUNT] [-l LOG]
bin/logstash --modules MODULE_NAME [-M "MODULE_NAME.var.PLUGIN_TYPE.PLUGIN_NAME.VARIABLE_NAME=VALUE"] [-t] [-w COUNT] [-l LOG]
bin/logstash -e CONFIG_STR [-t] [--log.level fatal|error|warn|info|debug|trace] [-w COUNT] [-l LOG]
bin/logstash -i SHELL [--log.level fatal|error|warn|info|debug|trace]
bin/logstash -V [--log.level fatal|error|warn|info|debug|trace]
bin/logstash --help
[2025-08-27T17:19:28,620][FATAL][org.logstash.Logstash ] Logstash stopped processing because of an error: (SystemExit) exit
org.jruby.exceptions.SystemExit: (SystemExit) exit
at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:808) ~[jruby.jar:?]
at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:767) ~[jruby.jar:?]
at usr.local.nagioslogserver.logstash.lib.bootstrap.environment.<main>(/usr/local/nagioslogserver/logstash/lib/bootstrap/environment.rb:90) ~[?:?]
========================================================
The /usr/local/nagioslogserver/logstash/logs/logstash-plain.log shows a similar error occurring almost every minute or two. Here are the results:
[2025-08-27T17:30:06,689][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2025-08-27T17:30:06,853][INFO ][logstash.config.source.local.configpathloader] No config files found in path {:path=>"/usr/local/nagioslogserver/logstash/etc/conf.d/*"}
[2025-08-27T17:30:06,854][ERROR][logstash.config.sourceloader] No configuration found in the configured sources.
[2025-08-27T17:30:06,893][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[2025-08-27T17:30:06,902][INFO ][logstash.runner ] Logstash shut down.
[2025-08-27T17:30:06,905][FATAL][org.logstash.Logstash ] Logstash stopped processing because of an error: (SystemExit) exit
org.jruby.exceptions.SystemExit: (SystemExit) exit
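The log above reports no config files under /usr/local/nagioslogserver/logstash/etc/conf.d/. A minimal sketch for confirming that on the new node follows. The function name count_conf_files is made up for illustration, and the log does not say whether NLS is expected to populate this directory before or only after the node has joined the cluster, so an empty result may be a symptom rather than the cause.

```shell
# Hypothetical helper: count regular files directly inside a directory.
# Prints 0 when the directory is empty or does not exist.
count_conf_files() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo 0
        return 0
    fi
    find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Example (path taken from the logstash-plain.log excerpt above):
# count_conf_files /usr/local/nagioslogserver/logstash/etc/conf.d
```

Running the same check on the working primary node gives something to compare against.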
Re: Nagios Log Server - Waiting For Database Startup When Adding a 2024R2 Instance to a Cluster
Attached is the NLS install log for your review.
Re: Nagios Log Server - Waiting For Database Startup When Adding a 2024R2 Instance to a Cluster
I’m trying to add a new 2024R2 instance to an existing Nagios Log Server cluster, but the web GUI for the new node only shows “Waiting for Database Startup.” The install finishes successfully with no errors, and resources have been increased, but the cluster never detects the new instance. Logs show a certificate mismatch error with the primary server FQDN, though DNS resolution works fine. Even after restarting opensearch and logstash, the issue persists. The primary server still shows only one instance. Any guidance on fixing this would be greatly appreciated.
Re: Nagios Log Server - Waiting For Database Startup When Adding a 2024R2 Instance to a Cluster
Not sure if it's helpful. However, the following was captured from the /var/log/opensearch/nagios_opensearch_server.json file:
{"type": "server", "timestamp": "2025-09-09T09:20:22,974-05:00", "level": "WARN", "component": "o.o.c.c.ClusterFormationFailureHelper", "cluster.name": "nagios_opensearch", "node.name": "146.163.9.223", "message": "cluster-manager not discovered yet, this node has not previously joined a bootstrapped cluster, and this node must discover cluster-manager-eligible nodes [146.163.9.229] to bootstrap a cluster: have discovered [{146.163.9.223}{gmCgdS4HTPGCptbVcoIaZQ}{VlLZth8OSjeqZgCRtTIgRw}{146.163.9.223}{146.163.9.223:9300}{dimr}{shard_indexing_pressure_enabled=true}]; discovery will continue using [146.163.9.229:9300] from hosts providers and [{146.163.9.223}{gmCgdS4HTPGCptbVcoIaZQ}{VlLZth8OSjeqZgCRtTIgRw}{146.163.9.223}{146.163.9.223:9300}{dimr}{shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 0, last-accepted version 0 in term 0" }
{"type": "server", "timestamp": "2025-09-09T09:20:23,149-05:00", "level": "INFO", "component": "o.o.s.c.ConfigurationRepository", "cluster.name": "nagios_opensearch", "node.name": "146.163.9.223", "message": "Wait for cluster to be available ..." }
The primary cluster server (146.163.9.229) has no record of the instance (146.163.9.223) trying to join the cluster. On the primary cluster server, the cluster status is "yellow" and it only shows one instance (itself).
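One way to narrow this down is to pull the address the new node is still trying to reach out of the log and check whether its transport port answers at all. The sketch below assumes bash (for /dev/tcp) and the coreutils timeout command are available; seed_targets and port_open are illustrative names, not NLS tools.

```shell
# Print the distinct seed lists (the text inside the brackets) from
# "discovery will continue using [...]" messages in an OpenSearch log file.
seed_targets() {
    grep -o 'discovery will continue using \[[^]]*]' "$1" \
        | sed -e 's/^.*\[//' -e 's/]$//' \
        | sort -u
}

# Return 0 if a TCP connection to host $1, port $2 succeeds within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example, run on the new node:
# seed_targets /var/log/opensearch/nagios_opensearch_server.json
# port_open 146.163.9.229 9300 && echo reachable || echo blocked
```

A failed connection would suggest a firewall or bind-address problem on the primary. A successful one would point back toward the TLS certificate error mentioned earlier in the thread.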
Re: Nagios Log Server - Waiting For Database Startup When Adding a 2024R2 Instance to a Cluster
Are you, by any chance, using DHCP for your initial server's IP address?
If so, try putting the assigned IP address into /usr/local/nagioslogserver/opensearch/config/opensearch.yml, in the network.publish_host and discovery.seed_hosts arrays, in place of the localhost that is there.
Then, if restarting opensearch on that node doesn't work for you, reattempt the install of the additional node.
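To see what those two settings currently hold before editing, something like the following may help. It is only a sketch: the setting names and file path come from the suggestion above, the helper names are made up, and it assumes each value sits on the same line as its key (a multi-line YAML list would need a manual look).

```shell
# Hypothetical helper: print the network.publish_host and discovery.seed_hosts
# lines from an opensearch.yml, prefixed with their line numbers.
show_discovery_settings() {
    grep -n -E '^[[:space:]]*(network\.publish_host|discovery\.seed_hosts)[[:space:]]*:' "$1"
}

# Hypothetical helper: succeed if either setting still names localhost
# (or 127.0.0.1) on the same line as the key.
uses_localhost() {
    grep -E '^[[:space:]]*(network\.publish_host|discovery\.seed_hosts)[[:space:]]*:' "$1" \
        | grep -q -E 'localhost|127\.0\.0\.1'
}

# Example, run on the initial node:
# show_discovery_settings /usr/local/nagioslogserver/opensearch/config/opensearch.yml
```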
Please let us know if you have any other questions or concerns.
-Jason
Re: Nagios Log Server - Waiting For Database Startup When Adding a 2024R2 Instance to a Cluster
I appreciate the question, but no, we're using a static IP.
Re: Nagios Log Server - Waiting For Database Startup When Adding a 2024R2 Instance to a Cluster
I've run the 2024R2 fullinstall -a option in various configurations (see below), and in all cases I get the "Waiting for Database Startup" web page instead of the NLS web GUI. The 2024R2 fullinstall without the -a option, which is what is run on the first instance in a cluster, works fine: the 2024R2 web GUI launches properly. The problem exists with the -a option.
============================
Executed install on Oracle Linux distro. Negative results.
Executed install on Ubuntu Linux distro. Negative results.
Executed install using FQDN for the install parameters. Negative results.
Executed install using IP addresses for the install parameters. Negative results.
Executed install with fully patched server. Negative results.
Executed install on a server that has not been patched. Negative results.
After complete Linux server rebuilds, I'm unable to install 2024R2 on a secondary instance. I tried the following versions:
nagioslogserver-2024R2.0.1.tar.gz
nagioslogserver-2024R2.0.2.tar.gz
nagioslogserver-2024R2.0.3.tar.gz
nagioslogserver-2024R2.tar.gz