NLS upgrade failing - NLS not available

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
jabi27
Posts: 34
Joined: Thu Jan 19, 2017 4:30 pm

NLS upgrade failing - NLS not available

Post by jabi27 »

We have a two-node cluster, and during/after the upgrade we are not able to start or log in to NLS.

We have two servers:
- nagios-logserver1.stil.dk
- nagios-logserver2.stil.dk

Here is what we did:

# Disable shard allocation:

curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable":"none"}}'
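
For reference, this transient setting stays in effect until it is changed back (or until a full cluster restart clears transient settings), so shards are not reassigned while it is "none". A minimal sketch of the counterpart step, assuming Elasticsearch listens on localhost:9200 as above. The helper names allocation_body and set_allocation and the ES_URL variable are made up for illustration:

```shell
# Build the settings body for a given allocation mode.
# "all" and "none" are valid values of cluster.routing.allocation.enable.
allocation_body() {
    printf '{"transient":{"cluster.routing.allocation.enable":"%s"}}' "$1"
}

# Send the setting to the cluster.
# ES_URL is an assumed override; it defaults to the address used above.
set_allocation() {
    curl -XPUT "${ES_URL:-localhost:9200}/_cluster/settings" -d "$(allocation_body "$1")"
}

# Usage, once both nodes are upgraded and have rejoined the cluster:
#   set_allocation all
```

Nothing is sent until set_allocation is called, so the functions can be defined ahead of time and run only when the cluster is ready.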

Server #2:

mkdir /root/upgrade_2.1.6
cd /root/upgrade_2.1.6
wget -O upgrade.sh https://assets.nagios.com/downloads/nagios-log-server/upgrade.sh
chmod 700 upgrade.sh
./upgrade.sh

This completed successfully.

Server #1:

mkdir /root/upgrade_2.1.6
cd /root/upgrade_2.1.6
wget -O upgrade.sh https://assets.nagios.com/downloads/nagios-log-server/upgrade.sh
./upgrade.sh
..
Kibana upgraded OK
..
Hanging forever ...

The line where the upgrade script hangs seems to be:

/usr/bin/php $proddir/www/index.php install/upgrade/$oldversion

Even though the first server was upgraded, we are not able to use it.

The system is down. Can you advise?

Best

/Jan
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: NLS upgrade failing - NLS not available

Post by cdienger »

Is the server that is hanging forever still hanging or did you exit out of it?

What version of NLS are you upgrading from?

Check that the services are running:

systemctl status elasticsearch
systemctl status logstash
systemctl status httpd    # apache2 on Debian/Ubuntu

Also check the status of the cluster:

curl 'localhost:9200/_cat/nodes?v'
curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'

Run the second command a few times to see whether the numbers are changing.
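
The repeated health check can be scripted. Below is a rough sketch, assuming the pretty-printed output has one "key" : value pair per line. health_field and watch_health are hypothetical helper names, and the default count and interval are arbitrary:

```shell
# Print the value of one field from pretty-printed cluster health JSON on stdin.
# Assumes one "key" : value pair per line, as produced by ?pretty=true.
health_field() {
    grep "\"$1\"" | sed 's/.*: *//; s/[",]//g'
}

# Poll the health endpoint COUNT times, INTERVAL seconds apart,
# printing one summary line per poll.
watch_health() {
    count=${1:-5}
    interval=${2:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        json=$(curl -s 'http://localhost:9200/_cluster/health?pretty=true')
        printf '%s status=%s nodes=%s active=%s unassigned=%s\n' \
            "$(date +%H:%M:%S)" \
            "$(printf '%s\n' "$json" | health_field status)" \
            "$(printf '%s\n' "$json" | health_field number_of_nodes)" \
            "$(printf '%s\n' "$json" | health_field active_shards)" \
            "$(printf '%s\n' "$json" | health_field unassigned_shards)"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Usage: watch_health 6 10
```

If the unassigned count keeps dropping between polls, the cluster is probably still recovering; if the numbers stay flat, it is more likely stuck.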

Please PM me a profile from both systems. It can be gathered under Admin > System > System Status > Download System Profile or from the command line with:

/usr/local/nagioslogserver/scripts/profile.sh

This will create /tmp/system-profile.tar.gz.

Note that this file can be very large and may not be able to be uploaded through the system. This is usually due to the logs in the Logstash and/or Elasticsearch directories found in it. If it is too large, please open the profile, extract these directories/files and send them separately.
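
To see which parts of the archive to pull out, the member list can be filtered first. This is only a sketch: the directory names inside the profile are not spelled out here, so matching on "logstash" and "elasticsearch" is an assumption, and list_log_members is a hypothetical helper name:

```shell
# List archive members whose path mentions logstash or elasticsearch
# (case-insensitive match; the actual layout of the profile is assumed).
list_log_members() {
    tar -tzf "$1" | grep -i -E 'logstash|elasticsearch'
}

# Usage:
#   list_log_members /tmp/system-profile.tar.gz
# Members found this way can then be extracted individually with
#   tar -xzf /tmp/system-profile.tar.gz <member path>
```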
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
jabi27

Re: NLS upgrade failing - NLS not available

Post by jabi27 »

Hi

Yes, the services are running:

root@nagios-logserver1:~# systemctl status elasticsearch | grep running
Active: active (running) since Fri 2020-07-10 12:31:44 CEST; 19h ago
root@nagios-logserver1:~# systemctl status logstash | grep running
Active: active (running) since Fri 2020-07-10 14:40:47 CEST; 17h ago
root@nagios-logserver1:~# systemctl status apache2 | grep running
Active: active (running) since Fri 2020-07-10 13:02:13 CEST; 18h ago

I do not know for sure which version we were upgrading from, but I think it was 2.1.2.

root@nagios-logserver2:~# curl 'localhost:9200/_cat/nodes?v'
..
..
Hanging...

root@nagios-logserver2:~# curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "a70328ae-b00b-42d8-a48e-8607a24bb151",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 167,
  "active_shards" : 167,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 177,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}

The upgrade was hanging after the Kibana step for around 20-30 minutes (lunch break). There was no activity from the process.

I will generate and upload the system profiles.

Thanks and Best

/Jan
jabi27

Re: NLS upgrade failing - NLS not available

Post by jabi27 »

Hi,

I forgot to mention: yes, I pressed Ctrl-C to stop the process.

Best

/Jan
jabi27

Re: NLS upgrade failing - NLS not available

Post by jabi27 »

This eventually came back:

root@nagios-logserver1:~#  curl 'localhost:9200/_cat/nodes?v'
host              ip              heap.percent ram.percent load node.role master name                                 
nagios-logserver1 195.231.242.169           99          14 1.43 d         *      4f455685-4ec4-42f4-932e-54121c3871af 
jabi27

Re: NLS upgrade failing - NLS not available

Post by jabi27 »

And now server 2 has finished:

host              ip              heap.percent ram.percent load node.role master name                                 
nagios-logserver1 195.231.242.169           99          14 2.84 d         *      4f455685-4ec4-42f4-932e-54121c3871af 

cdienger
Support Tech

Re: NLS upgrade failing - NLS not available

Post by cdienger »

Thanks for the update and data. I've taken ownership of the ticket you've opened for this case. We'll close out this thread and I will respond shortly to the ticket.