Cannot access Cluster Status page after 2.1.0 update

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Cannot access Cluster Status page after 2.1.0 update

Post by rferebee »

Good morning, I updated my Log Server environment to 2.1.0 yesterday and this morning I am unable to access the Cluster Status page. I'm getting an HTTP 500 error.

Any suggestions?
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cannot access Cluster Status page after 2.1.0 update

Post by rferebee »

I also can't access the Index Status page. Same error.
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Cannot access Cluster Status page after 2.1.0 update

Post by mbellerue »

Can you run ls -lh /var/www/html/nagioslogserver/application/views/admin/ and post the results here? I'd like to see if the permissions are good on the files for those pages.

Did the upgrade seem successful? No errors or warnings? Do you know if you were able to access these pages yesterday after the upgrade?

You might also grab a system profile if you're going to be ssh'd into the system. To generate a profile from the command line, run /usr/local/nagioslogserver/scripts/profile.sh, and that will drop a system profile in /tmp. You'll have to use a program like WinSCP to download it, though.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cannot access Cluster Status page after 2.1.0 update

Post by rferebee »

Code:

root@nagioslscc2:/root> ls -lh /var/www/html/nagioslogserver/application/views/admin/
total 320K
-rw-r--r-- 1 root root 5.0K Oct  1 11:37 activate.php
-rw-r--r-- 1 root root  11K Oct  1 11:37 audit_log.php
drwxr-xr-x 2 root root   55 Oct  1 11:37 auth_servers
-rw-r--r-- 1 root root 7.4K Oct  1 11:37 cluster.php
-rw-r--r-- 1 root root  26K Oct  1 11:37 create_user.php
-rw-r--r-- 1 root root  15K Oct  1 11:37 custom_includes.php
-rw-r--r-- 1 root root  28K Oct  1 11:37 edit_user.php
-rw-r--r-- 1 root root 7.0K Oct  1 11:37 globals.php
-rw-r--r-- 1 root root 4.9K Oct  1 11:37 home.php
-rw-r--r-- 1 root root 8.3K Oct  1 11:37 host_lists.php
-rw-r--r-- 1 root root 7.2K Oct  1 11:37 import_users_final.php
-rw-r--r-- 1 root root 2.4K Oct  1 11:37 import_users.php
-rw-r--r-- 1 root root 7.3K Oct  1 11:37 import_users_select.php
-rw-r--r-- 1 root root 3.7K Oct  1 11:37 index_status.php
-rw-r--r-- 1 root root  24K Oct  1 11:37 indices_table.php
-rw-r--r-- 1 root root  46K Oct  1 11:37 instance_status.php
-rw-r--r-- 1 root root 2.6K Oct  1 11:37 leftbar.php
-rw-r--r-- 1 root root 8.4K Oct  1 11:37 license.php
-rw-r--r-- 1 root root 7.9K Oct  1 11:37 mail.php
-rw-r--r-- 1 root root 8.9K Oct  1 11:37 ncpa.php
-rw-r--r-- 1 root root 3.0K Oct  1 11:37 proxy.php
drwxr-xr-x 2 root root   68 Oct  1 11:37 rss
-rw-r--r-- 1 root root  28K Oct  1 11:37 snapshots.php
-rw-r--r-- 1 root root 9.1K Oct  1 11:37 subsystem.php
-rw-r--r-- 1 root root 5.4K Oct  1 11:37 system_status.php
-rw-r--r-- 1 root root 4.5K Oct  1 11:37 users.php
-rw-r--r-- 1 root root    0 Oct  1 11:37 wizards.php

I verified that the permissions match on all 3 of my Log Server nodes.

The upgrade appeared to complete without issue on every server. The only problem was that the Logstash service crashed after the update on two of the three. I specifically remember looking at the Cluster Status page after the upgrade to see how many shards needed to be reloaded.
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Cannot access Cluster Status page after 2.1.0 update

Post by mbellerue »

This is the first thing that's jumping out at me.

Code:

PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to allocate 92 bytes) in /var/www/html/nagioslogserver/application/libraries/Elasticsearch.php on line 0, referer: http://nagioslogserver.state.nv.us/nagioslogserver/admin

From the profile it looks like you have plenty of actual memory available, so I'm wondering if it's running up against the Elasticsearch memory limit. Let's go ahead and try to restart Elasticsearch.

Code:

systemctl restart elasticsearch.service

Give that a shot and let me know how it goes.
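
A quick check on the number in the fatal error may help narrow this down. The wording "Allowed memory size of ... bytes exhausted" is what PHP prints when a script hits its memory_limit setting, and 134217728 bytes works out to 128 MiB, which is PHP's stock default. The sketch below only does that arithmetic and, assuming PHP reads its settings from /etc/php.ini on this system, prints the configured limit. It does not show where the memory is actually going.

```shell
# Byte count copied from the PHP fatal error quoted above.
bytes=134217728

# Convert to MiB so it can be compared with PHP's memory_limit (128M by default).
mib=$((bytes / 1024 / 1024))
echo "Limit reported in the error: ${mib}M"

# Assumption: PHP reads /etc/php.ini here. Show the active setting if the file exists.
if [ -f /etc/php.ini ]; then
    grep -E '^[[:space:]]*memory_limit[[:space:]]*=' /etc/php.ini \
        || echo "memory_limit is not set explicitly in /etc/php.ini"
fi
```

If the configured value turns out to be 128M, the limit being hit is probably the PHP one for the web interface rather than anything inside Elasticsearch itself.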
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cannot access Cluster Status page after 2.1.0 update

Post by rferebee »

I tried restarting Elasticsearch, but it didn't resolve the issue.

Out of frustration I rebooted all the servers and it's still not loading those two pages.
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cannot access Cluster Status page after 2.1.0 update

Post by rferebee »

It's strange, my snapshots don't seem to be running either.

The command subsystem screen says they are, but no snapshots are being created. I just confirmed that I can write to my repository by creating a test folder in the repository directory from one of my log servers, so I don't think it's a permissions issue.
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cannot access Cluster Status page after 2.1.0 update

Post by rferebee »

It seems like something is wrong, but I don't know where to start:

Code:

root@nagioslscc2:/var/log/elasticsearch> curl -XGET 'http://localhost:9200/_snapshot/nlsrepcc/_all?pretty'
{
  "error" : "RepositoryMissingException[[nlsrepcc] missing]",
  "status" : 404
}
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Cannot access Cluster Status page after 2.1.0 update

Post by mbellerue »

Okay, I think we're going to need to enable PHP error logging for this. In /etc/php.ini, find the log_errors entry and make sure it is set to On. Below it, add another entry called error_log and set it to something like /var/log/php-errors.log. Restart the Apache daemon with systemctl restart httpd.service. Then try accessing the pages that aren't working and see whether anything is logged to the file you specified.
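
To double-check the edit before restarting Apache, something along these lines could be used. It is only a sketch: show_php_logging is a hypothetical helper name, and the paths are the ones given in the post, so adjust them if your system differs.

```shell
# Print the active (uncommented) log_errors and error_log lines from a php.ini file.
# Hypothetical helper; usage: show_php_logging /etc/php.ini
show_php_logging() {
    grep -E '^[[:space:]]*(log_errors|error_log)[[:space:]]*=' "$1"
}

# Only look at the real file if it exists at the path given in the post.
if [ -f /etc/php.ini ]; then
    show_php_logging /etc/php.ini || echo "neither setting is active in /etc/php.ini"
fi

# After editing, the remaining steps from the post are:
#   systemctl restart httpd.service
#   tail -f /var/log/php-errors.log   # example path from the post
```

Lines starting with a semicolon are comments in php.ini, so they are deliberately not matched. If nothing shows up in the log file after reloading the broken pages, one thing worth checking is whether the web server user is able to write to that file.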
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Cannot access Cluster Status page after 2.1.0 update

Post by mbellerue »

rferebee wrote:It seems like something is wrong, but I don't know where to start:

Code:

root@nagioslscc2:/var/log/elasticsearch> curl -XGET 'http://localhost:9200/_snapshot/nlsrepcc/_all?pretty'
{
  "error" : "RepositoryMissingException[[nlsrepcc] missing]",
  "status" : 404
}

I did see this error message in the logs, but it looked like nlsrepcc was an NFS share. It doesn't seem to be mounted. Are you storing anything related to Log Server on this share?
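
For anyone wanting to check both halves of that question, here is a rough sketch. The helper names is_mounted and list_repos are made up for illustration, and it assumes Elasticsearch is listening on localhost:9200 as in the curl command quoted above.

```shell
# Return success if the given directory appears as a mount point.
# Pass the path exactly as the mount table shows it (no trailing slash).
# The optional second argument is the mount table to read; it defaults to /proc/mounts.
is_mounted() {
    awk -v p="$1" '$2 == p { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# List the snapshot repositories Elasticsearch currently has registered.
# Assumes Elasticsearch answers on localhost:9200.
list_repos() {
    curl -s -XGET 'http://localhost:9200/_snapshot/_all?pretty'
}
```

For example, `is_mounted /path/to/repo && echo mounted || echo "not mounted"`, where /path/to/repo is a placeholder for wherever the nlsrepcc repository lives (the thread does not give the actual path). Running list_repos afterwards should show whether nlsrepcc is still registered with the cluster at all.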