Login times out or errors out

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
burkm
Posts: 31
Joined: Thu Jan 21, 2016 5:10 pm

Login times out or errors out

Post by burkm »

Hello,

I'm not able to log in to my cluster; when I try to log in to either server, the request times out.
On one of the servers, the login times out and returns "The username specified does not exist." The login uses LDAP.

I'm running version 2.0.3
I've restarted elasticsearch, logstash, and httpd multiple times on both servers
There's plenty of disk space
I've increased the memory_limit in /etc/php.ini to 512MB

Running

curl -XGET 'http://localhost:9200/_cat/indices?pretty'

returns 2272 indices.
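
With that many indices, a per-health tally is easier to read than the raw listing. The following is a minimal sketch, not part of the original post: `count_by_health` is a hypothetical helper name, and it assumes the listing is requested with the `h=health,index` column selector so that the health value is the first field on each line.

```shell
#!/bin/sh
# Tally indices by health (green/yellow/red).
# Reads lines of the form "<health> <index>" on stdin.
count_by_health() {
  awk 'NF { count[$1]++ } END { for (h in count) print h, count[h] }' | sort
}

# Usage against a live node (not run here):
#   curl -s 'http://localhost:9200/_cat/indices?h=health,index' | count_by_health
```

A large "red" count in the tally points at the indices that need attention first.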

I've run the profile.sh script; shall I send in the output?

Thanks,
Michael
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Login times out or errors out

Post by npolovenko »

Hello, @burkm. Yes, please send in the profile. After you run the script it should generate a profile archive in the /tmp folder.
When did the log server become unresponsive? Check the backups in this folder:

/store/backups/nagioslogserver/

Then run the following commands to restore from a backup, replacing nagioslogserver.2017-05-10.1494373596.tar.gz with an actual backup that was taken before the issues began:

cd /usr/local/nagioslogserver/scripts/
./restore_backup.sh /store/backups/nagioslogserver/nagioslogserver.2017-05-10.1494373596.tar.gz
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Login times out or errors out

Post by npolovenko »

@burkm, I received and reviewed the profile. It's unfortunate that you don't have any backups. Have you changed the backup location manually, by chance?

Your profile shows that the cluster is in red status. Please follow this article to try to fix it:
https://support.nagios.com/kb/article.php?id=90
burkm
Posts: 31
Joined: Thu Jan 21, 2016 5:10 pm

Re: Login times out or errors out

Post by burkm »

I deleted some of the oldest indices that had red status and then left it alone for a couple of hours.
When I checked again, the status had turned to yellow, and the number of unassigned shards had dropped sharply from a peak of more than 16K.
Looks like it's slowly returning to normal on its own.

$ curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "76900ee2-f769-413c-9948-850204a96b32",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 10942,
  "active_shards" : 19574,
  "relocating_shards" : 0,
  "initializing_shards" : 2,
  "unassigned_shards" : 2308,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}

After it goes green, I'm thinking I should delete indices more than a year old to avoid this problem in the future. Would that help?
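
Before deleting anything, it can help to list which indices would fall outside the retention window. This is a rough sketch under stated assumptions: `old_indices` is a hypothetical helper, it assumes daily index names of the form logstash-YYYY.MM.DD, and it only prints candidates without deleting them.

```shell
#!/bin/sh
# Print index names whose date suffix is earlier than the cutoff (YYYY.MM.DD).
# Names that do not match logstash-YYYY.MM.DD are skipped.
old_indices() {
  awk -v cutoff="$1" '
    $1 ~ /^logstash-[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]$/ {
      day = substr($1, 10)              # strip the "logstash-" prefix
      # Zero-padded YYYY.MM.DD strings sort chronologically as text.
      if ((day "") < (cutoff "")) print $1
    }'
}

# Usage against a live node (not run here):
#   curl -s 'http://localhost:9200/_cat/indices?h=index' | old_indices 2017.06.01
```

Reviewing the printed list first keeps the deletion step deliberate rather than automatic.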
burkm
Posts: 31
Joined: Thu Jan 21, 2016 5:10 pm

Re: Login times out or errors out

Post by burkm »

Everything is back to normal now:

$ curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "76900ee2-f769-413c-9948-850204a96b32",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 3802,
  "active_shards" : 7604,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}
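
When watching a recovery like this, pulling out a single field is often enough. The helper below is a sketch, not something from the thread: `health_field` is a hypothetical name, and it assumes the pretty-printed layout shown above, with one "key" : value pair per line.

```shell
#!/bin/sh
# Read pretty-printed _cluster/health JSON on stdin and print one field's value.
# Handles both quoted strings ("green") and bare values (0, false).
health_field() {
  sed -n "s/^ *\"$1\" *: *\"\{0,1\}\([^\",]*\)\"\{0,1\},\{0,1\} *\$/\1/p"
}

# Usage against a live node (not run here):
#   curl -s 'http://localhost:9200/_cluster/health?pretty=true' | health_field status
#   curl -s 'http://localhost:9200/_cluster/health?pretty=true' | health_field unassigned_shards
```

This relies on line-oriented text matching rather than real JSON parsing, so it would not work on compact (non-pretty) output.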
I filled in values for closing and deleting indices, and it seems to have trimmed the db significantly. I'm impressed that Elasticsearch could recover fully from such a mess!

I'm guessing that what started all this was running out of memory; I noticed last week that the resident size was greater than the system's total RAM. Both servers have 48GB now, so hopefully I won't run into that problem again.

Thanks for your help!
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Login times out or errors out

Post by npolovenko »

@burkm, glad you were able to resolve this! Don't forget to make some backups of your Log Server now, so that if something breaks you can revert in a few minutes:
https://assets.nagios.com/downloads/nag ... Server.pdf

Closing the thread as resolved.