Cannot access Cluster Status page after 2.1.0 update

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Cannot access Cluster Status page after 2.1.0 update

Post by rferebee »

Shoot. I just reset the subsystem thinking maybe there was a stuck job or something.

Before I did that though, it reported the following for snapshots_maintenance. The Job Status was Waiting, the Last Run Status was SUCCESS, the Last Run Time was last night at 22:30:xx and the Next Run Time was 3 days from yesterday.

Attached is a screenshot of the audit log for type = BACKUP.

Post by rferebee »

Are new indices written directly to the Log Repository, or do they get moved over during the snapshots_maintenance process?

I'm looking at my repository and I don't see any new indices after 9/17. I think that's why my local storage is being consumed on each Log Server. The indices are not being created/moved to the repository.

Need to figure this out ASAP or I'm going to run out of storage over the weekend. Please help.
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Post by cdienger »

New indices are not written directly to the repo. They are stored locally and then moved over during maintenance.

Sometimes a bad backup will cause problems; it can be identified by a 0-byte file in the repo. Review the contents of the repo with "ls -alh /nlsrepcc".
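
If the listing is long, a small helper can pick out the 0-byte files for you. This is only a sketch: the function name find_empty_repo_files is made up for illustration, and /nlsrepcc is just the repository path used in this thread, so pass your own. It looks at regular files only, since directories on some mounts report a size of 0 and that is normal.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list zero-byte regular files under a snapshot repository.
# The default path is the repository from this thread; pass your own as $1.
find_empty_repo_files() {
    local repo_dir="${1:-/nlsrepcc}"
    # -type f : regular files only (directories may legitimately show size 0)
    # -size 0 : files that occupy zero blocks, i.e. empty files
    find "$repo_dir" -type f -size 0 -print
}

# Example call (commented out so sourcing this file runs nothing):
# find_empty_repo_files /nlsrepcc
```

No output means no empty files were found under that path.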

Do you see any curator jobs running if you run "ps aux | grep curator" ?

If you don't see anything running, tail jobs.log while you force the snapshots_maintenance job to run:

Code:

tail -f /usr/local/nagioslogserver/var/jobs.log
Let us know if this produces any errors.
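
The job can finish with SUCCESS even when individual commands inside it failed, so it may help to search the log rather than read it by eye. The sketch below assumes failures show up as a Python traceback, a line containing "Error", or a "Return:" line with a non-zero code, which is how they appear in the jobs.log output elsewhere in this thread. The function name scan_jobs_log is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print lines in jobs.log that look like failures.
# Assumes failures appear as "Traceback", "...Error", or "Return: <non-zero>".
scan_jobs_log() {
    local log_file="${1:-/usr/local/nagioslogserver/var/jobs.log}"
    # -n : prefix each match with its line number
    # -E : extended regex so the alternation works
    grep -nE 'Traceback|Error|Return: [1-9]' "$log_file"
}

# Example call (commented out so sourcing this file runs nothing):
# scan_jobs_log /usr/local/nagioslogserver/var/jobs.log
```

grep exits non-zero when nothing matches, so a clean log prints nothing. The log appears to be truncated between runs, so check it soon after the job finishes.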
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Post by rferebee »

These are the results of your suggestions.

Code:

root@nagioslscc2:/root> ls -alh /nlsrepcc
total 120K
drwxrwx---   2 nagios nagios   0 Oct  3 14:27 .
dr-xr-xr-x. 20 root   root   266 Sep 11 13:18 ..
-rwxrwx---   1 nagios nagios 203 Sep 18 08:14 index
drwxrwx---   2 nagios nagios   0 Sep 29 11:41 indices
-rwxrwx---   1 nagios nagios 504 Jan 29  2019 metadata-curator-20190129123136
-rwxrwx---   1 nagios nagios 504 Feb  2  2019 metadata-curator-20190202123046
-rwxrwx---   1 nagios nagios 524 Feb 19  2019 metadata-curator-20190219123043
-rwxrwx---   1 nagios nagios 504 Mar  1  2019 metadata-curator-20190302063044
-rwxrwx---   1 nagios nagios 522 Apr  2  2019 metadata-curator-20190402161441
-rwxrwx---   1 nagios nagios 522 Apr 21 22:30 metadata-curator-20190422053050
-rwxrwx---   1 nagios nagios 522 May 14 22:31 metadata-curator-20190515053118
-rwxrwx---   1 nagios nagios 522 Jun  2 22:31 metadata-curator-20190603053119
-rwxrwx---   1 nagios nagios 525 Jun 23 22:31 metadata-curator-20190624053125
-rwxrwx---   1 nagios nagios 522 Jul 14 22:31 metadata-curator-20190715053146
-rwxrwx---   1 nagios nagios 522 Aug  4 22:32 metadata-curator-20190805053209
-rwxrwx---   1 nagios nagios 522 Aug 14 22:32 metadata-curator-20190815053239
-rwxrwx---   1 nagios nagios 522 Aug 15 22:32 metadata-curator-20190816053244
-rwxrwx---   1 nagios nagios 522 Sep  4 22:30 metadata-curator-20190905053041
-rwxrwx---   1 nagios nagios 522 Sep 17 22:30 metadata-curator-20190918053035
-rwxrwx---   1 nagios nagios 371 Jan 30  2019 snapshot-curator-20190129123136
-rwxrwx---   1 nagios nagios 361 Feb  2  2019 snapshot-curator-20190202123046
-rwxrwx---   1 nagios nagios 364 Feb 19  2019 snapshot-curator-20190219123043
-rwxrwx---   1 nagios nagios 330 Mar  2  2019 snapshot-curator-20190302063044
-rwxrwx---   1 nagios nagios 483 Apr  2  2019 snapshot-curator-20190402161441
-rwxrwx---   1 nagios nagios 316 Apr 21 23:01 snapshot-curator-20190422053050
-rwxrwx---   1 nagios nagios 320 May 14 23:52 snapshot-curator-20190515053118
-rwxrwx---   1 nagios nagios 316 Jun  2 23:04 snapshot-curator-20190603053119
-rwxrwx---   1 nagios nagios 313 Jun 23 23:07 snapshot-curator-20190624053125
-rwxrwx---   1 nagios nagios 523 Jul 14 23:03 snapshot-curator-20190715053146
-rwxrwx---   1 nagios nagios 315 Aug  4 23:08 snapshot-curator-20190805053209
-rwxrwx---   1 nagios nagios 319 Aug 14 23:21 snapshot-curator-20190815053239
-rwxrwx---   1 nagios nagios 313 Sep  4 23:19 snapshot-curator-20190905053041
-rwxrwx---   1 nagios nagios 317 Sep 17 23:14 snapshot-curator-20190918053035
drwxrwx---   2 nagios nagios   0 Oct  3 07:14 test_folder
root@nagioslscc2:/root> ps aux | grep curator
root     14512  0.0  0.0 112712   968 pts/0    S+   12:37   0:00 grep --color=auto curator
root@nagioslscc2:/root> tail -f /usr/local/nagioslogserver/var/jobs.log
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Running command do_maintenance with args ' ' for job id: snapshots_maintenance
Traceback (most recent call last):
  File "/usr/bin/curator", line 7, in <module>
    from curator.curator import main
  File "/usr/lib/python2.7/site-packages/curator/__init__.py", line 2, in <module>
    from .api import *
  File "/usr/lib/python2.7/site-packages/curator/api/__init__.py", line 1, in <module>
    from .utils import *
  File "/usr/lib/python2.7/site-packages/curator/api/utils.py", line 2, in <module>
    import elasticsearch
  File "/usr/lib/python2.7/site-packages/elasticsearch/__init__.py", line 17, in <module>
    from .client import Elasticsearch
  File "/usr/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 5, in <module>
    from ..transport import Transport
  File "/usr/lib/python2.7/site-packages/elasticsearch/transport.py", line 5, in <module>
    from .connection import Urllib3HttpConnection
  File "/usr/lib/python2.7/site-packages/elasticsearch/connection/__init__.py", line 3, in <module>
    from .http_urllib3 import Urllib3HttpConnection
  File "/usr/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 2, in <module>
    import urllib3
  File "/usr/lib/python2.7/site-packages/urllib3/__init__.py", line 10, in <module>
    from .connectionpool import (
  File "/usr/lib/python2.7/site-packages/urllib3/connectionpool.py", line 31, in <module>
    from .connection import (
  File "/usr/lib/python2.7/site-packages/urllib3/connection.py", line 45, in <module>
    from .util.ssl_ import (
  File "/usr/lib/python2.7/site-packages/urllib3/util/__init__.py", line 4, in <module>
    from .request import make_headers
  File "/usr/lib/python2.7/site-packages/urllib3/util/request.py", line 5, in <module>
    from ..exceptions import UnrewindableBodyError
ImportError: cannot import name UnrewindableBodyError
-----
Running cmd: /usr/local/nagioslogserver/scripts/curator.sh close indices --older-than 22 --time-unit days --timestring %Y.%m.%d
Return: 1

-----
Traceback (most recent call last):
  File "/usr/bin/curator", line 7, in <module>
    from curator.curator import main
  File "/usr/lib/python2.7/site-packages/curator/__init__.py", line 2, in <module>
    from .api import *
  File "/usr/lib/python2.7/site-packages/curator/api/__init__.py", line 1, in <module>
    from .utils import *
  File "/usr/lib/python2.7/site-packages/curator/api/utils.py", line 2, in <module>
    import elasticsearch
  File "/usr/lib/python2.7/site-packages/elasticsearch/__init__.py", line 17, in <module>
    from .client import Elasticsearch
  File "/usr/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 5, in <module>
    from ..transport import Transport
  File "/usr/lib/python2.7/site-packages/elasticsearch/transport.py", line 5, in <module>
    from .connection import Urllib3HttpConnection
  File "/usr/lib/python2.7/site-packages/elasticsearch/connection/__init__.py", line 3, in <module>
    from .http_urllib3 import Urllib3HttpConnection
  File "/usr/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 2, in <module>
    import urllib3
  File "/usr/lib/python2.7/site-packages/urllib3/__init__.py", line 10, in <module>
    from .connectionpool import (
  File "/usr/lib/python2.7/site-packages/urllib3/connectionpool.py", line 31, in <module>
    from .connection import (
  File "/usr/lib/python2.7/site-packages/urllib3/connection.py", line 45, in <module>
    from .util.ssl_ import (
  File "/usr/lib/python2.7/site-packages/urllib3/util/__init__.py", line 4, in <module>
    from .request import make_headers
  File "/usr/lib/python2.7/site-packages/urllib3/util/request.py", line 5, in <module>
    from ..exceptions import UnrewindableBodyError
ImportError: cannot import name UnrewindableBodyError
-----
Running cmd: /usr/local/nagioslogserver/scripts/curator.sh delete indices --older-than 22 --time-unit days --timestring %Y.%m.%d
Return: 1

-----
Traceback (most recent call last):
  File "/usr/bin/curator", line 7, in <module>
    from curator.curator import main
  File "/usr/lib/python2.7/site-packages/curator/__init__.py", line 2, in <module>
    from .api import *
  File "/usr/lib/python2.7/site-packages/curator/api/__init__.py", line 1, in <module>
    from .utils import *
  File "/usr/lib/python2.7/site-packages/curator/api/utils.py", line 2, in <module>
    import elasticsearch
  File "/usr/lib/python2.7/site-packages/elasticsearch/__init__.py", line 17, in <module>
    from .client import Elasticsearch
  File "/usr/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 5, in <module>
    from ..transport import Transport
  File "/usr/lib/python2.7/site-packages/elasticsearch/transport.py", line 5, in <module>
    from .connection import Urllib3HttpConnection
  File "/usr/lib/python2.7/site-packages/elasticsearch/connection/__init__.py", line 3, in <module>
    from .http_urllib3 import Urllib3HttpConnection
  File "/usr/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 2, in <module>
    import urllib3
  File "/usr/lib/python2.7/site-packages/urllib3/__init__.py", line 10, in <module>
    from .connectionpool import (
  File "/usr/lib/python2.7/site-packages/urllib3/connectionpool.py", line 31, in <module>
    from .connection import (
  File "/usr/lib/python2.7/site-packages/urllib3/connection.py", line 45, in <module>
    from .util.ssl_ import (
  File "/usr/lib/python2.7/site-packages/urllib3/util/__init__.py", line 4, in <module>
    from .request import make_headers
  File "/usr/lib/python2.7/site-packages/urllib3/util/request.py", line 5, in <module>
    from ..exceptions import UnrewindableBodyError
ImportError: cannot import name UnrewindableBodyError
-----
Running cmd: /usr/local/nagioslogserver/scripts/curator.sh snapshot --repository "NLSREPCC" --ignore_unavailable indices --older-than 1 --time-unit days --timestring %Y.%m.%d
Return: 1

-----
Traceback (most recent call last):
  File "/usr/bin/curator", line 7, in <module>
    from curator.curator import main
  File "/usr/lib/python2.7/site-packages/curator/__init__.py", line 2, in <module>
    from .api import *
  File "/usr/lib/python2.7/site-packages/curator/api/__init__.py", line 1, in <module>
    from .utils import *
  File "/usr/lib/python2.7/site-packages/curator/api/utils.py", line 2, in <module>
    import elasticsearch
  File "/usr/lib/python2.7/site-packages/elasticsearch/__init__.py", line 17, in <module>
    from .client import Elasticsearch
  File "/usr/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 5, in <module>
    from ..transport import Transport
  File "/usr/lib/python2.7/site-packages/elasticsearch/transport.py", line 5, in <module>
    from .connection import Urllib3HttpConnection
  File "/usr/lib/python2.7/site-packages/elasticsearch/connection/__init__.py", line 3, in <module>
    from .http_urllib3 import Urllib3HttpConnection
  File "/usr/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 2, in <module>
    import urllib3
  File "/usr/lib/python2.7/site-packages/urllib3/__init__.py", line 10, in <module>
    from .connectionpool import (
  File "/usr/lib/python2.7/site-packages/urllib3/connectionpool.py", line 31, in <module>
    from .connection import (
  File "/usr/lib/python2.7/site-packages/urllib3/connection.py", line 45, in <module>
    from .util.ssl_ import (
  File "/usr/lib/python2.7/site-packages/urllib3/util/__init__.py", line 4, in <module>
    from .request import make_headers
  File "/usr/lib/python2.7/site-packages/urllib3/util/request.py", line 5, in <module>
    from ..exceptions import UnrewindableBodyError
ImportError: cannot import name UnrewindableBodyError
-----
Running cmd: /usr/local/nagioslogserver/scripts/curator.sh delete snapshots --older-than 731 --time-unit days --timestring %Y%m%d --repository "NLSREPCC"
Return: 1

-----
SUCCESS
Processed 0 node jobs.
Processed 1 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
I don't see any errors, but the snapshot was not created. It looks like it ended right after the job was started.

Post by cdienger »

There appears to be an issue with the urllib3 Python package that got installed. Run the following to remove it and install one that should work:

Code:

pip uninstall urllib3
yum install python-urllib3
Then run the maintenance job again while tailing the jobs.log to make sure it works.
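
Before re-running the job, a quick import test can confirm the package swap took effect, since the traceback was raised while Python was importing urllib3. This is a sketch under assumptions: check_python_import is a made-up name, and it assumes that python on your PATH is the same 2.7 interpreter that /usr/bin/curator uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a Python module imports cleanly.
# $1 = module name (default: urllib3), $2 = interpreter (default: python).
# Assumption: the interpreter given is the one curator actually runs under.
check_python_import() {
    local module="${1:-urllib3}"
    local python_bin="${2:-python}"
    if "$python_bin" -c "import $module" 2>/dev/null; then
        echo "$module: import OK"
    else
        echo "$module: import FAILED"
        return 1
    fi
}

# Example: walk the same import chain shown in the traceback.
# check_python_import urllib3 && check_python_import elasticsearch && check_python_import curator
```

If urllib3 still fails to import, leftover files from the pip install may still be present in site-packages.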

Post by rferebee »

It appears to be running now. Thank you!

Do I need to be worried about these errors?

Code:

Error deleting one or more indices.
Got a TIMEOUT response from Elasticsearch. Error message: HTTPConnectionPool (host=u'localhost', port=9200): Read timed out. (read timeout=30)
Indices failed to delete:
***List of indices***

Got a 404 response from Elasticsearch. Error message: IndexMissingException

Post by cdienger »

You likely have a lot of indices that have backed up, which is resulting in a few timeouts. I wouldn't be too worried about this now as long as things are getting deleted and moved over to the repo. If the timeouts continue into next week, we can look into it.

Post by rferebee »

Awesome, thanks very much!

Post by cdienger »

No problem. Have a good weekend!

Post by rferebee »

Good morning, it looks like the snapshot that ran on Friday completed successfully, so we're back up and running.

Thanks for your support as always.