Backup & Maintenance not working automatically

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
TEWLS
Posts: 33
Joined: Wed Dec 28, 2016 3:53 pm

Backup & Maintenance not working automatically

Post by TEWLS »

I am having an issue with automatic maintenance not working. I followed the post at https://support.nagios.com/forum/viewto ... 38&t=39929 to get a manual backup, so I know that at least works, and I can restore and delete from a created snapshot. Basically, everything works as long as a person does it by hand.

The optimize indexes doesn't seem to actually do anything, the close indexes older than does nothing, and the delete indexes option does nothing.

I have attached a screenshot of my settings.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Backup & Maintenance not working automatically

Post by mcapra »

TEWLS wrote:The optimize indexes doesn't seem to actually do anything, the close indexes older than does nothing, and the delete indexes option does nothing.
Can you share the output of the following command, run from the CLI of any Nagios Log Server machine in this cluster:

Code: Select all

curl -XGET 'http://localhost:9200/nagioslogserver/commands/_search?size=100'
Can you also run this tail:

Code: Select all

tail -f /usr/local/nagioslogserver/var/jobs.log
Then, while the tail is running, hit the "Reset All Jobs" button and let that churn through:
[Screenshot: 2017_04_03_11_21_40_Command_Subsystem_Nagios_Log_Server.png]
Then, when all the jobs have turned over to "SUCCESS", share the result of that previous tail.
Former Nagios employee
https://www.mcapra.com/
TEWLS
Posts: 33
Joined: Wed Dec 28, 2016 3:53 pm

Re: Backup & Maintenance not working automatically

Post by TEWLS »

Curl:

Code: Select all

{10}# curl -XGET 'http://localhost:9200/nagioslogserver/commands/_search?size=100'
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":11,"max_score":1.0,"hits":[{"_index":"nagioslogserver","_type":"commands","_id":"AVs0ERpnoihlocwoCTyE","_score":1.0,"_source":{"created":"2017-04-03 08:46:27","active":1,"status":"completed","type":"user","node":"760c9dfb-3788-4fd1-a55e-dd0d53d67def","command":"create_backup","run_time":1491227187,"last_run_status":"SUCCESS","last_run_time":"2017-04-03 08:46:37"}},{"_index":"nagioslogserver","_type":"commands","_id":"backup_maintenance","_score":1.0,"_source":{"created":"2017-02-21 07:45:42","created_by":"1","active":1,"status":"waiting","type":"system","node":"global","command":"do_maintenance","run_time":1491313586,"frequency":"86400","last_run_status":"SUCCESS","last_run_output":"Maintenance and Backup jobs are being executed","last_run_time":"2017-04-03 08:46:26"}},{"_index":"nagioslogserver","_type":"commands","_id":"backups","_score":1.0,"_source":{"created":"2017-02-21 07:45:42","created_by":"1","active":1,"status":"waiting","type":"system","node":"global","command":"do_backups","run_time":1491313587,"frequency":"86400","last_run_status":"SUCCESS","last_run_time":"2017-04-03 08:46:27"}},{"_index":"nagioslogserver","_type":"commands","_id":"AVs0ERpMoihlocwoCTyD","_score":1.0,"_source":{"created":"2017-04-03 08:46:27","active":1,"status":"completed","type":"user","node":"4203c74b-084b-44a1-a358-165b46bdf788","command":"create_backup","run_time":1491227187,"last_run_status":"SUCCESS","last_run_time":"2017-04-03 08:46:35"}},{"_index":"nagioslogserver","_type":"commands","_id":"AVs0ERqkoihlocwoCTyF","_score":1.0,"_source":{"created":"2017-04-03 08:46:27","active":1,"status":"completed","type":"user","node":"5d4fe840-7e2d-4d21-847a-8212998b8f63","command":"create_backup","run_time":1491227187,"last_run_status":"SUCCESS","last_run_time":"2017-04-03 
08:46:38"}},{"_index":"nagioslogserver","_type":"commands","_id":"AVrNEZmW4fUEtcHYoDgK","_score":1.0,"_source":{"created":"2017-03-14 08:46:07","active":1,"status":"running","type":"user","node":"531669aa-f711-4655-b4f5-7d887ecd64e2","command":"create_backup","run_time":1489499167}},{"_index":"nagioslogserver","_type":"commands","_id":"run_update_check","_score":1.0,"_source":{"created":"2017-02-21 07:45:42","created_by":"1","active":1,"status":"waiting","type":"system","node":"global","command":"update_check","run_time":1491313587,"frequency":"86400","last_run_status":"SUCCESS","last_run_time":"2017-04-03 08:46:30"}},{"_index":"nagioslogserver","_type":"commands","_id":"AVs0ERo-oihlocwoCTyC","_score":1.0,"_source":{"created":"2017-04-03 08:46:27","active":1,"status":"completed","type":"user","node":"531669aa-f711-4655-b4f5-7d887ecd64e2","command":"create_backup","run_time":1491227187,"last_run_status":"SUCCESS","last_run_time":"2017-04-03 08:46:37"}},{"_index":"nagioslogserver","_type":"commands","_id":"cleanup_cmdsubsys","_score":1.0,"_source":{"created":"2017-02-21 07:45:42","created_by":"1","active":1,"status":"waiting","type":"system","node":"global","command":"cleanup","run_time":1491241101,"frequency":"3600","last_run_status":"SUCCESS","last_run_time":"2017-04-03 11:38:21"}},{"_index":"nagioslogserver","_type":"commands","_id":"AVrNEZmm4fUEtcHYoDgM","_score":1.0,"_source":{"created":"2017-03-14 08:46:07","active":1,"status":"running","type":"user","node":"5d4fe840-7e2d-4d21-847a-8212998b8f63","command":"create_backup","run_time":1489499167}},{"_index":"nagioslogserver","_type":"commands","_id":"run_all_alerts","_score":1.0,"_source":{"created":"2017-02-21 07:45:42","created_by":"1","active":1,"status":"waiting","type":"system","node":"global","command":"run_alerts","run_time":1491239702,"frequency":"20","last_run_status":"SUCCESS","last_run_time":"2017-04-03 12:14:42"}}]}}
Jobs Log:

Code: Select all

tail -f /usr/local/nagioslogserver/var/jobs.log
PHP Warning:  Module 'SourceGuardian' already loaded in Unknown on line 0
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
PHP Warning:  Module 'SourceGuardian' already loaded in Unknown on line 0
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
PHP Warning:  Module 'SourceGuardian' already loaded in Unknown on line 0
Running command create_backup with args ' ' for job id: AVs00s10YI5_IFXliRhp
/usr/bin/python: No module named jsonselect
/usr/bin/python: No module named jsonselect
/usr/bin/python: No module named jsonselect
/usr/bin/python: No module named jsonselect
/usr/bin/python: No module named jsonselect
/usr/bin/python: No module named jsonselect
tar: nagioslogserver.2017-04-03.1491239886/nagioslogserver_log.tar.gz: file changed as we read it
SUCCESS
Processed 1 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
PHP Warning:  Module 'SourceGuardian' already loaded in Unknown on line 0
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
PHP Warning:  Module 'SourceGuardian' already loaded in Unknown on line 0
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
PHP Warning:  Module 'SourceGuardian' already loaded in Unknown on line 0
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Backup & Maintenance not working automatically

Post by mcapra »

I think the missing Python dependency is probably a good place to start:

Code: Select all

/usr/bin/python: No module named jsonselect
Can you install the jsonselect module via pip? If your setup is offline, you'll need to download and install the module by hand:
https://pypi.python.org/pypi/jsonselect
TEWLS
Posts: 33
Joined: Wed Dec 28, 2016 3:53 pm

Re: Backup & Maintenance not working automatically

Post by TEWLS »

It is already installed. The change we made wouldn't affect this, correct?

Code: Select all

{5}# aa /usr/lib/python2.7/site-packages/ |grep json
drwx------.  2 root root 4.0K Nov 17 10:22 jsonselect
drwx------.  2 root root 4.0K Nov 17 10:22 jsonselect-0.2.3-py2.7.egg-info

{6}# pip install jsonselect
Requirement already satisfied: jsonselect in /usr/lib/python2.7/site-packages
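One thing worth checking, given the listing above: both jsonselect directories are mode drwx------ and owned by root, so an account other than root may not be able to read them. Since the command subsystem jobs run under the nagios user (as noted later in this thread), that could explain why pip sees the module while jobs.log reports "No module named jsonselect". This is an inference from the listing, not something confirmed in the thread. A minimal sketch for testing it; the helper name `others_can_enter` is made up for illustration, and it looks only at the classic mode bits, not at ACLs:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if "other" users can list and enter DIR,
# i.e. the "other" permission triplet has both r and x (or t) set.
# Note: this inspects only the mode bits and ignores any ACL entries.
others_can_enter() {
    perms=$(ls -ld -- "$1" | cut -c8-10)   # characters 8-10 are the "other" bits
    case "$perms" in
        r?[xt]) return 0 ;;
        *)      return 1 ;;
    esac
}

# Example (path taken from the listing above):
#   others_can_enter /usr/lib/python2.7/site-packages/jsonselect \
#       || echo "jsonselect is not readable by non-root users"
#
# A more direct check, assuming sudo is available on the box:
#   sudo -u nagios python -c 'import jsonselect'
```

If the helper fails for the jsonselect directory, the import will most likely fail for the nagios user as well, and the `sudo -u nagios` line should reproduce the error from jobs.log.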
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Backup & Maintenance not working automatically

Post by mcapra »

TEWLS wrote:The change we made wouldn't affect this, correct?
Could you clarify what you mean by this?

It's also worth attempting a dry run of the various curator commands that get run on the back end. Can you share the output of:

Code: Select all

su nagios
curator --dry-run delete indices --older-than 7 --time-unit days --timestring %Y.%m.%d
curator --dry-run close indices --older-than 7 --time-unit days --timestring %Y.%m.%d
curator --dry-run optimize indices --older-than 2 --time-unit days --timestring %Y.%m.%d
exit
And if that all goes well, you can try closing indices for real:

Code: Select all

curl -XGET 'http://localhost:9200/_cat/indices/logstash-*'
su nagios
curator --debug close indices --older-than 7 --time-unit days --timestring %Y.%m.%d
exit
curl -XGET 'http://localhost:9200/_cat/indices/logstash-*'
TEWLS
Posts: 33
Joined: Wed Dec 28, 2016 3:53 pm

Re: Backup & Maintenance not working automatically

Post by TEWLS »

mcapra wrote:
TEWLS wrote:The change we made wouldn't affect this, correct?
Could you clarify what you mean by this?
We changed document.ready to window.load. Also, I am using ACLs on all of my files, matching the listing below:

Code: Select all

# file: elasticsearch/
# owner: nagios
# group: nagios
user::rwx
user:apache:rwx
user:nagios:rwx
group::---
group:apache:rwx
group:nagios:rwx
mask::rwx
other::---
default:user::rwx
default:user:apache:rwx
default:user:nagios:rwx
default:group::---
default:group:apache:rwx
default:group:nagios:rwx
default:mask::rwx
default:other::---
Dry Runs:

Code: Select all

curator --dry-run delete indices --older-than 7 --time-unit days --timestring %Y.%m.%d
2017-04-03 14:56:58,160 INFO      Job starting: delete indices
2017-04-03 14:56:58,173 INFO      Pruning Kibana-related indices to prevent accidental deletion.
2017-04-03 14:56:58,173 INFO      Action delete will be performed on the following indices: [u'logstash-2017.03.27']
2017-04-03 14:56:58,173 INFO      DRY RUN MODE.  No changes will be made.
2017-04-03 14:56:58,217 INFO      DRY RUN: delete: logstash-2017.03.27

curator --dry-run close indices --older-than 7 --time-unit days --timestring %Y.%m.%d
2017-04-03 14:56:58,360 INFO      Job starting: close indices
2017-04-03 14:56:58,372 INFO      Action close will be performed on the following indices: [u'logstash-2017.03.27']
2017-04-03 14:56:58,372 INFO      DRY RUN MODE.  No changes will be made.
2017-04-03 14:56:58,412 INFO      DRY RUN: close: logstash-2017.03.27

curator --dry-run optimize indices --older-than 2 --time-unit days --timestring %Y.%m.%d
2017-04-03 14:56:59,586 INFO      Job starting: optimize indices
2017-04-03 14:56:59,586 WARNING   Overriding default connection timeout.  New timeout: 21600
2017-04-03 14:56:59,598 INFO      Action optimize will be performed on the following indices: [u'logstash-2017.03.27', u'logstash-2017.03.28', u'logstash-2017.03.29', u'logstash-2017.03.30', u'logstash-2017.03.31', u'logstash-2017.04.01']
2017-04-03 14:56:59,598 INFO      DRY RUN MODE.  No changes will be made.
2017-04-03 14:56:59,636 INFO      DRY RUN: optimize: logstash-2017.03.27
2017-04-03 14:56:59,659 INFO      DRY RUN: optimize: logstash-2017.03.28
2017-04-03 14:56:59,685 INFO      DRY RUN: optimize: logstash-2017.03.29
2017-04-03 14:56:59,710 INFO      DRY RUN: optimize: logstash-2017.03.30
2017-04-03 14:56:59,747 INFO      DRY RUN: optimize: logstash-2017.03.31
2017-04-03 14:56:59,792 INFO      DRY RUN: optimize: logstash-2017.04.01
I can do everything manually with no problem; it's just the automatic part that fails. I see you want me to change to the user nagios and I have that user as a nologin due to STIG requirements.

Code: Select all

cat /etc/passwd |grep nagios
nagios:x:1002:1002::/home/nagios:/sbin/nologin
Before manual close:

Code: Select all

curl -XGET 'http://localhost:9200/_cat/indices/logstash-*'
green open logstash-2017.03.31 5 1 13885384 0 7.6gb   3.8gb
green open logstash-2017.03.30 5 1 12065966 0 6.1gb     3gb
green open logstash-2017.04.02 5 1 14420025 0 9.1gb   4.5gb
green open logstash-2017.04.03 5 1 12125605 0 6.7gb   3.3gb
green open logstash-2017.04.01 5 1 14495321 0 8.8gb   4.4gb
green open logstash-2017.03.28 5 1  3250958 0 1.4gb 723.7mb
green open logstash-2017.03.27 5 1  3385055 0 1.4gb   755mb
green open logstash-2017.03.29 5 1  7226281 0 3.6gb   1.8gb
Close debug request:

Code: Select all

curator --debug close indices --older-than 7 --time-unit days --timestring %Y.%m.%d
2017-04-03 15:01:30,176 DEBUG         curator.api.filter         get_date_regex:158  regex = \d{4}\.\d{2}\.\d{2}
2017-04-03 15:01:30,176 DEBUG          curator.cli.utils        filter_callback:189  REGEX = (?P<date>\d{4}\.\d{2}\.\d{2})
2017-04-03 15:01:30,176 DEBUG          curator.cli.utils        filter_callback:192  Added filter: {'pattern': '(?P<date>\\d{4}\\.\\d{2}\\.\\d{2})', 'value': 7, 'groupname': 'date', 'time_unit': 'days', 'timestring': u'%Y.%m.%d', 'method': 'older_than'}
2017-04-03 15:01:30,176 DEBUG          curator.cli.utils        filter_callback:193  New list of filters: [{'pattern': '(?P<date>\\d{4}\\.\\d{2}\\.\\d{2})', 'value': 7, 'groupname': 'date', 'time_unit': 'days', 'timestring': u'%Y.%m.%d', 'method': 'older_than'}]
2017-04-03 15:01:30,176 INFO      curator.cli.index_selection                indices:57   Job starting: close indices
2017-04-03 15:01:30,176 DEBUG     curator.cli.index_selection                indices:60   Params: {'url_prefix': u'', 'http_auth': None, 'dry_run': False, 'certificate': None, 'loglevel': u'INFO', 'logformat': u'default', 'quiet': False, 'host': u'localhost', 'timeout': 30, 'debug': True, 'use_ssl': False, 'logfile': None, 'master_only': False, 'port': 9200, 'ssl_no_validate': False}
2017-04-03 15:01:30,176 DEBUG          curator.cli.utils             get_client:112  kwargs = {'url_prefix': u'', 'http_auth': None, 'dry_run': False, 'certificate': None, 'loglevel': u'INFO', 'quiet': False, 'debug': True, 'logformat': u'default', 'timeout': 30, 'host': u'localhost', 'use_ssl': False, 'logfile': None, 'master_only': False, 'port': 9200, 'ssl_no_validate': False}
2017-04-03 15:01:30,177 DEBUG         urllib3.util.retry               from_int:191  Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0)
2017-04-03 15:01:30,178 DEBUG     urllib3.connectionpool              _new_conn:212  Starting new HTTP connection (1): localhost
2017-04-03 15:01:30,185 DEBUG     urllib3.connectionpool          _make_request:400  http://localhost:9200 "GET / HTTP/1.1" 200 387
2017-04-03 15:01:30,185 INFO               elasticsearch    log_request_success:63   GET http://localhost:9200/ [status:200 request:0.009s]
2017-04-03 15:01:30,185 DEBUG              elasticsearch    log_request_success:65   > None
2017-04-03 15:01:30,186 DEBUG              elasticsearch    log_request_success:66   < {
  "status" : 200,
  "name" : "4203c74b-084b-44a1-a358-165b46bdf788",
  "cluster_name" : "9e774f81-3725-4eb9-8072-d7aaf46bd83b",
  "version" : {
    "number" : "1.6.0",
    "build_hash" : "cdd3ac4dde4f69524ec0a14de3828cb95bbb86d0",
    "build_timestamp" : "2015-06-09T13:36:34Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

2017-04-03 15:01:30,186 DEBUG          curator.cli.utils          check_version:90   Detected Elasticsearch version 1.6.0
2017-04-03 15:01:30,186 DEBUG         urllib3.util.retry               from_int:191  Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0)
2017-04-03 15:01:30,188 DEBUG     urllib3.connectionpool          _make_request:400  http://localhost:9200 "GET /_all/_settings?expand_wildcards=open%2Cclosed HTTP/1.1" 200 2281
2017-04-03 15:01:30,188 INFO               elasticsearch    log_request_success:63   GET http://localhost:9200/_all/_settings?expand_wildcards=open%2Cclosed [status:200 request:0.002s]
2017-04-03 15:01:30,188 DEBUG              elasticsearch    log_request_success:65   > None
2017-04-03 15:01:30,188 DEBUG              elasticsearch    log_request_success:66   < {"logstash-2017.03.30":{"settings":{"index":{"refresh_interval":"5s","creation_date":"1490831956440","number_of_shards":"5","number_of_replicas":"1","version":{"created":"1060099"},"uuid":"MAn3tbk0T1C7tGqIyNgVZA"}}},"logstash-2017.04.02":{"settings":{"index":{"refresh_interval":"5s","creation_date":"1491091155452","number_of_shards":"5","number_of_replicas":"1","version":{"created":"1060099"},"uuid":"cTXGz3sTQvSDy_wmyy7vNw"}}},"kibana-int":{"settings":{"index":{"creation_date":"1486764425463","number_of_shards":"5","uuid":"vvCUBmXnRN67A5j1tOfTww","version":{"created":"1060099"},"number_of_replicas":"1"}}},"logstash-2017.03.31":{"settings":{"index":{"refresh_interval":"5s","creation_date":"1490918356157","number_of_shards":"5","number_of_replicas":"1","version":{"created":"1060099"},"uuid":"Q-m0p_C4RtqtW1sMP55r0w"}}},"logstash-2017.03.29":{"settings":{"index":{"refresh_interval":"5s","creation_date":"1490745600190","number_of_shards":"5","number_of_replicas":"1","version":{"created":"1060099"},"uuid":"dil1ZGLiRSOf_yicud5wuw"}}},"nagioslogserver_log":{"settings":{"index":{"creation_date":"1486764417796","number_of_shards":"5","uuid":"gjJu-RFCT9yXc_kXGUvuog","version":{"created":"1060099"},"number_of_replicas":"1"}}},"nagioslogserver":{"settings":{"index":{"creation_date":"1486764417262","number_of_shards":"5","uuid":"O_LokZGJTca1oFSJnt6JJw","version":{"created":"1060099"},"number_of_replicas":"1"}}},"logstash-2017.03.28":{"settings":{"index":{"refresh_interval":"5s","creation_date":"1490659200705","number_of_shards":"5","number_of_replicas":"1","version":{"created":"1060099"},"uuid":"OqPF-j7-TQKHKz3OZxNpXw"}}},"logstash-2017.04.03":{"settings":{"index":{"refresh_interval":"5s","creation_date":"1491177553938","number_of_shards":"5","number_of_replicas":"1","version":{"created":"1060099"},"uuid":"P04kUAblRvijWeJ2VMepAA"}}},"logstash-2017.03.27":{"settings":{"index":{"refresh_interval"
:"5s","creation_date":"1490572800127","number_of_shards":"5","number_of_replicas":"1","version":{"created":"1060099"},"uuid":"e96JYa4DR9ubBf4tNAM1Vw"}}},"logstash-2017.04.01":{"settings":{"index":{"refresh_interval":"5s","creation_date":"1491004756493","number_of_shards":"5","number_of_replicas":"1","version":{"created":"1060099"},"uuid":"YMc7da0TSt-W-YeHZHTkPQ"}}}}
2017-04-03 15:01:30,189 DEBUG          curator.api.utils            get_indices:28   All indices: [u'logstash-2017.03.31', u'logstash-2017.03.30', u'nagioslogserver', u'logstash-2017.03.28', u'logstash-2017.04.02', u'logstash-2017.04.01', u'logstash-2017.03.27', u'logstash-2017.04.03', u'logstash-2017.03.29', u'nagioslogserver_log', u'kibana-int']
2017-04-03 15:01:30,189 DEBUG     curator.cli.index_selection                indices:76   Full list of indices: [u'logstash-2017.03.31', u'logstash-2017.03.30', u'nagioslogserver', u'logstash-2017.03.28', u'logstash-2017.04.02', u'logstash-2017.04.01', u'logstash-2017.03.27', u'logstash-2017.04.03', u'logstash-2017.03.29', u'nagioslogserver_log', u'kibana-int']
2017-04-03 15:01:30,189 DEBUG     curator.cli.index_selection                indices:98   All filters: [{'pattern': '(?P<date>\\d{4}\\.\\d{2}\\.\\d{2})', 'value': 7, 'groupname': 'date', 'time_unit': 'days', 'timestring': u'%Y.%m.%d', 'method': 'older_than'}]
2017-04-03 15:01:30,189 DEBUG     curator.cli.index_selection                indices:103  Filter: {'pattern': '(?P<date>\\d{4}\\.\\d{2}\\.\\d{2})', 'value': 7, 'groupname': 'date', 'time_unit': 'days', 'timestring': u'%Y.%m.%d', 'method': 'older_than'}
2017-04-03 15:01:30,192 DEBUG         curator.api.filter        timestamp_check:301  Timestamp "2017.03.31" is outside the cutoff period (older than 7 days).
2017-04-03 15:01:30,192 DEBUG         curator.api.filter        timestamp_check:301  Timestamp "2017.03.30" is outside the cutoff period (older than 7 days).
2017-04-03 15:01:30,192 DEBUG         curator.api.filter        timestamp_check:301  Timestamp "2017.03.28" is outside the cutoff period (older than 7 days).
2017-04-03 15:01:30,192 DEBUG         curator.api.filter        timestamp_check:301  Timestamp "2017.04.02" is outside the cutoff period (older than 7 days).
2017-04-03 15:01:30,192 DEBUG         curator.api.filter        timestamp_check:301  Timestamp "2017.04.01" is outside the cutoff period (older than 7 days).
2017-04-03 15:01:30,192 DEBUG         curator.api.filter        timestamp_check:301  Timestamp "2017.04.03" is outside the cutoff period (older than 7 days).
2017-04-03 15:01:30,192 DEBUG         curator.api.filter        timestamp_check:301  Timestamp "2017.03.29" is outside the cutoff period (older than 7 days).
2017-04-03 15:01:30,192 INFO      curator.cli.index_selection                indices:144  Action close will be performed on the following indices: [u'logstash-2017.03.27']
2017-04-03 15:01:30,193 DEBUG         urllib3.util.retry               from_int:191  Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0)
2017-04-03 15:01:30,193 DEBUG     urllib3.connectionpool          _make_request:400  http://localhost:9200 "GET / HTTP/1.1" 200 387
2017-04-03 15:01:30,193 INFO               elasticsearch    log_request_success:63   GET http://localhost:9200/ [status:200 request:0.001s]
2017-04-03 15:01:30,193 DEBUG              elasticsearch    log_request_success:65   > None
2017-04-03 15:01:30,193 DEBUG              elasticsearch    log_request_success:66   < {
  "status" : 200,
  "name" : "4203c74b-084b-44a1-a358-165b46bdf788",
  "cluster_name" : "9e774f81-3725-4eb9-8072-d7aaf46bd83b",
  "version" : {
    "number" : "1.6.0",
    "build_hash" : "cdd3ac4dde4f69524ec0a14de3828cb95bbb86d0",
    "build_timestamp" : "2015-06-09T13:36:34Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

2017-04-03 15:01:30,194 DEBUG         urllib3.util.retry               from_int:191  Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0)
2017-04-03 15:01:30,194 DEBUG     urllib3.connectionpool          _make_request:400  http://localhost:9200 "GET / HTTP/1.1" 200 387
2017-04-03 15:01:30,195 INFO               elasticsearch    log_request_success:63   GET http://localhost:9200/ [status:200 request:0.001s]
2017-04-03 15:01:30,195 DEBUG              elasticsearch    log_request_success:65   > None
2017-04-03 15:01:30,195 DEBUG              elasticsearch    log_request_success:66   < {
  "status" : 200,
  "name" : "4203c74b-084b-44a1-a358-165b46bdf788",
  "cluster_name" : "9e774f81-3725-4eb9-8072-d7aaf46bd83b",
  "version" : {
    "number" : "1.6.0",
    "build_hash" : "cdd3ac4dde4f69524ec0a14de3828cb95bbb86d0",
    "build_timestamp" : "2015-06-09T13:36:34Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

2017-04-03 15:01:30,195 DEBUG         urllib3.util.retry               from_int:191  Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0)
2017-04-03 15:01:30,254 DEBUG     urllib3.connectionpool          _make_request:400  http://localhost:9200 "GET /_cat/indices/logstash-2017.03.27?h=status&format=json HTTP/1.1" 200 19
2017-04-03 15:01:30,254 INFO               elasticsearch    log_request_success:63   GET http://localhost:9200/_cat/indices/logstash-2017.03.27?h=status&format=json [status:200 request:0.059s]
2017-04-03 15:01:30,254 DEBUG              elasticsearch    log_request_success:65   > None
2017-04-03 15:01:30,254 DEBUG              elasticsearch    log_request_success:66   < [{"status":"open"}]
2017-04-03 15:01:30,255 DEBUG          curator.api.utils   prune_open_or_closed:328  Including index logstash-2017.03.27: Opened.
2017-04-03 15:01:30,255 DEBUG         urllib3.util.retry               from_int:191  Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0)
2017-04-03 15:01:30,255 DEBUG     urllib3.connectionpool          _make_request:400  http://localhost:9200 "GET / HTTP/1.1" 200 387
2017-04-03 15:01:30,256 INFO               elasticsearch    log_request_success:63   GET http://localhost:9200/ [status:200 request:0.001s]
2017-04-03 15:01:30,256 DEBUG              elasticsearch    log_request_success:65   > None
2017-04-03 15:01:30,256 DEBUG              elasticsearch    log_request_success:66   < {
  "status" : 200,
  "name" : "4203c74b-084b-44a1-a358-165b46bdf788",
  "cluster_name" : "9e774f81-3725-4eb9-8072-d7aaf46bd83b",
  "version" : {
    "number" : "1.6.0",
    "build_hash" : "cdd3ac4dde4f69524ec0a14de3828cb95bbb86d0",
    "build_timestamp" : "2015-06-09T13:36:34Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

2017-04-03 15:01:30,256 DEBUG         urllib3.util.retry               from_int:191  Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0)
2017-04-03 15:01:30,397 DEBUG     urllib3.connectionpool          _make_request:400  http://localhost:9200 "POST /logstash-2017.03.27/_flush/synced HTTP/1.1" 200 113
2017-04-03 15:01:30,398 INFO               elasticsearch    log_request_success:63   POST http://localhost:9200/logstash-2017.03.27/_flush/synced [status:200 request:0.142s]
2017-04-03 15:01:30,398 DEBUG              elasticsearch    log_request_success:65   > None
2017-04-03 15:01:30,398 DEBUG              elasticsearch    log_request_success:66   < {"_shards":{"total":10,"successful":10,"failed":0},"logstash-2017.03.27":{"total":10,"successful":10,"failed":0}}
2017-04-03 15:01:30,398 INFO            curator.api.seal           seal_indices:49   Provided indices successfully sealed. (Shown with --debug flag enabled.)
2017-04-03 15:01:30,398 DEBUG           curator.api.seal           seal_indices:51   Successfully sealed indices: [u'logstash-2017.03.27']
2017-04-03 15:01:30,399 DEBUG         urllib3.util.retry               from_int:191  Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0)
2017-04-03 15:01:30,777 DEBUG     urllib3.connectionpool          _make_request:400  http://localhost:9200 "POST /logstash-2017.03.27/_close?ignore_unavailable=true HTTP/1.1" 200 21
2017-04-03 15:01:30,777 INFO               elasticsearch    log_request_success:63   POST http://localhost:9200/logstash-2017.03.27/_close?ignore_unavailable=true [status:200 request:0.378s]
2017-04-03 15:01:30,777 DEBUG              elasticsearch    log_request_success:65   > None
2017-04-03 15:01:30,777 DEBUG              elasticsearch    log_request_success:66   < {"acknowledged":true}
2017-04-03 15:01:30,777 INFO           curator.cli.utils               exit_msg:67   Job completed successfully.
After close:

Code: Select all

curl -XGET 'http://localhost:9200/_cat/indices/logstash-*'
green open logstash-2017.03.31 5 1 13885384 0 7.6gb   3.8gb
green open logstash-2017.03.30 5 1 12065966 0 6.1gb     3gb
green open logstash-2017.04.02 5 1 14420025 0 9.1gb   4.5gb
green open logstash-2017.04.03 5 1 12139530 0 6.7gb   3.3gb
green open logstash-2017.04.01 5 1 14495321 0 8.8gb   4.4gb
green open logstash-2017.03.28 5 1  3250958 0 1.4gb 723.7mb
green open logstash-2017.03.29 5 1  7226281 0 3.6gb   1.8gb
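Comparing the two listings by eye, logstash-2017.03.27 no longer appears after the manual close. A small sketch of doing that comparison mechanically, assuming the two `_cat/indices` outputs were saved to files; the function name `indices_dropped` and the file names are made up for illustration:

```shell
#!/bin/sh
# Print index names that appear in BEFORE but not in AFTER.
# Both files hold `_cat/indices` output, where column 3 is the index name.
# Usage: indices_dropped BEFORE AFTER
indices_dropped() {
    awk 'FILENAME == ARGV[1] { seen[$3] = 1; next }   # AFTER file: remember names
         !($3 in seen)       { print $3 }             # BEFORE file: print missing names
    ' "$2" "$1"
}

# Usage, with hypothetical file names:
#   curl -s -XGET 'http://localhost:9200/_cat/indices/logstash-*' > before.txt
#   ... run the curator close command ...
#   curl -s -XGET 'http://localhost:9200/_cat/indices/logstash-*' > after.txt
#   indices_dropped before.txt after.txt
```

Run against the two listings in this post, it should print only logstash-2017.03.27.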
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Backup & Maintenance not working automatically

Post by mcapra »

TEWLS wrote:We changed document.ready to window.load.
I don't believe this would impact the command subsystem on the back-end. There's no jQuery happening anywhere in that business.
TEWLS wrote:I see you want me to change to the user nagios and I have that user as a nologin due to STIG requirements.
This is likely the root cause. The cron jobs for the command subsystem run under the nagios user, and the backup_maintenance job runs the curator commands above through PHP exec calls. If PHP doesn't have a shell to execute the curator commands from, they'll probably fail. The job status still shows "SUCCESS" in the GUI because, as far as the PHP is concerned, the command technically ran correctly (this is changing in a future version).
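To check this on a given node, you can look at the login shell recorded for the nagios user and then try one of the dry runs with an explicit shell. A minimal sketch; the helper name `login_shell_of` is hypothetical, and the `su -s` form assumes you are root and that /bin/bash exists on the box:

```shell
#!/bin/sh
# Hypothetical helper: print the login shell (7th field) recorded for USER
# in a passwd-format file (defaults to /etc/passwd).
# Usage: login_shell_of USER [PASSWD_FILE]
login_shell_of() {
    awk -F: -v user="$1" '$1 == user { print $7 }' "${2:-/etc/passwd}"
}

# Example:
#   login_shell_of nagios
#
# Running one of the dry runs as nagios despite the nologin shell
# (as root; -s overrides the shell listed in /etc/passwd for this call only):
#   su -s /bin/bash nagios -c 'curator --dry-run close indices --older-than 7 --time-unit days --timestring %Y.%m.%d'
```

If the dry run works with an explicit shell but the scheduled jobs still do nothing, that points at the missing shell rather than at curator itself.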
TEWLS
Posts: 33
Joined: Wed Dec 28, 2016 3:53 pm

Re: Backup & Maintenance not working automatically

Post by TEWLS »

So essentially I will have to either give the nagios user a login shell or run these on my own once a day. Are there any issues with doing them as a cronjob I set myself from one of the units?

Code: Select all

curator optimize indices --older-than 2 --time-unit days --timestring %Y.%m.%d
curator close indices --older-than 7 --time-unit days --timestring %Y.%m.%d
curator delete indices --older-than 7 --time-unit days --timestring %Y.%m.%d
curator snapshot --repository nfs_nls_backups indices --older-than 1 --time-unit days --timestring %Y.%m.%d
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Backup & Maintenance not working automatically

Post by mcapra »

TEWLS wrote:Are there any issues with doing them as a cronjob I set myself from one of the units?
You'll need to be mindful of re-creating those cron jobs every time you upgrade these systems. Also, make sure you don't lose the node that has these curator commands configured, or no backup or maintenance jobs will run at all. That's about it.
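If you go the self-managed cron route, one possible shape is a small wrapper script that cron calls once a day. This is only a sketch: the script path, the schedule, and the `CURATOR` override are assumptions, while the curator arguments are copied from the post above. Keeping the commands in a script also avoids a crontab pitfall: a literal `%` in a crontab command line is treated specially unless it is escaped as `\%`, and the `--timestring` value is full of them.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /usr/local/bin/nls_maintenance.sh
# (the path is an assumption). CURATOR can be overridden, e.g. CURATOR=echo,
# to see what would be run without touching the cluster.
CURATOR="${CURATOR:-curator}"
TS='%Y.%m.%d'

run_maintenance() {
    # Same four commands, in the same order, as in the post above.
    # Each step runs even if an earlier one failed; the function
    # returns non-zero if any step failed.
    rc=0
    "$CURATOR" optimize indices --older-than 2 --time-unit days --timestring "$TS" || rc=1
    "$CURATOR" close indices --older-than 7 --time-unit days --timestring "$TS" || rc=1
    "$CURATOR" delete indices --older-than 7 --time-unit days --timestring "$TS" || rc=1
    "$CURATOR" snapshot --repository nfs_nls_backups indices --older-than 1 --time-unit days --timestring "$TS" || rc=1
    return "$rc"
}

# Only act when called with "run", so sourcing the file has no side effects.
if [ "${1:-}" = "run" ]; then
    run_maintenance
fi

# Example crontab entry (the schedule is an assumption):
#   30 2 * * * /usr/local/bin/nls_maintenance.sh run
```

Because the script lives outside the Nagios Log Server install, it is one more thing to re-check after each upgrade, and it only runs on the node where it is installed.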