Nagios Logging - System Jobs under Command Subsystem

Posted: Thu Jul 09, 2020 9:34 pm
by nguyenhung12345
Hi team,

Can you please advise why all System Jobs under the Command Subsystem are in "Waiting" status? How can I troubleshoot this further and find the root cause?

Furthermore, I am working on a project to pull all snapshots (in the mounted Snapshot_Repository) to a local Windows machine. Can this be done automatically? Could you also help with that, please?

Re: Nagios Logging - System Jobs under Command Subsystem

Posted: Fri Jul 10, 2020 1:12 pm
by cdienger
The waiting status is the usual state of things when they are not running.

Please open a new thread for each separate issue. It makes it easier to troubleshoot and help if we're not trying to address multiple issues in a single thread. That said, here are a few questions that you can answer in the new thread:

Is the Snapshot_Repository currently mounted and saving snapshots?
Is the repository on a Linux machine other than the NLS machine?
Is there a reason you don't just mount a Windows share and save the snapshots to a Windows machine?

Re: Nagios Logging - System Jobs under Command Subsystem

Posted: Sun Jul 12, 2020 8:34 pm
by nguyenhung12345
Yeah, I will open a new thread about pulling all snapshots.
For the System Jobs issue, can you support it from here? Here are my answers to each question:

Is the Snapshot_Repository currently mounted and saving snapshots?
Yes, all nodes in the cluster mount it at the same location over NFS: /mnt/snapshot_repo

Is the repository on a Linux machine other than the NLS machine?
The original repo is on a Linux machine.

Re: Nagios Logging - System Jobs under Command Subsystem

Posted: Mon Jul 13, 2020 2:34 am
by nguyenhung12345
Hey,

I found this while tailing the live logs, after resetting the job and then running the "snapshots_maintenance" job manually.

Is this related to a previous thread that I found (https://support.nagios.com/forum/viewto ... NG#p298281)?

Running command do_maintenance with args ' ' for job id: snapshots_maintenance
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Running command do_maintenance with args ' ' for job id: snapshots_maintenance
Traceback (most recent call last):
  File "/usr/local/bin/curator", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/curator/curator.py", line 5, in main
    cli( obj={ "filters": [] } )
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/curator/cli/index_selection.py", line 167, in indices
    retval = do_command(client, ctx.parent.info_name, working_list, ctx.parent.params, master_timeout)
  File "/usr/local/lib/python2.7/dist-packages/curator/cli/utils.py", line 235, in do_command
    delay=params['delay'], request_timeout=params['request_timeout']
  File "/usr/local/lib/python2.7/dist-packages/curator/api/optimize.py", line 50, in optimize
    request_timeout=request_timeout
  File "/usr/local/lib/python2.7/dist-packages/curator/api/optimize.py", line 19, in optimize_index
    if optimized(client, index_name, max_num_segments):
  File "/usr/local/lib/python2.7/dist-packages/curator/api/utils.py", line 168, in optimized
    shards, segmentcount = get_segmentcount(client, index_name)
  File "/usr/local/lib/python2.7/dist-packages/curator/api/utils.py", line 119, in get_segmentcount
    shards = client.indices.segments(index=index_name)['indices'][index_name]['shards']
KeyError: u'logstash-2020.07.04'
-----
Running cmd: /usr/local/nagioslogserver/scripts/curator.sh optimize indices --older-than 2 --time-unit days --timestring %Y.%m.%d
Return: 1
2020-07-13 07:29:05,687 INFO Job starting: optimize indices
2020-07-13 07:29:05,687 WARNING Overriding default connection timeout. New timeout: 21600
2020-07-13 07:29:05,803 INFO Action optimize will be performed on the following indices: [u'logstash-2020.03.09', u'logstash-2020.03.11', u'logstash-2020.03.12', u'logstash-2020.03.13', u'logstash-2020.03.14', u'logstash-2020.03.15', u'logstash-2020.03.16', u'logstash-2020.03.17', u'logstash-2020.03.18', u'logstash-2020.03.19', u'logstash-2020.03.20', u'logstash-2020.03.21', u'logstash-2020.03.22', u'logstash-2020.03.23', u'logstash-2020.03.24', u'logstash-2020.03.25', u'logstash-2020.03.26', u'logstash-2020.03.27', u'logstash-2020.03.28', u'logstash-2020.03.29', u'logstash-2020.03.30', u'logstash-2020.03.31', u'logstash-2020.04.01', u'logstash-2020.04.02', u'logstash-2020.04.03', u'logstash-2020.04.04', u'logstash-2020.04.05', u'logstash-2020.04.06', u'logstash-2020.04.07', u'logstash-2020.04.08', u'logstash-2020.04.09', u'logstash-2020.04.10', u'logstash-2020.04.11', u'logstash-2020.04.12', u'logstash-2020.04.13', u'logstash-2020.04.14', u'logstash-2020.04.15', u'logstash-2020.04.16', u'logstash-2020.04.17', u'logstash-2020.04.18', u'logstash-2020.04.19', u'logstash-2020.04.20', u'logstash-2020.04.21', u'logstash-2020.04.22', u'logstash-2020.04.23', u'logstash-2020.04.24', u'logstash-2020.04.25', u'logstash-2020.04.26', u'logstash-2020.04.27', u'logstash-2020.04.28', u'logstash-2020.04.29', u'logstash-2020.04.30', u'logstash-2020.05.01', u'logstash-2020.05.02', u'logstash-2020.05.03', u'logstash-2020.05.04', u'logstash-2020.05.05', u'logstash-2020.05.06', u'logstash-2020.05.07', u'logstash-2020.05.08', u'logstash-2020.05.09', u'logstash-2020.05.10', u'logstash-2020.05.11', u'logstash-2020.05.12', u'logstash-2020.05.13', u'logstash-2020.05.14', u'logstash-2020.05.15', u'logstash-2020.05.16', u'logstash-2020.05.17', u'logstash-2020.05.18', u'logstash-2020.05.19', u'logstash-2020.05.20', u'logstash-2020.05.21', u'logstash-2020.05.22', u'logstash-2020.05.23', u'logstash-2020.05.24', u'logstash-2020.05.25', u'logstash-2020.05.26', u'logstash-2020.05.27', 
u'logstash-2020.05.28', u'logstash-2020.05.29', u'logstash-2020.05.30', u'logstash-2020.05.31', u'logstash-2020.06.01', u'logstash-2020.06.02', u'logstash-2020.06.03', u'logstash-2020.06.04', u'logstash-2020.06.05', u'logstash-2020.06.06', u'logstash-2020.06.07', u'logstash-2020.06.08', u'logstash-2020.06.09', u'logstash-2020.06.10', u'logstash-2020.06.11', u'logstash-2020.06.12', u'logstash-2020.06.13', u'logstash-2020.06.14', u'logstash-2020.06.15', u'logstash-2020.06.16', u'logstash-2020.06.17', u'logstash-2020.06.18', u'logstash-2020.06.19', u'logstash-2020.06.20', u'logstash-2020.06.21', u'logstash-2020.06.22', u'logstash-2020.06.23', u'logstash-2020.06.24', u'logstash-2020.06.25', u'logstash-2020.06.26', u'logstash-2020.06.27', u'logstash-2020.06.28', u'logstash-2020.06.29', u'logstash-2020.06.30', u'logstash-2020.07.01', u'logstash-2020.07.02', u'logstash-2020.07.03', u'logstash-2020.07.04', u'logstash-2020.07.05', u'logstash-2020.07.06', u'logstash-2020.07.07', u'logstash-2020.07.08', u'logstash-2020.07.09', u'logstash-2020.07.10', u'logstash-2020.07.11']
-----
-----
Running cmd: /usr/local/nagioslogserver/scripts/curator.sh close indices --older-than 30 --time-unit days --timestring %Y.%m.%d
Return: 0
2020-07-13 07:30:04,201 INFO Job starting: close indices
2020-07-13 07:30:04,316 INFO Action close will be performed on the following indices: [u'logstash-2020.03.09', u'logstash-2020.03.11', u'logstash-2020.03.12', u'logstash-2020.03.13', u'logstash-2020.03.14', u'logstash-2020.03.15', u'logstash-2020.03.16', u'logstash-2020.03.17', u'logstash-2020.03.18', u'logstash-2020.03.19', u'logstash-2020.03.20', u'logstash-2020.03.21', u'logstash-2020.03.22', u'logstash-2020.03.23', u'logstash-2020.03.24', u'logstash-2020.03.25', u'logstash-2020.03.26', u'logstash-2020.03.27', u'logstash-2020.03.28', u'logstash-2020.03.29', u'logstash-2020.03.30', u'logstash-2020.03.31', u'logstash-2020.04.01', u'logstash-2020.04.02', u'logstash-2020.04.03', u'logstash-2020.04.04', u'logstash-2020.04.05', u'logstash-2020.04.06', u'logstash-2020.04.07', u'logstash-2020.04.08', u'logstash-2020.04.09', u'logstash-2020.04.10', u'logstash-2020.04.11', u'logstash-2020.04.12', u'logstash-2020.04.13', u'logstash-2020.04.14', u'logstash-2020.04.15', u'logstash-2020.04.16', u'logstash-2020.04.17', u'logstash-2020.04.18', u'logstash-2020.04.19', u'logstash-2020.04.20', u'logstash-2020.04.21', u'logstash-2020.04.22', u'logstash-2020.04.23', u'logstash-2020.04.24', u'logstash-2020.04.25', u'logstash-2020.04.26', u'logstash-2020.04.27', u'logstash-2020.04.28', u'logstash-2020.04.29', u'logstash-2020.04.30', u'logstash-2020.05.01', u'logstash-2020.05.02', u'logstash-2020.05.03', u'logstash-2020.05.04', u'logstash-2020.05.05', u'logstash-2020.05.06', u'logstash-2020.05.07', u'logstash-2020.05.08', u'logstash-2020.05.09', u'logstash-2020.05.10', u'logstash-2020.05.11', u'logstash-2020.05.12', u'logstash-2020.05.13', u'logstash-2020.05.14', u'logstash-2020.05.15', u'logstash-2020.05.16', u'logstash-2020.05.17', u'logstash-2020.05.18', u'logstash-2020.05.19', u'logstash-2020.05.20', u'logstash-2020.05.21', u'logstash-2020.05.22', u'logstash-2020.05.23', u'logstash-2020.05.24', u'logstash-2020.05.25', u'logstash-2020.05.26', u'logstash-2020.05.27', 
u'logstash-2020.05.28', u'logstash-2020.05.29', u'logstash-2020.05.30', u'logstash-2020.05.31', u'logstash-2020.06.01', u'logstash-2020.06.02', u'logstash-2020.06.03', u'logstash-2020.06.04', u'logstash-2020.06.05', u'logstash-2020.06.06', u'logstash-2020.06.07', u'logstash-2020.06.08', u'logstash-2020.06.09', u'logstash-2020.06.10', u'logstash-2020.06.11', u'logstash-2020.06.12', u'logstash-2020.06.13']
2020-07-13 07:30:58,068 WARNING No indices to seal.
2020-07-13 07:30:58,177 INFO Job completed successfully.
-----
-----
Running cmd: /usr/local/nagioslogserver/scripts/curator.sh snapshot --repository "Snapshot Repository (to Windows)" --ignore_unavailable indices --older-than 1 --time-unit days --timestring %Y.%m.%d
Return: 1
2020-07-13 07:30:59,386 INFO Job starting: snapshot indices
2020-07-13 07:30:59,387 WARNING Overriding default connection timeout. New timeout: 21600
2020-07-13 07:30:59,503 INFO Action snapshot will be performed on the following indices: [u'logstash-2020.03.09', u'logstash-2020.03.11', u'logstash-2020.03.12', u'logstash-2020.03.13', u'logstash-2020.03.14', u'logstash-2020.03.15', u'logstash-2020.03.16', u'logstash-2020.03.17', u'logstash-2020.03.18', u'logstash-2020.03.19', u'logstash-2020.03.20', u'logstash-2020.03.21', u'logstash-2020.03.22', u'logstash-2020.03.23', u'logstash-2020.03.24', u'logstash-2020.03.25', u'logstash-2020.03.26', u'logstash-2020.03.27', u'logstash-2020.03.28', u'logstash-2020.03.29', u'logstash-2020.03.30', u'logstash-2020.03.31', u'logstash-2020.04.01', u'logstash-2020.04.02', u'logstash-2020.04.03', u'logstash-2020.04.04', u'logstash-2020.04.05', u'logstash-2020.04.06', u'logstash-2020.04.07', u'logstash-2020.04.08', u'logstash-2020.04.09', u'logstash-2020.04.10', u'logstash-2020.04.11', u'logstash-2020.04.12', u'logstash-2020.04.13', u'logstash-2020.04.14', u'logstash-2020.04.15', u'logstash-2020.04.16', u'logstash-2020.04.17', u'logstash-2020.04.18', u'logstash-2020.04.19', u'logstash-2020.04.20', u'logstash-2020.04.21', u'logstash-2020.04.22', u'logstash-2020.04.23', u'logstash-2020.04.24', u'logstash-2020.04.25', u'logstash-2020.04.26', u'logstash-2020.04.27', u'logstash-2020.04.28', u'logstash-2020.04.29', u'logstash-2020.04.30', u'logstash-2020.05.01', u'logstash-2020.05.02', u'logstash-2020.05.03', u'logstash-2020.05.04', u'logstash-2020.05.05', u'logstash-2020.05.06', u'logstash-2020.05.07', u'logstash-2020.05.08', u'logstash-2020.05.09', u'logstash-2020.05.10', u'logstash-2020.05.11', u'logstash-2020.05.12', u'logstash-2020.05.13', u'logstash-2020.05.14', u'logstash-2020.05.15', u'logstash-2020.05.16', u'logstash-2020.05.17', u'logstash-2020.05.18', u'logstash-2020.05.19', u'logstash-2020.05.20', u'logstash-2020.05.21', u'logstash-2020.05.22', u'logstash-2020.05.23', u'logstash-2020.05.24', u'logstash-2020.05.25', u'logstash-2020.05.26', u'logstash-2020.05.27', 
u'logstash-2020.05.28', u'logstash-2020.05.29', u'logstash-2020.05.30', u'logstash-2020.05.31', u'logstash-2020.06.01', u'logstash-2020.06.02', u'logstash-2020.06.03', u'logstash-2020.06.04', u'logstash-2020.06.05', u'logstash-2020.06.06', u'logstash-2020.06.07', u'logstash-2020.06.08', u'logstash-2020.06.09', u'logstash-2020.06.10', u'logstash-2020.06.11', u'logstash-2020.06.12', u'logstash-2020.06.13', u'logstash-2020.06.14', u'logstash-2020.06.15', u'logstash-2020.06.16', u'logstash-2020.06.17', u'logstash-2020.06.18', u'logstash-2020.06.19', u'logstash-2020.06.20', u'logstash-2020.06.21', u'logstash-2020.06.22', u'logstash-2020.06.23', u'logstash-2020.06.24', u'logstash-2020.06.25', u'logstash-2020.06.26', u'logstash-2020.06.27', u'logstash-2020.06.28', u'logstash-2020.06.29', u'logstash-2020.06.30', u'logstash-2020.07.01', u'logstash-2020.07.02', u'logstash-2020.07.03', u'logstash-2020.07.04', u'logstash-2020.07.05', u'logstash-2020.07.06', u'logstash-2020.07.07', u'logstash-2020.07.08', u'logstash-2020.07.09', u'logstash-2020.07.10', u'logstash-2020.07.11', u'logstash-2020.07.12']
2020-07-13 07:30:59,609 ERROR Snapshot already in progress: [{u'stats': {u'number_of_files': 0, u'total_size_in_bytes': 0, u'time_in_millis': 0, u'processed_files': 0, u'processed_size_in_bytes': 0, u'start_time_in_millis': 0}, u'repository': u'Snapshot Repository (to Windows)', u'shards_stats': {u'started': 0, u'failed': 0, u'done': 0, u'finalizing': 0, u'initializing': 0, u'total': 0}, u'state': u'INIT', u'snapshot': u'curator-20200713073026', u'indices': {}}]
2020-07-13 07:30:59,610 WARNING Job did not complete successfully.
-----
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Running command do_maintenance with args ' ' for job id: snapshots_maintenance
-----
Running cmd: /usr/local/nagioslogserver/scripts/curator.sh delete snapshots --older-than 60 --time-unit days --timestring %Y%m%d --repository "Snapshot Repository (to Windows)"
Return: 0
2020-07-13 07:30:59,808 INFO Job starting: delete snapshots
2020-07-13 07:31:03,680 WARNING No snapshots matched provided args.
No snapshots matched provided args.
-----
SUCCESS
Processed 0 node jobs.
Processed 2 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Traceback (most recent call last):
  File "/usr/local/bin/curator", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/curator/curator.py", line 5, in main
    cli( obj={ "filters": [] } )
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/curator/cli/index_selection.py", line 167, in indices
    retval = do_command(client, ctx.parent.info_name, working_list, ctx.parent.params, master_timeout)
  File "/usr/local/lib/python2.7/dist-packages/curator/cli/utils.py", line 235, in do_command
    delay=params['delay'], request_timeout=params['request_timeout']
  File "/usr/local/lib/python2.7/dist-packages/curator/api/optimize.py", line 50, in optimize
    request_timeout=request_timeout
  File "/usr/local/lib/python2.7/dist-packages/curator/api/optimize.py", line 19, in optimize_index
    if optimized(client, index_name, max_num_segments):
  File "/usr/local/lib/python2.7/dist-packages/curator/api/utils.py", line 168, in optimized
    shards, segmentcount = get_segmentcount(client, index_name)
  File "/usr/local/lib/python2.7/dist-packages/curator/api/utils.py", line 119, in get_segmentcount
    shards = client.indices.segments(index=index_name)['indices'][index_name]['shards']
KeyError: u'logstash-2020.07.04'
-----
Running cmd: /usr/local/nagioslogserver/scripts/curator.sh optimize indices --older-than 2 --time-unit days --timestring %Y.%m.%d
Return: 1
2020-07-13 07:31:02,614 INFO Job starting: optimize indices
2020-07-13 07:31:02,615 WARNING Overriding default connection timeout. New timeout: 21600
2020-07-13 07:31:02,730 INFO Action optimize will be performed on the following indices: [u'logstash-2020.03.09', u'logstash-2020.03.11', u'logstash-2020.03.12', u'logstash-2020.03.13', u'logstash-2020.03.14', u'logstash-2020.03.15', u'logstash-2020.03.16', u'logstash-2020.03.17', u'logstash-2020.03.18', u'logstash-2020.03.19', u'logstash-2020.03.20', u'logstash-2020.03.21', u'logstash-2020.03.22', u'logstash-2020.03.23', u'logstash-2020.03.24', u'logstash-2020.03.25', u'logstash-2020.03.26', u'logstash-2020.03.27', u'logstash-2020.03.28', u'logstash-2020.03.29', u'logstash-2020.03.30', u'logstash-2020.03.31', u'logstash-2020.04.01', u'logstash-2020.04.02', u'logstash-2020.04.03', u'logstash-2020.04.04', u'logstash-2020.04.05', u'logstash-2020.04.06', u'logstash-2020.04.07', u'logstash-2020.04.08', u'logstash-2020.04.09', u'logstash-2020.04.10', u'logstash-2020.04.11', u'logstash-2020.04.12', u'logstash-2020.04.13', u'logstash-2020.04.14', u'logstash-2020.04.15', u'logstash-2020.04.16', u'logstash-2020.04.17', u'logstash-2020.04.18', u'logstash-2020.04.19', u'logstash-2020.04.20', u'logstash-2020.04.21', u'logstash-2020.04.22', u'logstash-2020.04.23', u'logstash-2020.04.24', u'logstash-2020.04.25', u'logstash-2020.04.26', u'logstash-2020.04.27', u'logstash-2020.04.28', u'logstash-2020.04.29', u'logstash-2020.04.30', u'logstash-2020.05.01', u'logstash-2020.05.02', u'logstash-2020.05.03', u'logstash-2020.05.04', u'logstash-2020.05.05', u'logstash-2020.05.06', u'logstash-2020.05.07', u'logstash-2020.05.08', u'logstash-2020.05.09', u'logstash-2020.05.10', u'logstash-2020.05.11', u'logstash-2020.05.12', u'logstash-2020.05.13', u'logstash-2020.05.14', u'logstash-2020.05.15', u'logstash-2020.05.16', u'logstash-2020.05.17', u'logstash-2020.05.18', u'logstash-2020.05.19', u'logstash-2020.05.20', u'logstash-2020.05.21', u'logstash-2020.05.22', u'logstash-2020.05.23', u'logstash-2020.05.24', u'logstash-2020.05.25', u'logstash-2020.05.26', u'logstash-2020.05.27', 
u'logstash-2020.05.28', u'logstash-2020.05.29', u'logstash-2020.05.30', u'logstash-2020.05.31', u'logstash-2020.06.01', u'logstash-2020.06.02', u'logstash-2020.06.03', u'logstash-2020.06.04', u'logstash-2020.06.05', u'logstash-2020.06.06', u'logstash-2020.06.07', u'logstash-2020.06.08', u'logstash-2020.06.09', u'logstash-2020.06.10', u'logstash-2020.06.11', u'logstash-2020.06.12', u'logstash-2020.06.13', u'logstash-2020.06.14', u'logstash-2020.06.15', u'logstash-2020.06.16', u'logstash-2020.06.17', u'logstash-2020.06.18', u'logstash-2020.06.19', u'logstash-2020.06.20', u'logstash-2020.06.21', u'logstash-2020.06.22', u'logstash-2020.06.23', u'logstash-2020.06.24', u'logstash-2020.06.25', u'logstash-2020.06.26', u'logstash-2020.06.27', u'logstash-2020.06.28', u'logstash-2020.06.29', u'logstash-2020.06.30', u'logstash-2020.07.01', u'logstash-2020.07.02', u'logstash-2020.07.03', u'logstash-2020.07.04', u'logstash-2020.07.05', u'logstash-2020.07.06', u'logstash-2020.07.07', u'logstash-2020.07.08', u'logstash-2020.07.09', u'logstash-2020.07.10', u'logstash-2020.07.11']
-----
-----
Running cmd: /usr/local/nagioslogserver/scripts/curator.sh delete snapshots --older-than 60 --time-unit days --timestring %Y%m%d --repository "Snapshot Repository (to Windows)"
Return: 0
2020-07-13 07:32:01,100 INFO Job starting: delete snapshots
2020-07-13 07:32:04,988 WARNING No snapshots matched provided args.
No snapshots matched provided args.

Re: Nagios Logging - System Jobs under Command Subsystem

Posted: Mon Jul 13, 2020 4:46 pm
by cdienger
Please PM me a profile from the system. It can be gathered under Admin > System > System Status > Download System Profile or from the command line with:

Code:

/usr/local/nagioslogserver/scripts/profile.sh

This will create /tmp/system-profile.tar.gz.

Note that this file can be very large and may not be able to be uploaded through the system. This is usually due to the logs in the Logstash and/or Elasticsearch directories found in it. If it is too large, please open the profile, extract these directories/files and send them separately.

I'd also like to get a copy of the current settings index. This can be gathered by running:

Code:

curl -XPOST 'http://localhost:9200/nagioslogserver/_export?path=/tmp/nagioslogserver.tar.gz'

The file it creates and that we'd like to see is /tmp/nagioslogserver.tar.gz.

Re: Nagios Logging - System Jobs under Command Subsystem

Posted: Mon Jul 13, 2020 8:27 pm
by nguyenhung12345
Hi,

It's not as large as I thought. Please find the attachments. I ran it on the cluster; do you need us to run it on the other nodes as well?

Files were received for review and have been removed from this response.

Re: Nagios Logging - System Jobs under Command Subsystem

Posted: Tue Jul 14, 2020 3:54 pm
by cdienger
You're running into disk space issues, which are preventing shards from loading properly, and the snapshots are failing because of the red cluster status:

Code:

[2020-07-13 06:25:28,473][WARN ][cluster.routing.allocation.decider] [92a293ca-9487-464b-964c-d9d58dee250b] high disk watermark [90%] exceeded on [lZkU14hFQuSxmQrE1Qpq9w][6b1c0aaa-e3ca-420a-b57b-9ac39159df9f] free: 96.4gb[9.5%], shards will be relocated away from this node

I would recommend freeing some space or allocating more space to the machine. You can free up space by deleting old indices under Admin > System > Cluster Status. If you're running the NLS OVA then this document will step you through allocating more space: https://support.nagios.com/kb/article.php?id=814.
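
As a rough way to see which nodes have logged this warning, the sketch below pulls the first bracketed node identifier and the free-space percentage out of high-watermark lines like the one above. The function name watermark_free is hypothetical (it is not part of Nagios Log Server), and the pattern assumes the log line format shown in this post.

```shell
#!/bin/sh
# watermark_free: hypothetical helper, not shipped with Nagios Log Server.
# Reads Elasticsearch log lines on stdin and, for each "high disk watermark"
# warning, prints "<node identifier> <free percent>".
# Assumes the line format quoted in this thread.
watermark_free() {
    sed -n 's/.*high disk watermark .* exceeded on \[\([^]]*\)\].* free: [^[]*\[\([0-9.]*\)%\].*/\1 \2/p'
}

# Example (the log path is a placeholder; use your Elasticsearch log file):
# watermark_free < /path/to/elasticsearch.log
```

Lines that are not watermark warnings produce no output, so an empty result suggests the warning has not been logged in that file.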

Re: Nagios Logging - System Jobs under Command Subsystem

Posted: Tue Jul 14, 2020 9:52 pm
by nguyenhung12345
Hmm, I don't think so, since we moved the shared folder to a Windows device instead of mounting it on an NLS cluster partition.

And currently, disk space is fine. Any other suggestions?

Re: Nagios Logging - System Jobs under Command Subsystem

Posted: Thu Jul 16, 2020 12:05 pm
by cdienger
When was the move done? The message was logged on the 13th.

What is the output of the "df -h" command on each machine?
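
When comparing the output across machines, it can help to list only the filesystems that are close to full. The sketch below filters "df -P" output (the POSIX format, which keeps each filesystem on one line) and prints mount points at or above a usage threshold. The name flag_full is hypothetical, and the default of 90 is only an illustrative choice that mirrors the 90% high watermark in the log message earlier in the thread.

```shell
#!/bin/sh
# flag_full: hypothetical helper. Reads `df -P` output on stdin and prints
# "<mount point> <use%>" for every filesystem at or above the threshold.
# Note: mount points containing spaces are not handled by this sketch.
flag_full() {
    threshold=${1:-90}
    awk -v limit="$threshold" '
        NR > 1 {                        # skip the header line
            use = $5
            sub(/%/, "", use)           # "91%" -> "91"
            if (use + 0 >= limit + 0) print $6, $5
        }'
}

# Example: check the local machine.
# df -P | flag_full 90
```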

Re: Nagios Logging - System Jobs under Command Subsystem

Posted: Thu Jul 16, 2020 3:40 pm
by cdienger
A ticket has been opened for this so we'll lock the thread and continue working the issue through the ticket.