Snapshot state shown as partial

cdcsysadmin
Posts: 55
Joined: Tue Dec 04, 2018 9:52 pm

Snapshot state shown as partial

Post by cdcsysadmin »

The snapshot state has been shown as partial for a few days.
What made the snapshots partial, and how can I resume them?
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Snapshot state shown as partial

Post by cdienger »

This will happen if a primary shard of an index isn't available during the snapshot. Partial snapshots are usually temporary, as the Elasticsearch backend will work to assign unavailable/unassigned shards. You can see a list of unassigned shards with:

Code: Select all

curl 'localhost:9200/_cat/shards?pretty' | grep -i unassigned
The 'p' or 'r' next to each shard indicates whether it is a primary or a replica shard.
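Since only unavailable primaries lead to a partial snapshot, the list can be narrowed further. This is a minimal sketch, assuming the default _cat/shards column order (index, shard, prirep, state, ...); the helper name unassigned_primaries is made up for illustration.

```shell
# Hypothetical helper: reads _cat/shards output on stdin and prints
# "index shard" for every primary shard whose state is UNASSIGNED.
# Assumes the default columns: index shard prirep state ...
unassigned_primaries() {
  awk '$3 == "p" && $4 == "UNASSIGNED" { print $1, $2 }'
}

# Usage against a live node (address as in the command above):
#   curl -s 'localhost:9200/_cat/shards' | unassigned_primaries
```

If this prints nothing while a snapshot is running, the primaries are all assigned at that moment and the cause is more likely elsewhere.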

If it is happening frequently then there may be a resource or connectivity issue with one or more of the nodes in the cluster. If this is the case, please send me a private message with a profile from each machine.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
cdcsysadmin
Posts: 55
Joined: Tue Dec 04, 2018 9:52 pm

Re: Snapshot state shown as partial

Post by cdcsysadmin »

Profile has been sent.
Any update?
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Snapshot state shown as partial

Post by cdienger »

The profiles show that all the primary shards were assigned at the time they were generated. When was the last time there was a partial snapshot?

What repository is NLS currently configured to use (Admin > System > Snapshots & Maintenance > Maintenance and Repository Settings > Repository to store snapshots in)? It looks like the machine has multiple repos mounted, and one of them is 100% full. You can see this by running "df -h" on the command line.
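To pick out the full mounts quickly, the df output can be filtered by usage. A minimal sketch, assuming df -P style columns (use percentage in column 5, mount point in column 6) and mount points without spaces; the helper name full_mounts and the 95% default are illustrative choices, not anything NLS provides.

```shell
# Hypothetical helper: reads df output on stdin and prints the mount point
# and usage of every filesystem at or above the given percentage.
# $1 = threshold in percent (defaults to 95).
full_mounts() {
  awk -v limit="${1:-95}" '
    NR > 1 {                      # skip the header line
      use = $5
      sub(/%/, "", use)           # "100%" -> "100"
      if (use + 0 >= limit) print $6, $5
    }'
}

# Usage:
#   df -P | full_mounts 95
```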

You can also try running the snapshot manually from the command line, which can give us more information to work with if there is a problem. The command looks like this:

Code: Select all

/usr/local/nagioslogserver/scripts/curator.sh snapshot --repository '$repository' --ignore_unavailable indices --older-than 1 --time-unit days --timestring %Y.%m.%d
Replace '$repository' with the name of the repo selected under the maintenance settings page.
cdcsysadmin
Posts: 55
Joined: Tue Dec 04, 2018 9:52 pm

Re: Snapshot state shown as partial

Post by cdcsysadmin »

There are 4 repositories.
The mount point I chose to store indexes was full, so I changed it to another mount point.
New snapshots still cannot be created.
Besides, when I tried to clean up the indexes older than 365 days in the web UI, it failed.
I have no idea how to resume the snapshot service.

[root@nxlog02 ~]# /usr/local/nagioslogserver/scripts/curator.sh snapshot --repository '$repository' --ignore_unavailable indices --older-than 1 --time-unit days --timestring %Y.%m.%d
2021-01-18 09:55:08,127 INFO Job starting: snapshot indices
2021-01-18 09:55:08,127 WARNING Overriding default connection timeout. New timeout: 21600
2021-01-18 09:55:08,149 INFO Action snapshot will be performed on the following indices: [u'logstash-2021.01.11', u'logstash-2021.01.12', u'logstash-2021.01.13', u'logstash-2021.01.14', u'logstash-2021.01.15', u'logstash-2021.01.16', u'logstash-2021.01.17']
2021-01-18 09:55:08,970 ERROR Failed to verify all nodes have repository access.
2021-01-18 09:55:08,970 WARNING Job did not complete successfully.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Snapshot state shown as partial

Post by ssax »

This error is saying that not all nodes have access to the repositories, which they need:

Code: Select all

ERROR Failed to verify all nodes have repository access.
Check the permissions on them and in the subdirectories.

Code: Select all

ls -la /full/path/to/your/mounts
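A listing only shows the mode bits. Since snapshots have to write into the repository location, an actual write attempt can be more telling, and writing a few bytes can also surface a filesystem that is out of space. A minimal sketch; can_write is a hypothetical helper, and it should be run on every node as whichever user the Elasticsearch process runs as (which user that is on your system is an assumption to verify).

```shell
# Hypothetical helper: succeeds only if a scratch file can be created,
# written to, and removed in directory $1.
can_write() {
  tmp=$(mktemp "$1/.write_test.XXXXXX" 2>/dev/null) || return 1
  if printf 'test\n' > "$tmp" 2>/dev/null; then
    rm -f "$tmp"
    return 0
  fi
  rm -f "$tmp"
  return 1
}

# Example (the path is a placeholder for each repository mount):
#   for d in /full/path/to/your/mounts; do
#     if can_write "$d"; then echo "$d: writable"; else echo "$d: NOT writable"; fi
#   done
```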
cdcsysadmin
Posts: 55
Joined: Tue Dec 04, 2018 9:52 pm

Re: Snapshot state shown as partial

Post by cdcsysadmin »

It does not seem to be a permission issue.

[root@nxlog02 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 7.8G 0 7.8G 0% /dev
tmpfs 7.8G 0 7.8G 0% /dev/shm
tmpfs 7.8G 788M 7.0G 10% /run
tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup
/dev/mapper/centos-root 2.1T 1.4T 548G 73% /
/dev/sda1 976M 197M 713M 22% /boot
/dev/mapper/vg_01-lv_01 4.9T 4.7T 0 100% /Snapshots05
tmpfs 1.6G 0 1.6G 0% /run/user/0
10.10.110.225:/volume4/nagioslog01 11T 9.4T 1.5T 87% /Snapshots01
10.10.110.225:/volume4/nagioslog03 11T 9.4T 1.5T 87% /Snapshots03
10.10.110.225:/volume3/nagioslog04 28T 28T 74G 100% /Snapshots04
tmpfs 1.6G 0 1.6G 0% /run/user/1000
tmpfs 1.6G 0 1.6G 0% /run/user/48
...
[root@nxlog02 ~]# ls -la /Snapshots01
total 1316
drwxrwxrwx 4 nagios nagios 40960 Jan 18 20:09 .
...
[root@nxlog02 ~]# ls -la /Snapshots03
total 156
drwxrwxrwx 4 nagios nagios 4096 Jan 6 11:41 .
...
[root@nxlog02 ~]# ls -la /Snapshots04
total 1268
drwxrwxrwx 4 nagios nagios 20480 Jan 6 11:43 .
...
[root@nxlog02 ~]# ls -la /Snapshots05
total 168
drwxrwxrwx 4 nagios nagios 4096 Jan 17 20:10 .
...
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Snapshot state shown as partial

Post by cdienger »

What repository is NLS currently configured to use (Admin > System > Snapshots & Maintenance > Maintenance and Repository Settings > Repository to store snapshots in)?

If Snapshots05 is selected then this would be the command to run:

Code: Select all

/usr/local/nagioslogserver/scripts/curator.sh snapshot --repository 'Snapshots05' --ignore_unavailable indices --older-than 1 --time-unit days --timestring %Y.%m.%d