Hi Support,
I'm not quite sure where to post this, but my Nagios XI is showing backup snapshot alerts for my Log Server cluster.
The strange thing is that when I look at Nagios Log Server, the backup snapshot ran successfully. This is occurring across two datacentres.
Where do I start troubleshooting this?
***** Nagios XI Alert *****
Nagios has detected a problem with this service.
Notification Type: PROBLEM
Service: Last NLS Backup
Host: hs3-nagcluster
Address: *ip removed*
State: UNKNOWN
Info:
UNKNOWN: Unable to determine result within last 25 hours: No snapshots found in Elasticsearch.
Date/Time: 21/06/2017 13:10:29
Nagios XI alert for Log Server backup failure
Re: Nagios XI alert for Log Server backup failure
Run the following on the NLS:
REPONAME is the name of the repository as seen under Administration > System > Backup & Maintenance > Repositories. The first command is the command that the check runs and may give us more information as to why it is unable to detect a current snapshot. The second command will show us all available snapshots.
curator --loglevel warn show snapshots --repository REPONAME --newer-than 25 --time-unit hours
curl -v -XGET "http://localhost:9200/_snapshot/REPONAME/_all?pretty"
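As an independent cross-check of the 25-hour window, the JSON returned by the second command can be scanned for a recent completion time. The following is a minimal sketch, not a reimplementation of the curator filter: it assumes each snapshot entry carries an `end_time_in_millis` field, and the function name `has_recent_snapshot` is made up for illustration.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if any snapshot in the JSON on stdin finished
# within the last MAX_HOURS hours, fail (exit 1) otherwise.
# Assumption: every snapshot entry has an "end_time_in_millis" field.
# Usage: has_recent_snapshot NOW_MILLIS [MAX_HOURS] < response.json
has_recent_snapshot() {
    now_ms=$1
    max_hours=${2:-25}
    cutoff_ms=$((now_ms - max_hours * 3600 * 1000))
    found=1
    # Extract every end_time_in_millis value from the response.
    for end_ms in $(grep -o '"end_time_in_millis"[[:space:]]*:[[:space:]]*[0-9][0-9]*' |
                    grep -o '[0-9][0-9]*$'); do
        if [ "$end_ms" -ge "$cutoff_ms" ]; then
            found=0
        fi
    done
    return "$found"
}
```

It could be fed from the same request, for example `curl -s "http://localhost:9200/_snapshot/REPONAME/_all" | has_recent_snapshot "$(($(date +%s) * 1000))"`. An empty `snapshots` array makes it fail, which matches the UNKNOWN result in the alert.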
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Nagios XI alert for Log Server backup failure
Output as below:
[root@hs3-log-01 ~]# curator --loglevel warn show snapshots --repository SharedBackupRepo --newer-than 25 --time-unit hours
2017-06-29 11:20:16,516 ERROR No snapshots found in Elasticsearch.
No snapshots found in Elasticsearch.
[root@hs3-log-01 ~]#
[root@hs3-log-01 ~]# curl -v -XGET "http://localhost:9200/_snapshot/SharedB ... all?pretty"
* About to connect() to localhost port 9200 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 9200 (#0)
> GET /_snapshot/SharedBackupRepo/_all?pretty HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:9200
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Content-Length: 24
<
{
"snapshots" : [ ]
}
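One reading of the output above: the request returned HTTP 200 with an empty `snapshots` array, which suggests the repository is registered but holds no snapshots as seen from this node. A possible next check is to compare the repository's configured location on every node, since a shared repository normally has to point at the same mounted path everywhere. The sketch below assumes the repository definition returned by `GET /_snapshot/SharedBackupRepo` contains a `"location"` setting (as it does for filesystem repositories); the helper name `repo_location` is hypothetical.

```shell
#!/bin/sh
# Sketch: print the "location" setting found in a repository definition
# read from stdin. Prints nothing if no location setting is present.
repo_location() {
    grep -o '"location"[[:space:]]*:[[:space:]]*"[^"]*"' |
        sed -e 's/^"location"[[:space:]]*:[[:space:]]*"//' -e 's/"$//'
}
```

Running `curl -s "http://localhost:9200/_snapshot/SharedBackupRepo" | repo_location` on each node and then checking that the printed path is mounted and writable there may narrow down why the snapshots are missing.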
Re: Nagios XI alert for Log Server backup failure
Well, that's interesting. It seems the backups are in fact failing even though the backup script reports success. It wasn't explicitly stated, but I assume you don't see any snapshots listed under Backup & Maintenance?
It looks like we'll have to dig into this a bit, and it may be best to do so through our ticketing system. If you'd like to send an email to customersupport@nagios.com, I'd be glad to take the case. If you do open a ticket, please provide the logs in /var/log/elasticsearch/*, /var/log/logstash/*, and /var/log/httpd/*, plus the profiles gathered under Administration > System > System Status. Please gather all of this from every NLS server in the cluster.
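The log directories listed above could be bundled on each server with something like the following. This is only a sketch: the function name `collect_nls_logs` and the optional root argument are illustrative, and the system profile still has to be downloaded from the web interface separately.

```shell
#!/bin/sh
# Sketch: archive the Elasticsearch, Logstash, and httpd log directories.
# Usage: collect_nls_logs OUTPUT.tar.gz [ROOT]
# ROOT defaults to / and exists only so the function can be tried on a copy.
collect_nls_logs() {
    out=$1
    root=${2:-/}
    set --
    # Keep only the directories that actually exist on this server.
    for dir in var/log/elasticsearch var/log/logstash var/log/httpd; do
        if [ -d "$root/$dir" ]; then
            set -- "$@" "$dir"
        fi
    done
    # Nothing to archive: report failure instead of writing an empty file.
    [ "$#" -gt 0 ] || return 1
    tar -C "$root" -czf "$out" "$@"
}
```

For example, `collect_nls_logs "/tmp/$(hostname)-nls-logs.tar.gz"` run as root on each node would produce one archive per server to attach to the ticket.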
Re: Nagios XI alert for Log Server backup failure
Will do.
Thanks!
Re: Nagios XI alert for Log Server backup failure
Got the ticket, going to close this up and we will continue there.