Nagios XI alert for Log Server backup failure

james.liew
Posts: 59
Joined: Wed Feb 22, 2017 1:30 am

Nagios XI alert for Log Server backup failure

Post by james.liew »

Hi Support,

I'm not quite sure where to post this, but my Nagios XI is showing backup snapshot alerts for my Log Server cluster.

The strange thing is that when I look at Nagios Log Server itself, the backup snapshot ran successfully. This is occurring across two datacentres.

Where do I start troubleshooting this?


***** Nagios XI Alert *****

Nagios has detected a problem with this service.

Notification Type: PROBLEM

Service: Last NLS Backup
Host: hs3-nagcluster
Address: *ip removed*
State: UNKNOWN
Info:
UNKNOWN: Unable to determine result within last 25 hours: No snapshots found in Elasticsearch.
Date/Time: 21/06/2017 13:10:29
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Nagios XI alert for Log Server backup failure

Post by cdienger »

Run the following on the NLS:

curator --loglevel warn show snapshots --repository REPONAME --newer-than 25 --time-unit hours
curl -v -XGET "http://localhost:9200/_snapshot/REPONAME/_all?pretty"

REPONAME is the name of the repository as shown under Administration > System > Backup & Maintenance > Repositories. The first command is the one the check itself runs, so its output may tell us more about why it cannot detect a current snapshot. The second command lists all available snapshots in that repository.
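
For anyone who wants to script the same test against the JSON returned by the second command, here is a minimal Python sketch. It is not the check's actual implementation (that is the curator command above). The function name `recent_successful_snapshots` is made up for illustration, and the sketch assumes the usual `snapshot`, `state`, and `end_time_in_millis` fields of the Elasticsearch snapshot listing. It also assumes that "newer than 25 hours" can be approximated by the snapshot's end time.

```python
import json
import time


def recent_successful_snapshots(response_text, max_age_hours=25, now=None):
    """Return the names of SUCCESS snapshots that ended within max_age_hours.

    response_text is the JSON body from GET /_snapshot/REPONAME/_all.
    now is a Unix timestamp in seconds (defaults to the current time).
    """
    now_ms = (time.time() if now is None else now) * 1000
    cutoff_ms = now_ms - max_age_hours * 3600 * 1000
    snapshots = json.loads(response_text).get("snapshots", [])
    return [
        s["snapshot"]
        for s in snapshots
        if s.get("state") == "SUCCESS"
        and s.get("end_time_in_millis", 0) >= cutoff_ms
    ]


# A repository with no snapshots at all yields an empty list.
print(recent_successful_snapshots('{"snapshots": []}'))  # prints []
```

An empty result here corresponds to the situation the XI check reports as UNKNOWN: either no snapshots exist in the repository, or none of them are recent enough.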
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
james.liew
Posts: 59
Joined: Wed Feb 22, 2017 1:30 am

Re: Nagios XI alert for Log Server backup failure

Post by james.liew »

Output as below:
[root@hs3-log-01 ~]# curator --loglevel warn show snapshots --repository SharedBackupRepo --newer-than 25 --time-unit hours
2017-06-29 11:20:16,516 ERROR No snapshots found in Elasticsearch.
No snapshots found in Elasticsearch.
[root@hs3-log-01 ~]#
[root@hs3-log-01 ~]# curl -v -XGET "http://localhost:9200/_snapshot/SharedB ... all?pretty"
* About to connect() to localhost port 9200 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 9200 (#0)
> GET /_snapshot/SharedBackupRepo/_all?pretty HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:9200
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Content-Length: 24
<
{
"snapshots" : [ ]
}
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Nagios XI alert for Log Server backup failure

Post by cdienger »

Well, that's interesting. It seems the backups are in fact failing even though the backup script reports success. It wasn't explicitly stated, but I assume you don't see any snapshots listed under Backup & Maintenance? We'll have to dig into this a bit, and it may be best to do so through our ticketing system. If you'd like to send an email to customersupport@nagios.com, I'd be glad to take the case. If you do open a ticket, please provide the logs in /var/log/elasticsearch/*, /var/log/logstash/*, and /var/log/httpd/*, plus the profiles gathered under Administration > System > System Status. Gather all of this from every NLS server.
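
If it helps with collecting the logs, the following is a rough Python sketch that bundles the three log directories into a single tarball on one node. The helper name `bundle_logs` and the output path are hypothetical, not part of any Nagios tooling, and it assumes the script runs with enough privileges to read /var/log. The System Status profile still has to be downloaded from the web interface separately.

```python
import glob
import os
import tarfile

# Log locations requested for the ticket.
LOG_GLOBS = [
    "/var/log/elasticsearch/*",
    "/var/log/logstash/*",
    "/var/log/httpd/*",
]


def bundle_logs(output_path, patterns=LOG_GLOBS):
    """Add every regular file matching the patterns to a gzip tarball.

    Returns the list of file paths that were added.
    """
    added = []
    with tarfile.open(output_path, "w:gz") as tar:
        for pattern in patterns:
            for path in sorted(glob.glob(pattern)):
                if os.path.isfile(path):  # skip subdirectories
                    tar.add(path)
                    added.append(path)
    return added
```

Calling `bundle_logs("/tmp/nls-logs.tar.gz")` on each NLS server would produce one archive per node to attach to the ticket.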
james.liew
Posts: 59
Joined: Wed Feb 22, 2017 1:30 am

Re: Nagios XI alert for Log Server backup failure

Post by james.liew »

Will do.

Thanks!
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Nagios XI alert for Log Server backup failure

Post by tmcdonald »

Got the ticket, going to close this up and we will continue there.
Former Nagios employee