Good morning,
I'm attempting to investigate an "IOException" error I saw in the snapshot that ran last night. I wanted to look at jobs.log, located in /usr/local/nagioslogserver/var, but it's empty. The node I'm on is the master (according to the command I ran to check), so I assume there should be more data in the log file.
I checked the same file on the other nodes in my cluster and they are either empty or contain very few entries.
Does the jobs.log file get cleared out automatically? I would think it would just keep writing to itself. Is there a retention setting somewhere that I can adjust?
Thank you.
JOBS.log empty
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: JOBS.log empty
These get automatically cleared every minute; they are used specifically for watching live jobs as they happen when tailing the log.

You can adjust this behavior to always append to the log by editing /etc/cron.d/nagioslogserver and changing this line

Code: Select all
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1

to this

Code: Select all
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs >> /usr/local/nagioslogserver/var/jobs.log 2>&1

However, you would then want something to clear this log file out periodically, as it will otherwise grow forever.
Re: JOBS.log empty
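For example, one way to clear the appended log periodically (a sketch, not a shipped setting; the weekly schedule and the use of truncate are assumptions) is a companion cron entry:

```
# Hypothetical companion entry for /etc/cron.d/nagioslogserver:
# empty jobs.log every Sunday at 03:00 so the appended log
# cannot grow without bound. Adjust the schedule to taste.
0 3 * * 0 root truncate -s 0 /usr/local/nagioslogserver/var/jobs.log
```

Using truncate (rather than deleting the file) keeps the inode intact, so the running cron job can continue appending without interruption.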
Ok, I don't mind leaving it at the default setting.
Is there anywhere else I can look for errors related to snapshots? I'm trying to figure out what causes this and how I might be able to prevent it:
Code: Select all
{
  "snapshots" : [ {
"snapshot" : "curator-20191122063056",
"version_id" : 1070699,
"version" : "1.7.6",
"indices" : [ "logstash-2019.11.01", "logstash-2019.11.02", "logstash-2019.11.03", "logstash-2019.11.04", "logstash-2019.11.05", "logstash-2019.11.06", "logstash-2019.11.07", "logstash-2019.11.08", "logstash-2019.11.09", "logstash-2019.11.10", "logstash-2019.11.11", "logstash-2019.11.12", "logstash-2019.11.13", "logstash-2019.11.14", "logstash-2019.11.15", "logstash-2019.11.16", "logstash-2019.11.17", "logstash-2019.11.18", "logstash-2019.11.19", "logstash-2019.11.20", "logstash-2019.11.21" ],
"state" : "PARTIAL",
"start_time" : "2019-11-22T06:30:58.145Z",
"start_time_in_millis" : 1574404258145,
"end_time" : "2019-11-22T10:20:39.870Z",
"end_time_in_millis" : 1574418039870,
"duration_in_millis" : 13781725,
"failures" : [ {
"node_id" : "4iG-K95ISQavplerBeTl3A",
"index" : "logstash-2019.11.21",
"reason" : "IndexShardSnapshotFailedException[[logstash-2019.11.21][4] Failed to perform snapshot (index files)]; nested: IOException[Input/output error]; ",
"shard_id" : 4,
"status" : "INTERNAL_SERVER_ERROR"
}, {
"node_id" : "4iG-K95ISQavplerBeTl3A",
"index" : "logstash-2019.11.13",
"reason" : "IndexShardSnapshotFailedException[[logstash-2019.11.13][4] Failed to perform snapshot (index files)]; nested: IOException[Input/output error]; ",
"shard_id" : 4,
"status" : "INTERNAL_SERVER_ERROR"
} ],
"shards" : {
"total" : 105,
"failed" : 2,
"successful" : 103
}
} ]
}
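As a quick way to triage output like the above, the failures array can be grouped by node; if every failed shard sits on one node, that node's disk is the first place to look. A minimal sketch (the snapshot status is inlined and trimmed here; in practice you would fetch it from the Elasticsearch snapshot API for your repository):

```python
import json

# Trimmed copy of the snapshot status shown above.
snapshot = json.loads("""
{
  "snapshot": "curator-20191122063056",
  "state": "PARTIAL",
  "failures": [
    {"node_id": "4iG-K95ISQavplerBeTl3A", "index": "logstash-2019.11.21",
     "shard_id": 4, "reason": "IOException[Input/output error]"},
    {"node_id": "4iG-K95ISQavplerBeTl3A", "index": "logstash-2019.11.13",
     "shard_id": 4, "reason": "IOException[Input/output error]"}
  ],
  "shards": {"total": 105, "failed": 2, "successful": 103}
}
""")

# Group failed shards by node id.
by_node = {}
for f in snapshot["failures"]:
    by_node.setdefault(f["node_id"], []).append((f["index"], f["shard_id"]))

for node, shards in by_node.items():
    print(f"{node}: {len(shards)} failed shard(s) -> {shards}")
```

Here both failures land on the same node (and the same shard number), which is consistent with a disk problem on that one data node rather than a cluster-wide issue.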
Re: JOBS.log empty
An Input/output error generally means that, at the time of the snapshot, there was an error reading from or writing to disk.

If a shard was missing, or the disk failed while reading those shards, that could be the cause; it could also have failed while writing the snapshot to disk.

If you have green cluster health, there would be no harm in forcing a new snapshot: go to Admin -> Command Subsystem and click the Run icon to the right of snapshots_maintenance.

Snapshots are essentially diffs since the last snapshot, so additional snapshots of the same data take virtually no space.
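The "green cluster health" precondition can be sketched as a small check. This is an illustration only: the health dict is inlined here, whereas in practice you would fetch it from the Elasticsearch cluster health API (GET /_cluster/health) before re-running the snapshot:

```python
# Sample _cluster/health fields; fetch the real values from your cluster.
health = {
    "status": "green",          # green / yellow / red
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 0,
}

# Only force a new snapshot when the cluster reports green and every
# shard is fully allocated; otherwise fix cluster health first.
safe_to_snapshot = (
    health["status"] == "green"
    and health["unassigned_shards"] == 0
    and health["initializing_shards"] == 0
)
print("safe to force a snapshot" if safe_to_snapshot else "fix cluster health first")
```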
Re: JOBS.log empty
You can lock this, thank you.
Re: JOBS.log empty
rferebee wrote: You can lock this, thank you.

Great!
Locking thread