WillemDH wrote:Ok, issue is back and I was able to get some logs. I'll PM them to jesse, as they contain some sensible info.
These look like simple timestamp parse failures - do these errors directly correspond with the detaching replica problem? If so, it's likely worth resolving the timestamp parser so that the issue might be resolved.
This kind of timestamp issue is almost always caused by the 'syslog' input not matching incoming logs properly. The resolution I like to use is replacing the 'syslog' input with two inputs - one bare tcp and one bare udp. After that you can assign the syslog filter yourself, like so:
Code: Select all
"match" => { "message" => "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}"
The 'SYSLOGTIMESTAMP' pattern is the one to be concerned about. Feel free to try TIMESTAMP here if you still experience parse failures. I'm not convinced that timestamp parsing is causing your replicas to detach, but if there's a correlation it's worth a try.
Otherwise, I'd like the following output:
Code: Select all
free -m
top | head -n5
curl 'localhost:9200/_cluster/health?level=indices&pretty'