Timeout error in Logstash and how to increase the Logstash log level
Hi Team,
I send logs to Nagios Log Server with Filebeat.
Nagios Log Server regularly times out and does not receive logs for several minutes at a time.
Here is my Filebeat log:
2021-01-26T15:50:26.532+0700 INFO [publisher] pipeline/retry.go:215 retryer: send wait signal to consumer
2021-01-26T15:50:26.532+0700 ERROR [logstash] logstash/async.go:280 Failed to publish events caused by: read tcp x.x.x.x:41432->x.x.x.x:5012: i/o timeout
2021-01-26T15:50:26.532+0700 INFO [publisher] pipeline/retry.go:219 done
2021-01-26T15:50:26.540+0700 ERROR [logstash] logstash/async.go:280 Failed to publish events caused by: client is not connected
2021-01-26T15:50:28.456+0700 ERROR [publisher_pipeline_output] pipeline/output.go:181 failed to publish events: client is not connected
My logstash input config:
beats {
type => 'test_beat'
port => 5000
client_inactivity_timeout => 86400
}
Finally, how do I increase the Logstash log level in Nagios Log Server?
Thanks, team.
Re: Timeout error in logstash and inscrease log level logsta
After adding an input which defines a port, you need to make sure that the firewall allows that port. See https://assets.nagios.com/downloads/nag ... Inputs.pdf for commands to update the firewall.
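A quick reachability test from the Filebeat host can help separate a firewall problem from a Logstash problem. The following is only a minimal sketch: it assumes bash (for /dev/tcp) and the coreutils timeout command are available, and check_port is a hypothetical helper name, not part of Nagios Log Server or Filebeat. Note that the Filebeat log above shows a connection to port 5012 while the posted input listens on 5000, so test the port Filebeat is actually configured to use.

```shell
# check_port HOST PORT
# Prints "open" if a TCP connection succeeds within 3 seconds, otherwise "closed".
# Hypothetical helper; relies on bash's /dev/tcp pseudo-device.
check_port() {
  cp_host="$1"
  cp_port="$2"
  if timeout 3 bash -c "exec 3<>/dev/tcp/${cp_host}/${cp_port}" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Example (placeholder address; use your NLS host and Beats port):
# check_port nls.example.com 5000
```

A "closed" result with the firewall disabled would suggest nothing is listening on that port, or that something between the two hosts is dropping the connection.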
Steps for enabling debug logging for logstash:
Edit /etc/init.d/logstash and change line 64 from:
DAEMON_OPTS="agent -f ${LS_CONF_DIR} -l ${LS_LOG_FILE} ${LS_OPTS}"
to:
DAEMON_OPTS="agent -f ${LS_CONF_DIR} -l ${LS_LOG_FILE} ${LS_OPTS} --debug"
and restart the service with:
systemctl daemon-reload
systemctl restart logstash
Let this run just long enough to allow NLS to process some events from this host and then collect the /var/log/logstash/logstash.log file before reverting the config back.
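If you would rather not edit the init script by hand, the change to the DAEMON_OPTS line can be sketched as a small filter. This is an illustration only: add_debug_flag is a hypothetical helper, and it assumes the DAEMON_OPTS line in your copy of /etc/init.d/logstash ends with a closing double quote as shown above. Keep a backup so the change is easy to revert.

```shell
# add_debug_flag
# Reads init-script text on stdin and writes it to stdout, appending
# --debug to the DAEMON_OPTS line unless the flag is already present.
add_debug_flag() {
  sed '/^[[:space:]]*DAEMON_OPTS=/{/--debug/!s/"$/ --debug"/;}'
}

# Example (run as root; the backup path is a placeholder):
# cp /etc/init.d/logstash /root/logstash.init.bak
# add_debug_flag < /root/logstash.init.bak > /etc/init.d/logstash
# systemctl daemon-reload && systemctl restart logstash
```

Copying the backup over /etc/init.d/logstash and restarting the service again returns logging to its previous level.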
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Timeout error in logstash and inscrease log level logsta
Hi cdienger,
I stopped firewalld, but the timeouts still occur.
Log in Logstash
{:timestamp=>"2021-01-27T13:01:34.189000+0700", :message=>"retrying failed action with response code: 503 (UnavailableShardsException[[logstash-2021.01.27][1] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@4223ff95])", :level=>:info}
{:timestamp=>"2021-01-27T13:01:34.189000+0700", :message=>"retrying failed action with response code: 503 (UnavailableShardsException[[logstash-2021.01.27][0] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@7e183bc1])", :level=>:info}
Log in Elasticsearch
[2021-01-27 13:01:34,150][DEBUG][action.bulk ] [7744e59b-c59e-4e67-923d-e763d7b4c2e8] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2021-01-27 13:01:34,150][DEBUG][action.bulk ] [7744e59b-c59e-4e67-923d-e763d7b4c2e8] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
Please help me fix this error.
Thanks cdienger
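The 503 responses above name the index and shard that Elasticsearch could not write to. To see which shards are affected and how often, the Logstash log can be summarized with something like the sketch below. It assumes the message format shown in this post; unavailable_shards is a hypothetical helper name.

```shell
# unavailable_shards
# Reads Logstash log lines on stdin and prints "index shard count"
# for every UnavailableShardsException entry.
unavailable_shards() {
  grep -o 'UnavailableShardsException\[\[[^]]*]\[[0-9]*]' \
    | sed 's/^UnavailableShardsException\[\[\([^]]*\)]\[\([0-9]*\)]$/\1 \2/' \
    | sort | uniq -c | awk '{ print $2, $3, $1 }'
}

# Example:
# unavailable_shards < /var/log/logstash/logstash.log
#
# Overall cluster state (assumes Elasticsearch answers on localhost:9200):
# curl -s 'http://localhost:9200/_cluster/health?pretty'
```

If the same few shards appear repeatedly, the problem is more likely on the Elasticsearch side (unassigned primaries) than between Filebeat and Logstash.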
Re: Timeout error in logstash and inscrease log level logsta
Please provide a profile from the system. It can be gathered under Admin > System > System Status > Download System Profile or from the command line with:
/usr/local/nagioslogserver/scripts/profile.sh
This will create /tmp/system-profile.tar.gz.
Note that this file can be very large and may not be able to be uploaded through the system. You can split the file into smaller files with the split command on the NLS (or other Linux machine) command line:
split -b 5000000 /tmp/system-profile.tar.gz system-profile- -d
The above command will split the system-profile.tar.gz into 5MB segments and save them to files with the naming convention system-profile-nn.
Send this to me via a private message.
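Before sending the pieces, it can be worth confirming that they reassemble into the original archive. The sketch below wraps the same split command and adds the reverse step; split_file and join_file are hypothetical helper names, and the -d option assumes GNU split as used above.

```shell
# split_file FILE PREFIX [BYTES]
# Splits FILE into pieces of BYTES (default 5000000) named PREFIX00, PREFIX01, ...
split_file() {
  split -b "${3:-5000000}" -d "$1" "$2"
}

# join_file PREFIX OUT
# Concatenates the pieces back into OUT, in suffix order.
join_file() {
  cat "$1"[0-9][0-9]* > "$2"
}

# Example:
# split_file /tmp/system-profile.tar.gz system-profile-
# join_file system-profile- /tmp/rejoined.tar.gz
# cmp /tmp/system-profile.tar.gz /tmp/rejoined.tar.gz && echo "pieces are complete"
```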
Re: Timeout error in logstash and inscrease log level logsta
Hi cdienger,
My system has 4 instances: 2 in the DC site and 2 in the DR site.
So I will send you 2 attached files, one for each site.
Please help me review my system, because the timeouts still appear every day.
Thank you.
Re: Timeout error in logstash and inscrease log level logsta
I received two profiles and would like to get profiles from the other two nodes as well.
At least one node appears to be having issues writing to the database quickly and is having to throttle indexing. Are the NLS machines using SSD or spinning disks?
Re: Timeout error in logstash and inscrease log level logsta
Hi cdienger,
I use 10,000 rpm SAS disks for all 4 instances.
I have just sent you the system profiles for all 4 nodes.
Please help me review them.
Thank you.
Re: Timeout error in logstash and inscrease log level logsta
There don't appear to be any throttling messages in the most recent log, but we do recommend SSDs for the best performance.
The problems seem to start around roughly the same time each day - 4am. Are there system backups or other tasks running around this time? What does the frequency and next run time look like under Admin > System > Command Subsystem?
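To check from the client side whether the failures really cluster around a particular hour, the Filebeat log can be bucketed by hour. This is a rough sketch that assumes the Filebeat line format shown in the first post (timestamp, then level); errors_per_hour is a hypothetical helper name, and the log path in the example is the usual package default, which may differ on your hosts.

```shell
# errors_per_hour
# Reads Filebeat log lines on stdin and prints "YYYY-MM-DDTHH count"
# for lines whose level field is ERROR.
errors_per_hour() {
  awk '$2 == "ERROR" { count[substr($1, 1, 13)]++ }
       END { for (h in count) print h, count[h] }' | sort
}

# Example (path is an assumption):
# errors_per_hour < /var/log/filebeat/filebeat
```

Hours with many errors can then be compared against backup windows and the run times listed under the Command Subsystem.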
Re: Timeout error in logstash and inscrease log level logsta
Hi Cdienger,
The error appears several times a day, not only at 4am.
Now that I have removed the 2 DR site servers, the timeouts on the client no longer appear.
When I removed the 2 DR site servers, I cleared all their data, but the cluster status is red.
How can I make the cluster status green again?
And how can I build a DR site for disaster recovery without joining it to the DC site?
Thanks Cdienger.
Re: Timeout error in logstash and inscrease log level logsta
Is the current status of the DC site red as well or is it ok?
The red status will occur when there is a primary shard that is unassigned. The fix is to either remove the index with the missing shard or to reassign the shard. https://support.nagios.com/kb/article/n ... th-90.html has more information, but I'd like to get a profile from one of the machines in the DR cluster before you run anything.
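To see which primary shards are unassigned without changing anything, the output of the Elasticsearch _cat/shards API can be filtered. The sketch below is read-only; it assumes Elasticsearch answers on localhost:9200 and that the default column order (index, shard, prirep, state) is in use. unassigned_primaries is a hypothetical helper name.

```shell
# unassigned_primaries
# Reads `_cat/shards` output on stdin and prints "index shard"
# for every primary shard in the UNASSIGNED state.
unassigned_primaries() {
  awk '$3 == "p" && $4 == "UNASSIGNED" { print $1, $2 }'
}

# Example (address is an assumption):
# curl -s 'http://localhost:9200/_cat/shards' | unassigned_primaries
```

The indices listed are the candidates for the "remove the index or reassign the shard" step described above.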