Re: No_Response from Remote Host

Posted: Mon Aug 03, 2015 3:48 pm
by Kriyeshh
Yes, Mr. Jdalrymple.

The outage lasts around 4 minutes and affects the mount services only. The service is checked every minute, with max_check_attempts set to 3.
So after the 3rd check attempt Nagios sends an alert, that is, in the 3rd minute.

Meanwhile, in the 4th minute the service recovers and a recovery notification is sent to us, which tells us the service is OK by the 4th minute. So, as you suggested, if I increase max_check_attempts to 5, I may get a temporary solution, since the service recovers in the 4th minute.

Am I right?

Meanwhile, let me check the cron and process status to find a permanent fix.

Re: No_Response from Remote Host

Posted: Mon Aug 03, 2015 4:32 pm
by jdalrymple
Kriyeshh wrote: So, as you suggested, if I increase max_check_attempts to 5, I may get a temporary solution, since the service recovers in the 4th minute.

Am I right?
Yup
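
To spell out the arithmetic behind this: Nagios notifies once a problem reaches a HARD state, which happens on check attempt number max_check_attempts, so (max_check_attempts - 1) retry intervals after the first failed check. The thread only says the service is checked every minute; that rechecks are also 1 minute apart is an assumption here, and the helper name below is made up for illustration. A rough sketch:

```shell
# Hypothetical helper: minutes between the first failed check and the
# HARD state (the point where the notification goes out).
# Assumes every recheck is spaced retry_interval minutes apart.
minutes_until_alert() {
  max_check_attempts=$1
  retry_interval=$2
  echo $(( (max_check_attempts - 1) * retry_interval ))
}

minutes_until_alert 3 1   # current setting, prints 2
minutes_until_alert 5 1   # proposed setting, prints 4
```

With 3 attempts the alert goes out on the 3rd consecutive failed check. With 5 attempts the service would have to still be failing on the 5th check, so an outage that clears by the 4th check should stay in a SOFT state and not notify.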

Re: No_Response from Remote Host

Posted: Mon Aug 03, 2015 4:42 pm
by Kriyeshh
Thanks, Mr. Jdalrymple.
I will make the changes and come back with the process and cron lists.

Re: No_Response from Remote Host

Posted: Tue Aug 04, 2015 4:51 pm
by hsmith
Thanks, keep us posted.

Re: No_Response from Remote Host

Posted: Thu Aug 06, 2015 5:14 pm
by Kriyeshh
Hi friends,

I updated my cfg file and increased the max_check_attempts value to 5, which has stopped the alerts for the time being.
However, I still have a service outage of about 3 minutes, as mentioned before.
As suggested by Mr. Tgriep, I went through my system processes and found that DATABASE_backup is executed at that particular time.
The DB process is essential, so I cannot stop it; on the other hand, I want to fix this service outage too.
Please suggest how to move forward.

Re: No_Response from Remote Host

Posted: Fri Aug 07, 2015 11:04 am
by tgriep
At this point you have to talk to the manufacturer of the device to find out why it doesn't respond to SNMP polls while the backup is happening.

Re: No_Response from Remote Host

Posted: Fri Aug 07, 2015 11:51 am
by Kriyeshh
Mr. tgriep, what if the infrastructure is in the cloud? Should I approach my cloud service provider?

Re: No_Response from Remote Host

Posted: Fri Aug 07, 2015 12:29 pm
by lmiltchev
To be honest with you, I am not sure what else you can do. Have you tried determining what the CPU and memory utilization is during these 3-5 minutes, while the database backup is running?
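
One way to capture that, sketched here on the assumption that the monitored host is a Linux box exposing /proc (the thread does not say): log the 1-minute load average and available memory once a minute across the backup window. The function name and log path are hypothetical.

```shell
# Hypothetical sampler: prints a timestamp, the 1-minute load average and
# the available memory in kB. Falls back to "n/a" when /proc is missing
# or the kernel does not report MemAvailable.
sample_usage() {
  ts=$(date '+%Y-%m-%d %H:%M:%S')
  load=$(cut -d' ' -f1 /proc/loadavg 2>/dev/null)
  mem=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null)
  printf '%s load1=%s mem_available_kB=%s\n' "$ts" "${load:-n/a}" "${mem:-n/a}"
}

# Example: take one sample now; in practice, run it every minute
# (for instance from cron) and append to a log such as /tmp/usage.log.
sample_usage
```

Comparing the samples taken during the backup with those taken just before it should show whether the host is starved for CPU or memory while the checks time out.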

Re: No_Response from Remote Host

Posted: Tue Aug 11, 2015 2:57 pm
by Kriyeshh
Hi Mr. lmiltchev,
For your information, I have already checked the CPU and memory utilization for the downtime in question and shared the results. To be clear, a database backup process runs during those minutes, according to the system process output summary.

Currently I have increased max_check_attempts and I did not get any alerts, but I still see timeouts in my logs.
Should I set a priority or something similar for the Nagios service on that server?

Re: No_Response from Remote Host

Posted: Tue Aug 11, 2015 3:17 pm
by jdalrymple
Kriyeshh wrote: Should I go for priority
IMO this is the next best option.

Code:

nice /path/to/database/backup/script.sh
should be adequate.
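
For reference, a slightly more explicit sketch of the same idea. Plain nice raises the niceness by 10 by default; passing -n 19 asks for the lowest CPU priority instead. The wrapper name is made up for illustration, and the script path is the same placeholder as above.

```shell
# Hypothetical wrapper: run any command at the lowest CPU priority.
run_low_priority() {
  nice -n 19 "$@"
}

# Intended use (placeholder path, as in the post above):
#   run_low_priority /path/to/database/backup/script.sh

# Quick check: nice with no arguments prints the niceness it runs at.
run_low_priority nice
```

Note that nice only affects CPU scheduling. If the backup saturates disk or network I/O rather than CPU, lowering its niceness may not be enough to keep the SNMP checks from timing out.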