No_Response from Remote Host
Re: No_Response from Remote Host
Yes, Mr. Jdalrymple.
The outage lasts around 4 minutes, for the mount services alone. The service is checked every minute, with max_check_attempts set to 3.
So what happens is that after the 3rd check attempt Nagios sends an alert, that is, in the 3rd minute.
Then, in the 4th minute, the service recovers and a recovery notification is sent to us, which tells us the service is OK by the 4th minute. So, as you suggested, if I increase max_check_attempts to 5, I may get a temporary solution, since the service recovers in the 4th minute.
Am I right?
Meanwhile, let me check the cron status and process status for a permanent fix!
Cheers,
-Kriyeshh
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: No_Response from Remote Host
Kriyeshh wrote:So as you suggested, if I increase max_check_attempts to 5, I may get a temporary solution, as the service recovers in the 4th minute.
Am I right?
Yup.
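A rough sketch of why raising the value helps, assuming the retry interval is also one minute (the thread only states the one-minute check interval) and that Nagios notifies once the number of consecutive failed checks reaches max_check_attempts. The function name would_alert is made up for illustration:

```shell
#!/bin/sh
# would_alert FAILED_CHECKS MAX_CHECK_ATTEMPTS
# Hypothetical helper: reports whether a run of consecutive failed checks
# is long enough to reach a HARD state (and therefore a notification).
would_alert() {
    failed_checks=$1
    max_attempts=$2
    if [ "$failed_checks" -ge "$max_attempts" ]; then
        echo "HARD state: notification sent"
    else
        echo "SOFT state only: no notification"
    fi
}

# Three failed checks before recovery, as described in the thread:
would_alert 3 3   # old setting
would_alert 3 5   # new setting
```

With three failed checks before recovery, the old setting reaches the limit and the new one does not. This only hides the alert; the checks themselves still fail during the outage.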
Re: No_Response from Remote Host
Thanks, Mr. Jdalrymple.
I will make the changes and come back with the process and cron lists.
Cheers,
-Kriyeshh
Re: No_Response from Remote Host
Hi Friends,
I updated my cfg file and raised the max_check_attempts value to 5, which has stopped the alerts for the time being.
However, I still have a service outage of about 3 minutes, as mentioned before.
As suggested by Mr. Tgriep, I went through my system processes and found that DATABASE_backup is executed at that particular time.
The DB process is essential, so I cannot stop it; on the other hand, I want to fix this service outage too.
Please suggest how to move forward.
Cheers,
-Kriyeshh
Re: No_Response from Remote Host
At this point you have to talk to the manufacturer of the device to find out why it doesn't respond to SNMP polls while the backup is happening.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: No_Response from Remote Host
Mr. tgriep, what if the infrastructure is in the cloud? Should I approach my cloud service provider?
Cheers,
-Kriyeshh
Re: No_Response from Remote Host
To be honest with you, I am not sure what else you can do. Have you tried determining what the CPU and memory utilization is during these 3-5 minutes, while the database backup is running?
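One way to capture this is to log a few samples across the backup window. The sketch below assumes a Linux host with /proc available and uses load average and MemAvailable as rough proxies for CPU and memory pressure; sample_load is a hypothetical helper name, not a Nagios tool:

```shell
#!/bin/sh
# sample_load SAMPLES INTERVAL_SECONDS
# Prints one timestamped line per sample with load averages and available memory.
sample_load() {
    samples=$1
    interval=$2
    i=0
    while [ "$i" -lt "$samples" ]; do
        if [ -r /proc/loadavg ]; then
            load=$(cut -d' ' -f1-3 /proc/loadavg)
        else
            load="n/a"
        fi
        if [ -r /proc/meminfo ]; then
            mem=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
        else
            mem="n/a"
        fi
        echo "$(date '+%Y-%m-%d %H:%M:%S') load=$load mem_available_kb=$mem"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

# Example: one sample every 5 seconds for 5 minutes, started just before the backup.
# sample_load 60 5 >> backup_window.log
```

Comparing these lines against the timestamps of the timed-out checks should show whether the failures line up with a load or memory spike.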
Re: No_Response from Remote Host
Hi Mr. lmiltchev,
For your information, I have already checked CPU and memory utilization for the downtime in question, and the results have already been shared. To be clear, a database backup process is running during these minutes, according to the system process output summary.
Currently I have raised max_check_attempts and I didn't get any alerts, but I still have timeouts in my logs.
Should I set a priority or something similar for the Nagios service on that server?
Cheers,
-Kriyeshh
jdalrymple
Re: No_Response from Remote Host
Kriyeshh wrote:Should I go for priority
IMO this is the next best option.
Code:
nice /path/to/database/backup/script.sh
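Note that nice only lowers CPU scheduling priority. If the backup is mostly disk-bound, lowering its I/O priority as well may help, depending on the I/O scheduler in use. A hedged sketch, assuming a Linux host where ionice (util-linux) may or may not be installed and usable; run_low_priority is a hypothetical helper name and the script path is the placeholder from the post above:

```shell
#!/bin/sh
# run_low_priority COMMAND [ARGS...]
# Runs the command at the lowest CPU priority, and at low I/O priority
# when ionice is present and works in this environment.
run_low_priority() {
    if command -v ionice >/dev/null 2>&1 && ionice -c 2 -n 7 true 2>/dev/null; then
        # Best-effort I/O class (2) at its lowest level (7), plus niceness 19.
        ionice -c 2 -n 7 nice -n 19 "$@"
    else
        # Fall back to CPU priority only.
        nice -n 19 "$@"
    fi
}

# Example (placeholder path, as in the thread):
# run_low_priority /path/to/database/backup/script.sh
```

Whether this removes the timeouts depends on what actually starves the monitored service during the backup, which the thread has not established.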