Page 1 of 1
Monitor Status Check failed Status AWS
Posted: Thu Jun 21, 2018 2:25 pm
by ss6407
Hi Team,
We monitor AWS status check failed on nagios.
Till last week it was working fine.
Now the service check fails with below kind of error
"UNKNOWN - Name=InstanceId,Value=i-018c183920bf5a804 StatusCheckFailed (5 min Average): null null - No metric value known."
For some of the service check, if I do forceful check , it runs fine. But not for all the hosts.
Please advise.
Thanks!
Re: Monitor Status Check failed Status AWS
Posted: Thu Jun 21, 2018 2:35 pm
by scottwilkerson
In order to have any sort of idea what type of check this is running I would need to see the plugin you are using, as well as the command if you were to run it from the command line as it is not something builtin to XI.
Re: Monitor Status Check failed Status AWS
Posted: Thu Jun 21, 2018 3:21 pm
by ss6407
Hi Team,
Attached plugin that we use for AWS.
Also sometime i see below prompt on service check.
"Notifications for this service are being suppressed because it was detected as having been flapping between different states (23.2% change >= 20.0% threshold). When the service state stabilizes and the flapping stops, notifications will be re-enabled."
Note - We are using only AWS Status Check failed service check currently.
Thanks!
Re: Monitor Status Check failed Status AWS
Posted: Thu Jun 21, 2018 3:29 pm
by scottwilkerson
scottwilkerson wrote:In order to have any sort of idea what type of check this is running I would need to see the plugin you are using, as well as the command if you were to run it from the command line as it is not something builtin to XI.
I will also need the command line your Nagios Server uses to run this plugin
Re: Monitor Status Check failed Status AWS
Posted: Fri Jun 22, 2018 10:32 am
by ss6407
Hi Team,
Below is the command used to run the plugin
$USER1$/check_cloudwatch.sh --region=us-east-1 --namespace="$ARG1$" --metric="$ARG2$" --statistics="Average" --mins=$ARG7$ --dimensions="Name=$ARG3$,Value=$HOSTALIAS$" --warning=$ARG5$ --critical=$ARG6$ --profile=$ARG8$
$ARG1$
EC2
$ARG2$
StatusCheckFailed
$ARG3$
InstanceId
$ARG4$
0
$ARG5$
1
$ARG6$
1
$ARG7$
5
$ARG8$
cisProd
Re: Monitor Status Check failed Status AWS
Posted: Fri Jun 22, 2018 11:22 am
by scottwilkerson
Based on my cursory knowledge of AWS metrics, this appears right.
You said it works if you force the check, you may want to verify with Amazon or the plugin author why sometimes they would not return data
Re: Monitor Status Check failed Status AWS
Posted: Fri Jun 22, 2018 1:38 pm
by ss6407
Hi,
This was working fine till last week.
Suddenly it started giving the error.
This is specifically seen when we do apply config.
Re: Monitor Status Check failed Status AWS
Posted: Fri Jun 22, 2018 1:45 pm
by scottwilkerson
ss6407 wrote:Hi,
This was working fine till last week.
Suddenly it started giving the error.
This is specifically seen when we do apply config.
With a sudden change, I would suspect something maybe changed with the AWS API that it connects to.
That is the only thing that is a variable.