check_cloudwatch_status: CloudWatch Metric: CPUUtilization:

alp_support

Hi Support,

We are using the Nagios plugin check_cloudwatch_status.rb to monitor the CPU utilization of our AWS-hosted instances. For the past few days, however, we have been getting the following alert for many instances: CloudWatch Metric: CPUUtilization: No AWS CloudWatch Datapoint retrieved

An earlier topic, http://support.nagios.com/forum/viewtop ... =7&t=27414, states that an instance under load will behave erratically and that this is why the alerts are received. But when I checked the load on the instances manually, it seemed to be fine.

Also, the NRPE check that pulls in load data, i.e. "check_command check_nrpe!check_load", is working fine and shows no spikes.

Kindly assist us in further troubleshooting

nagios version
[root@ip-10-0-200-5 tmp]# nagios -V

Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL

[root@ip-10-0-200-5 tmp]# nrpe -V

NRPE - Nagios Remote Plugin Executor
Copyright (c) 1999-2008 Ethan Galstad (nagios@nagios.org)
Version: 2.14
Last Modified: 12-21-2012
License: GPL v2 with exemptions (-l for more info)
SSL/TLS Available: Anonymous DH Mode, OpenSSL 0.9.6 or higher required
TCP Wrappers Available
tmcdonald

Not to deflect, but have you contacted the author of that plugin? He or she would likely be better able to assist, as anything we do would be somewhat guesswork (since it is not one of our standard plugins).
Former Nagios employee
ssax

Please try running the command manually from the command line and use the --verbose option and post the sanitized output.
alp_support

I have also raised the query with the author, but I am still waiting for a reply.
tmcdonald wrote: Not to deflect, but have you contacted the author of that plugin? He or she would likely be better able to assist, as anything we do would be somewhat guesswork (since it is not one of our standard plugins).
alp_support

ssax wrote: Please try running the command manually from the command line and use the --verbose option and post the sanitized output.
I have removed the customer-related information from the verbose output: IPs, the customer name and IDs have been replaced with xxxx.

[root@ip-10-0-200-5 tmp]# /usr/lib64/nagios/plugins/check_cloudwatch_status.rb -a eu-west-1 -i i-xxxxx -f /etc/facter/facts.d/cloudwatch.txt -C "CPUUtilization" --warning 60 --critical 70 --verbose
** Launching AWS status retrieval on instance ID: i-xxxx
Amazon AWS Endpoint: EC2 ec2.eu-west-1.amazonaws.com, RDS rds.eu-west-1.amazonaws.com, ELB elasticloadbalancing.eu-west-1.amazonaws.com
Amazon CloudWatch Endpoint: monitoring.eu-west-1.amazonaws.com
Warning values: [0, 60.0]
Critical values: [0, 70.0]
AWS EC2 Instance:
{"requestId"=>"84932634-e9a2-4f10-af84-7398b7b7577c",
"xmlns"=>"http://ec2.amazonaws.com/doc/2010-08-31/",
"reservationSet"=>
{"item"=>
[{"instancesSet"=>
{"item"=>
[{"rootDeviceType"=>"ebs",
"architecture"=>"x86_64",
"tagSet"=>
{"item"=>
[{"value"=>"asgapplication",
"key"=>"aws:cloudformation:logical-id"},
{"value"=>
"arn:aws:cloudformation:eu-west-1:xxxxx:stack/xxxxx/d4fb2980-0e50-11e4-b6c7-50fa18c86ab4",
"key"=>"aws:cloudformation:stack-id"},
{"value"=>"app", "key"=>"Name"},
{"value"=>"xxxx", "key"=>"environment_name"},
{"value"=>"xxxx",
"key"=>"aws:cloudformation:stack-name"},
{"value"=>"xx@xx.com", "key"=>"requested_by"},
{"value"=>"ip-10-8-10-240", "key"=>"LaunchedFrom"},
{"value"=>"xx-xx", "key"=>"jenkins_user"},
{"value"=>"xxxx",
"key"=>"aws:autoscaling:groupName"},
{"value"=>"bronze-vpc", "key"=>"environment_type"}]},
"launchTime"=>"2014-07-18T07:57:25.000Z",
"privateDnsName"=>"ip-xx.eu-west-1.compute.internal",
"dnsName"=>nil,
"kernelId"=>"aki-71665e05",
"instanceType"=>"m3.xlarge",
"reason"=>nil,
"virtualizationType"=>"paravirtual",
"rootDeviceName"=>"/dev/sda1",
"vpcId"=>"vpc-xxx",
"placement"=>{"availabilityZone"=>"eu-west-1a", "groupName"=>nil},
"imageId"=>"ami-xxx",
"monitoring"=>{"state"=>"enabled"},
"productCodes"=>nil,
"instanceId"=>"i-xxxx",
"clientToken"=>"c33d61ec-1937-4c3b-837c-c897aea72409_eu-west-1a_2",
"amiLaunchIndex"=>"1",
"privateIpAddress"=>"xx.xx.xx.xx",
"instanceState"=>{"code"=>"16", "name"=>"running"},
"blockDeviceMapping"=>
{"item"=>
[{"deviceName"=>"/dev/sda",
"ebs"=>
{"status"=>"attached",
"volumeId"=>"vol-xxxx",
"deleteOnTermination"=>"true",
"attachTime"=>"2014-07-18T07:57:27.000Z"}}]},
"subnetId"=>"subnet-xxxx",
"keyName"=>"xxxx"}]},
"reservationId"=>"r-xxxx",
"requesterId"=>"xxxx",
"ownerId"=>"xxxxxx",
"groupSet"=>nil}]}}
CloudWatch Detailed Monitoring is enabled for Instance i-xxxx
CloudWatch:
#<AWS::Cloudwatch::Base:0x7f2d3e4eb598
@access_key_id="xxxx",
@http=#<Net::HTTP monitoring.eu-west-1.amazonaws.com:443 open=false>,
@path="/",
@port=443,
@proxy_server=nil,
@secret_access_key="xxxxxx",
@server="monitoring.eu-west-1.amazonaws.com",
@use_ssl=true>
CloudWatch Metrics Statistics:
{"xmlns"=>"http://monitoring.amazonaws.com/doc/2010-08-01/",
"GetMetricStatisticsResult"=>{"Label"=>"CPUUtilization", "Datapoints"=>nil},
"ResponseMetadata"=>{"RequestId"=>"c59a8f57-eb27-11e4-af8c-4bdfc7697f44"}}
CloudWatch Metric: CPUUtilization: No AWS CloudWatch Datapoint retrieved|
[root@ip-10-0-200-5 tmp]#
jdalrymple

Please reference this other topic on the support forums.

I'm not totally clear on whether the problem was that this guy's Nagios AWS instance was heavily loaded or that the machines he was querying were. I'm almost certain it must have been the Nagios instance, because I don't know why a heavily loaded monitored instance would have any effect on the API's ability to return results to you.

What does the load look like on your Nagios box? Is it also an AWS instance?
alp_support

Hi

My Nagios box is also an AWS instance. The load on the box is sometimes high, but not always. I have a polling interval of 5 minutes, and during the 30-minute timespan that is monitored I checked the load and it seems to be normal.

Any other ideas?
jdalrymple

Would it be possible to get the verbose output of one that's not failing? It would be interesting to compare the two. The error seems legit to me: I don't see anything in the verbose output that I'd regard as a CPU statistic (the GetMetricStatisticsResult has "Datapoints"=>nil). Are the failures more related to a type of service, or to a host?
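
To make that concrete, here is a rough Python sketch of what the check appears to boil down to, judging only from the verbose output above. It is not the plugin's actual code (the plugin is Ruby). The function name summarize_datapoints and the assumed datapoint layout (a list of dicts with "Timestamp" and "Average" keys) are hypothetical and exist only for illustration.

```python
def summarize_datapoints(response, statistic="Average"):
    """Return (ok, message) for a GetMetricStatistics-style response dict.

    Assumption: datapoints, when present, are a list of dicts carrying a
    sortable "Timestamp" and the requested statistic. The real plugin may
    parse the response differently.
    """
    result = response.get("GetMetricStatisticsResult") or {}
    label = result.get("Label", "unknown")
    datapoints = result.get("Datapoints") or []  # nil/None counts as empty
    if not datapoints:
        return False, "CloudWatch Metric: %s: No AWS CloudWatch Datapoint retrieved" % label
    latest = max(datapoints, key=lambda d: d["Timestamp"])
    return True, "CloudWatch Metric: %s: %s" % (label, latest[statistic])


# The response posted in this thread, transcribed into a Python dict.
failing = {
    "xmlns": "http://monitoring.amazonaws.com/doc/2010-08-01/",
    "GetMetricStatisticsResult": {"Label": "CPUUtilization", "Datapoints": None},
    "ResponseMetadata": {"RequestId": "c59a8f57-eb27-11e4-af8c-4bdfc7697f44"},
}
print(summarize_datapoints(failing))
```

In other words, the API call itself succeeded (there is a RequestId and a Label), but it came back without a single datapoint for the requested window, so there is nothing to compare against the thresholds.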
alp_support

Hi jdalrymple,

I couldn't get verbose output from a working check, as all of them are failing on my Nagios. Since we are monitoring only CPU utilization, I cannot comment on other metrics.
jdalrymple

Can you debug the metrics a bit from the AWS side of things to verify they're actually there?

https://console.aws.amazon.com/cloudwatch/

Click the Metrics button on the left and find the metric you're trying to monitor.
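
If you would rather check from a script than from the console, something along these lines might help. It is only a sketch and has nothing to do with how the plugin queries CloudWatch internally. It assumes Python with boto3 installed and credentials already configured. The helper names build_query and fetch_datapoints are made up for this example, and the 30-minute window and 300-second period simply mirror the numbers mentioned earlier in this thread.

```python
from datetime import datetime, timedelta, timezone


def build_query(instance_id, now=None, window_minutes=30, period_seconds=300):
    """Build keyword arguments for CloudWatch GetMetricStatistics.

    The window and period defaults are assumptions taken from this thread,
    not values read from the plugin.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": now - timedelta(minutes=window_minutes),
        "EndTime": now,
        "Period": period_seconds,
        "Statistics": ["Average"],
    }


def fetch_datapoints(instance_id, region="eu-west-1"):
    """Query CloudWatch and return the datapoints sorted by timestamp."""
    import boto3  # imported lazily so build_query works without boto3

    client = boto3.client("cloudwatch", region_name=region)
    response = client.get_metric_statistics(**build_query(instance_id))
    return sorted(response["Datapoints"], key=lambda d: d["Timestamp"])


if __name__ == "__main__":
    # Replace i-xxxx with a real instance ID before running.
    for point in fetch_datapoints("i-xxxx"):
        print(point["Timestamp"], point["Average"])
```

If this prints nothing for an instance whose CPUUtilization graph is visible in the console, I would look at the query side first (credentials, region, time window) rather than at the metric itself.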