Hi Everyone,
I'm currently using the Check_AWS_CloudWatch_metrics plugin, found here: http://exchange.nagios.org/directory/Pl ... cs/details to monitor some Amazon instances.
Most of the time it works fine, but several times a day I get alerts from Nagios along the lines of: 'Cloudwatch Metric: CPUUtilization: No AWS CloudWatch Datapoint retrieved'. Is anyone familiar with how the code works and why I am getting this error? I'm not sure whether it's a problem with my Nagios config or something on Amazon's end.
Thanks in Advance
Why Am I getting No AWS CloudWatch Datapoint Retrieved
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: Why Am I getting No AWS CloudWatch Datapoint Retrieved
I don't believe we have the means to test this in house. However, can you show us the configuration files you have for these service checks? Do you have several of these checks set up for different metrics? Are they all showing the same issue at the same time?
Re: Why Am I getting No AWS CloudWatch Datapoint Retrieved
I do have several checks that use the plug-in, and they tend to show this issue on the same host around the same time.
define service {
    use                  generic-service
    service_description  Amazon API EC2 - CPU Utilization
    servicegroups        API-status, API-status-EC2
    hosts                as.ap1.site.com
    check_command        check_cloudwatch_ec2!CPUUtilization!80!90
    max_check_attempts   3
}
The other checks are similar to the one above, where the command is:
# Cloudwatch EC2 Monitor
define command {
    command_name  check_cloudwatch_ec2
    command_line  /usr/local/nagios/libexec/ec2-api/check_cloudwatch_status.rb -a $_HOSTPUBLICDNS$ -i $_HOSTINSTANCEID$ -f "/usr/local/nagios/libexec/ec2-api/ec2_credentials.cfg" -C $ARG1$ --warning $ARG2$ --critical $ARG3$
}
Looking at the actual check_cloudwatch_status.rb file, it appears to return null only when the datapoint set it gets from the Amazon API is null. My Nagios instance is itself running under heavy load (load averages of 15, 16, 18) at the moment because it's an Amazon m1.small instance; could that be affecting its ability to run properly?
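To make the "empty datapoint set" case concrete, here is a minimal Python sketch of how a check could treat it. This is not the plugin's actual code (the plugin is a Ruby script, and I have not verified what time window it requests). The function names `query_window` and `evaluate`, the 300-second period, and the three-period lookback are all assumptions for illustration. The datapoints are assumed to be shaped like the `Datapoints` list that the CloudWatch GetMetricStatistics call returns, with a `Timestamp` and a statistic such as `Average`.

```python
from datetime import datetime, timedelta, timezone

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def query_window(now, period_s=300, lookback_periods=3):
    """Return (start, end) spanning several whole periods.

    Assumption: asking for more than one period makes it less likely that a
    check which runs late ends up with an empty result.
    """
    start = now - timedelta(seconds=period_s * lookback_periods)
    return start, now


def evaluate(datapoints, warning, critical, stat="Average"):
    """Map a list of CloudWatch-style datapoints to (exit_code, message)."""
    if not datapoints:
        # An empty set says nothing about the instance itself, so this
        # sketch reports UNKNOWN rather than a hard failure.
        return UNKNOWN, "No AWS CloudWatch Datapoint retrieved"

    # Use the most recent datapoint in the window.
    latest = max(datapoints, key=lambda d: d["Timestamp"])
    value = latest[stat]

    if value >= critical:
        return CRITICAL, f"CRITICAL - {stat}={value}"
    if value >= warning:
        return WARNING, f"WARNING - {stat}={value}"
    return OK, f"OK - {stat}={value}"
```

With the thresholds from the service definition above, `evaluate(datapoints, 80, 90)` would return UNKNOWN for an empty list and otherwise judge only the newest datapoint in the window.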
-
slansing
Re: Why Am I getting No AWS CloudWatch Datapoint Retrieved
Depending on what the plugin is querying, load could definitely have an effect. Is there a test instance where you can keep the load relatively low, run your checks against it, and then see whether they start failing when you intentionally spike the load? Because the problem is intermittent, I would otherwise think it is something on the AWS side.
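One way to run that comparison is to record the load average next to each check result and count the failures in each load range. Below is a rough Python sketch, assuming a Unix-like host (`os.getloadavg` is not available on Windows). The helper names `run_once` and `summarize` and the load threshold are hypothetical; the marker string is the message quoted in the first post.

```python
import os
import subprocess

# Text the plugin prints when no datapoint comes back (from the alert above).
MARKER = "No AWS CloudWatch Datapoint retrieved"


def run_once(command):
    """Run the check command once and record the 1-minute load average."""
    load1, _, _ = os.getloadavg()
    result = subprocess.run(command, capture_output=True, text=True)
    return {
        "load1": load1,
        "exit_code": result.returncode,
        "no_datapoint": MARKER in result.stdout,
    }


def summarize(samples, load_threshold):
    """Count (failures, total) for samples at/above and below the threshold."""
    counts = {"high": [0, 0], "low": [0, 0]}
    for sample in samples:
        bucket = "high" if sample["load1"] >= load_threshold else "low"
        counts[bucket][1] += 1
        if sample["no_datapoint"]:
            counts[bucket][0] += 1
    return {name: tuple(pair) for name, pair in counts.items()}
```

You would call `run_once` with the same command line Nagios uses (as a list of arguments) at regular intervals, collect the results in a list, and pass that list to `summarize`. If nearly all of the "no datapoint" results land in the "high" bucket, that points at load on the Nagios host rather than at AWS.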
Re: Why Am I getting No AWS CloudWatch Datapoint Retrieved
That seemed to be the issue: on a system with lower load it was a non-issue and I started to see the problems as the load increased.
Thank you for the help!
Re: Why Am I getting No AWS CloudWatch Datapoint Retrieved
I'm glad your issue has been resolved! I am locking this topic. If you have any other questions or issues, please start a new thread.