Hi,
I'm currently using the EC2 wizard v1.1.2, and it works fine for adding new servers to be monitored.
The issue appears later and occurs randomly: suddenly some services go to unknown status with the error "The check has received a response with no data. This is generally caused by an incorrect region name, invalid metric name, or invalid instance ID." Later they come back to "ok", and later still they go to "unknown" status again.
What could cause this issue?
Thanks
EC2 monitoring unknown status
benjaminsmith
Re: EC2 monitoring unknown status
Hello @sgargano,
Intermittent failures like this are typically related to the network connection. Double-check the AWS credentials; as mentioned in the post below, the most common source of this error is an invalid region name.
Unable to fetch the details from ec2 instance
Next, run the plugin check from the command line with the verbose option -v to display more data to help troubleshoot the error, and post the results for us to review. Thanks.
Nagios XI - How To Test Check Commands From The Command-line
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: EC2 monitoring unknown status
The region name is definitely correct: "eu-central-1"
The very weird thing is that the check works from the shell, but at the same time we have the issue under Nagios XI.
Code: Select all
################ AWS Response Data ################
Namespace: AWS/EC2
Instance ID: i-xxxxxxxxx
Metric Name: NetworkOut
Period: 300 seconds
Unit of measure: Bytes
Timestamp: 2019-10-04 06:52:00+00:00
################ Datapoints ################
Statistic: Average
Value: 862208.4
Warning Threshold: 1000000000
Critical Threshold: 2000000000
Return Code: 0
Statistic: Minimum
Value: 733503.0
Warning Threshold: 1000000000
Critical Threshold: 2000000000
Return Code: 0
Statistic: Maximum
Value: 1169452.0
Warning Threshold: 1000000000
Critical Threshold: 2000000000
Return Code: 0
Statistic: Sum
Value: 4311042.0
Warning Threshold: 1000000000
Critical Threshold: 2000000000
Return Code: 0
OK: Network Out Sum - 4311042.0 (Average: 862208.4B, Minimum: 733503.0B, Maximum: 1169452.0B) | NetworkOut=4311042.0B;1000000000;2000000000;;;
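Since the problem is intermittent, one successful run from the shell may simply have landed in a good moment. Below is a minimal sketch that runs a check command several times and tallies the results by the standard Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The helper name `tally_check_results` and its parameters are hypothetical and only for illustration; the command you pass in would be your own check_ec2.py invocation.

```python
import subprocess
import time
from collections import Counter

# Standard Nagios plugin return codes.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def tally_check_results(command, runs=10, pause_seconds=0.0):
    """Run `command` (a list of arguments) `runs` times and count the states.

    Hypothetical helper: it only wraps subprocess.run and maps the exit
    code of each run to a Nagios state name.
    """
    counts = Counter()
    for _ in range(runs):
        result = subprocess.run(command, capture_output=True, text=True)
        # Any exit code outside 0-2 is treated as UNKNOWN.
        state = NAGIOS_STATES.get(result.returncode, "UNKNOWN")
        counts[state] += 1
        if state != "OK":
            # Show the plugin output of every non-OK run for later review.
            print(result.stdout.strip() or result.stderr.strip())
        time.sleep(pause_seconds)
    return counts


# Usage (placeholder values, fill in your own):
# command = ["/usr/local/nagios/libexec/check_ec2.py", "-P", "5",
#            "--metricname", "NetworkOut", "--instanceid", "i-xxxxxxxxx",
#            "--accesskeyid", "xxx", "--secretaccesskey", "xxx",
#            "--region", "eu-central-1",
#            "--warning", "1000000000", "--critical", "2000000000"]
# print(tally_check_results(command, runs=30, pause_seconds=60))
```

Spacing the runs about a minute apart over half an hour or more gives a better chance of catching an UNKNOWN result than a single manual run.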
Re: EC2 monitoring unknown status
It seems that these three metrics (CPUCreditBalance, NetworkPacketsIn, and NetworkPacketsOut) are updated on the Amazon side roughly every 5 minutes, so the data won't always be available when you use "-P 5". Try increasing your period to 10.
Let us know if this helped.
Example:
Code: Select all
/usr/local/nagios/libexec/check_ec2.py -P 10 --metricname NetworkPacketsIn --instanceid 'xxx' --accesskeyid 'xxx' --secretaccesskey 'xxx' --region 'us-east-1' --warning '1750000' --critical '3500000'
OK: Network Packets In Sum - 64.0 (Average: 12.8, Minimum: 5.0, Maximum: 31.0) | NetworkPacketsIn=64.0;1750000;3500000;;;
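To illustrate why a 5-minute period can come back empty now and then, here is a small toy model. It is not the plugin's actual logic. It assumes that a datapoint is stamped every 5 minutes, that it only becomes queryable some minutes later (the 2-minute delay is an invented figure for illustration), and that the check looks at the window from "now minus period" to "now". The function name `visible_datapoints` is hypothetical.

```python
def visible_datapoints(now_min, window_min, interval_min=5, delay_min=2):
    """Return the datapoint timestamps (in minutes) a query would see.

    Toy model, all values are assumptions: datapoints are stamped at
    multiples of interval_min and become queryable delay_min later.
    The query window is [now_min - window_min, now_min].
    """
    start = now_min - window_min
    # First datapoint timestamp at or after the window start (ceiling division).
    first = -(-start // interval_min) * interval_min
    return [
        t
        for t in range(first, now_min + 1, interval_min)
        if t + delay_min <= now_min  # already published at query time
    ]


# At minute 11 a 5-minute window only covers the datapoint stamped at
# minute 10, which is not published yet in this model.
print(visible_datapoints(11, 5))   # []
print(visible_datapoints(11, 10))  # [5]

# Count how many start minutes within one hour would return no data.
empty_5 = sum(1 for m in range(60) if not visible_datapoints(m, 5))
empty_10 = sum(1 for m in range(60) if not visible_datapoints(m, 10))
print(empty_5, empty_10)  # 12 0
```

In this model a 10-minute window always contains at least one datapoint that is old enough to be published, which matches the suggestion to use "-P 10".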