Different check results depending how invoked
Posted: Thu Feb 13, 2020 11:01 am
by JGCG
Hi,
I'm currently creating a plugin which will allow us to monitor various services in AWS using the AWS SDK.
The SDK automatically loads the AWS access keys and configuration values from the ~/.aws/config and ~/.aws/credentials files.
Executing the check from the command line as the Nagios user works fine.
Executing the 'test check' when setting up the service works fine.
However, when the check is set up and run by Nagios, it fails because it is unable to load the configuration:
"MissingRegion: could not find region configuration"
Does something different happen when the check is run by Nagios, as opposed to running it from the command line as the Nagios user or running the test check?
Thanks.
Re: Different check results depending how invoked
Posted: Thu Feb 13, 2020 12:20 pm
by jdunitz
You said that you ran it as the nagios user, but I'd like to make sure you became nagios correctly.
Did you do "su - nagios" or "su nagios"? If you omit the -, your environment won't be pulled in correctly, which might cause this problem.
Another possibility is that your plugin creates temp files somewhere, and if you tested it as root, there may be leftover files owned by root rather than nagios, and those might be getting in the way. So, you might track those down and clean them up if necessary.
Let us know what you find!
--Jeffrey
Re: Different check results depending how invoked
Posted: Thu Feb 13, 2020 1:30 pm
by JGCG
I used "su - nagios".
I think I've found the cause. It looks as if the tilde (~) shorthand for the home directory isn't expanded when Nagios invokes the check, but it is expanded when the check is run from the service configuration page.
I've amended the code to manually load the credentials from the file based on an argument passed in.
When passing in '~/.aws/credentials', it works fine when testing, but not when Nagios calls the check: it reports that it can't find the file.
Passing in the absolute path instead works.
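For anyone hitting the same thing: one way to avoid depending on whoever launches the plugin to expand the tilde is to expand it inside the plugin. Below is a minimal Python sketch, assuming the plugin takes a --credfile argument like the one used with check_ec2.py later in this thread. The function names resolve_path and parse_args are made up for illustration and are not part of any Nagios or AWS API.

```python
import argparse
import os


def resolve_path(path):
    """Expand a leading ~ and return an absolute path.

    os.path.expanduser uses $HOME when it is set and otherwise falls back
    to the current user's entry in the password database, so this does not
    rely on a shell having expanded the tilde beforehand.
    """
    return os.path.abspath(os.path.expanduser(path))


def parse_args(argv=None):
    """Parse plugin arguments (hypothetical option set for illustration)."""
    parser = argparse.ArgumentParser(description="Hypothetical AWS check plugin")
    parser.add_argument(
        "--credfile",
        type=resolve_path,  # expand ~ no matter how the argument arrived
        default=resolve_path("~/.aws/credentials"),
        help="path to the AWS credentials file",
    )
    return parser.parse_args(argv)
```

With this in place, --credfile=~/.aws/credentials and the absolute path should resolve to the same file, whether or not a shell touched the argument first.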
I'm not quoting the argument when it is defined in the service configuration page or in the command.
Please see the attached image.
I'm still not sure why this would cause an issue with the SDK, though, as it automatically loads the file from the user's home directory. (The SDK doesn't allow you to specify the file to use, which is why I had to amend my code as described above to load the values from the file manually and pass them in as variables.) It does seem to be related to the issue above, however.
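A possible alternative to loading the values by hand: the AWS CLI and several AWS SDKs document the AWS_CONFIG_FILE and AWS_SHARED_CREDENTIALS_FILE environment variables for pointing at these files. Whether the SDK version in use here honors them is an assumption that would need checking. The Python sketch below only builds absolute paths and sets the variables if they are not already set; the function names are hypothetical.

```python
import os


def aws_file_env(home):
    """Return environment values that point at absolute AWS file paths.

    `home` is the home directory to use, e.g. "/home/nagios".
    """
    aws_dir = os.path.join(home, ".aws")
    return {
        "HOME": home,
        "AWS_CONFIG_FILE": os.path.join(aws_dir, "config"),
        "AWS_SHARED_CREDENTIALS_FILE": os.path.join(aws_dir, "credentials"),
    }


def apply_aws_file_env(home):
    """Set the variables only where the caller has not already set them."""
    for name, value in aws_file_env(home).items():
        os.environ.setdefault(name, value)
```

Calling apply_aws_file_env("/home/nagios") before the SDK client is created would be the intended use. The home directory shown is the one from the ls command requested later in this thread.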
Re: Different check results depending how invoked
Posted: Thu Feb 13, 2020 4:53 pm
by lmiltchev
I'm currently creating a plugin which will allow us to monitor various services in AWS using the AWS SDK.
This is usually not supported as it is a "custom" plugin. If using the absolute path works for you, we could close this topic and mark it as resolved. If you want us to keep digging into it, and help you figure out why using "~" does not work for you, we could try but cannot guarantee a solution. We would need to know/see:
1. OS/distro/architecture of your system
2. Your custom plugin
3. Service and command definitions
4. The output of the following commands:
grep nagios /etc/passwd
grep nag /etc/group
ls -lad /home/nagios /home/nagios/.aws
P.S.
I tried using the "shorthand" path with check_ec2.py and it worked just fine for me:
Example:
define service {
    host_name               Amazon
    service_description     CPU Credit Balance_copy_1
    use                     xiwizard_linuxserver_ping_service
    check_command           check_ec2!-P 10 --metricname CPUCreditBalance --instanceid 'i-xxx' --configfile=~/.aws/config --credfile=~/.aws/credentials --warning '100' --critical '25'!!!!!!!
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            xi_timeperiod_24x7
    notification_interval   60
    notification_period     xi_timeperiod_24x7
    notifications_enabled   1
    contacts                nagiosadmin
    _xiwizard               ec2
    register                1
}
Re: Different check results depending how invoked
Posted: Thu Feb 13, 2020 5:23 pm
by mbellerue
I've amended the code to manually load the credentials from the file based on an argument passed in.
When passing in '~/.aws/credentials', it works fine when testing, but not when Nagios calls the check: it reports that it can't find the file.
Passing in the absolute path instead works.
So on a standard install, the nagios user actually doesn't have a home directory. Given that it works when you test manually, I bet the home directory is set in /etc/passwd, but it may be worth a look just to be sure. The other thing is that if the nagios service hasn't been restarted since you've created the home directory, is that instance aware that there is a home directory?
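To compare what the password database says with what the running check actually sees, something like the following Python sketch could be dropped into the plugin temporarily and run both from the command line and through Nagios. The function name home_report and the returned keys are made up for illustration.

```python
import os
import pwd


def home_report(username=None):
    """Compare the passwd home directory with $HOME as seen by this process.

    With no username, the entry for the user running the process is used.
    """
    if username:
        entry = pwd.getpwnam(username)
    else:
        entry = pwd.getpwuid(os.getuid())
    return {
        "user": entry.pw_name,
        "passwd_home": entry.pw_dir,          # what /etc/passwd (or NSS) says
        "env_home": os.environ.get("HOME"),   # None if HOME is not set
        "home_exists": os.path.isdir(entry.pw_dir),
    }
```

If env_home differs between the two runs (or is missing when Nagios runs the check), that would point at the environment rather than the home directory itself.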
Re: Different check results depending how invoked
Posted: Thu Feb 13, 2020 6:09 pm
by JGCG
Thanks for the reply both. The home directory does exist - we're using the pre-built virtual appliance.
I'll do a bit more digging tomorrow to see if I can work this out, but for now I'm happy with the workaround I have in place and so the call can be closed if you like.
Thanks for your help.
Re: Different check results depending how invoked
Posted: Fri Feb 14, 2020 8:17 am
by scottwilkerson
Locking thread