Exit code - Python script

amitdaniel
Posts: 25
Joined: Mon Jul 01, 2013 9:17 am

Exit code - Python script

Post by amitdaniel »

Hey,

I have written a Python script that connects to the CloudWatch (AWS) API
and does some nice stuff.

When I run this script in Nagios I get this error in the UI: (No output on stdout) stderr: Traceback (most recent call last):

In the script I define the exit codes like this:

UNKNOWN = -1
OK = 0
WARNING = 1
CRITICAL = 2

When I leave the script I use these commands:

print "OK"
sys.exit(0)

OR

print "CRITICAL"
sys.exit(2)

Can you help me understand what this error is?

Thanks to all!
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Exit code - Python script

Post by slansing »

Can we see your entire script? Do you know where your script is bailing out, i.e. on what line?
amitdaniel
Posts: 25
Joined: Mon Jul 01, 2013 9:17 am

Re: Exit code - Python script

Post by amitdaniel »

Sure, this is the full script.

#!/usr/bin/env python

# Exit statuses recognized by Nagios
UNKNOWN = -1
OK = 0
WARNING = 1
CRITICAL = 2



# Import the SDK
import boto
import uuid
import boto.ec2.cloudwatch
import datetime
import os
import sys
import time
import itertools
def main():

        file = open('/tmp/out.txt', 'w')
        # access keys in these environment variables:


        AWS_ACCESS_KEY_ID= 'XXXXXXXXXXXXXXXXXXXXXXXX'
        AWS_SECRET_ACCESS_KEY= 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'



        # This sets up the connection information to CloudWatch :

        Cloudwatch_Connection_Virginia = boto.ec2.cloudwatch.connect_to_region ("us-east-1")
        Cloudwatch_Connection_Europe = boto.ec2.cloudwatch.connect_to_region ("eu-west-1")

        # Take all the Metrics list in the Region :

        Metrics_Virginia = Cloudwatch_Connection_Virginia.list_metrics()
        Metrics_Europe = Cloudwatch_Connection_Europe.list_metrics()

        # Define the date for now and 1 hour before

        Current_Time = datetime.datetime.now()
        Last_Hour_Time = Current_Time - datetime.timedelta(hours=1)


        # Get all Alarms in ALARM State !

        Alarm_Virginia = Cloudwatch_Connection_Virginia.describe_alarms(None, None, None, None, "ALARM", None)
        Alarm_Europe = Cloudwatch_Connection_Europe.describe_alarms(None, None, None, None, "ALARM", None)

        # Check if there are Errors If NO -> print OK and exit with OK status code.

        if len (Alarm_Virginia) == 0 and len (Alarm_Europe) == 0:

                file.write( "OK - No Alarms in Virginia & Europe Regions , Everything is Fine !!")

                sys.exit(0)

        # Initialize error variable :

        error = ""

        # Start the FOR Loop to get all alarms in EU & Virginia Regions :

        for alarm_v in Alarm_Virginia:
                error = error + alarm_v.name + " ,"

        for alarm_e in Alarm_Europe:
                error = error + alarm_e.name + " ,"

        file.write ( "CRITICAL - There is Alarms in Virginia & Europe Regions Please check and fix them => " + error)
        sys.exit(2)

        file.close()
if __name__ == "__main__":
    main()

Please let me know if you need more info.

Thanks a lot!
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Exit code - Python script

Post by tmcdonald »

In your first post you said you are printing OK, but in the script you posted there is no print statement. The check will fail in Nagios if there is no stdout output.

https://nagios-plugins.org/doc/guidelines.html
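
To make the stdout point concrete, here is a minimal sketch of an exit path that always prints one status line before exiting, instead of writing the message to a file such as /tmp/out.txt. The helper names `status_line` and `finish` are hypothetical, and the sketch is written for Python 3, whereas the posted script uses Python 2 print statements.

```python
import sys

# Return codes as described in the plugin guidelines linked above
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def status_line(code, message):
    """Build the single line of text that Nagios displays for the check."""
    return "%s - %s" % (LABELS[code], message)


def finish(code, message):
    """Print the status line to stdout, then exit with the matching code."""
    print(status_line(code, message))
    sys.stdout.flush()
    sys.exit(code)
```

A call such as `finish(CRITICAL, "alarms: " + error)` would then replace the separate write and `sys.exit(2)` calls in the script.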
Former Nagios employee
amitdaniel
Posts: 25
Joined: Mon Jul 01, 2013 9:17 am

Re: Exit code - Python script

Post by amitdaniel »

My mistake.

This is the updated script with the prints:

Code: Select all

#!/usr/bin/env python

# Exit statuses recognized by Nagios
UNKNOWN = -1
OK = 0
WARNING = 1
CRITICAL = 2



# Import the SDK
import boto
import uuid
import boto.ec2.cloudwatch
import datetime
import os
import sys
import time
import itertools
def main():


        # access keys in these environment variables:

        AWS_ACCESS_KEY_ID= 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
        AWS_SECRET_ACCESS_KEY= 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'



        # This sets up the connection information to CloudWatch :

        Cloudwatch_Connection_Virginia = boto.ec2.cloudwatch.connect_to_region ("us-east-1")
        Cloudwatch_Connection_Europe = boto.ec2.cloudwatch.connect_to_region ("eu-west-1")

        # Take all the Metrics list in the Region :

        Metrics_Virginia = Cloudwatch_Connection_Virginia.list_metrics()
        Metrics_Europe = Cloudwatch_Connection_Europe.list_metrics()

        # Define the date for now and 1 hour before

        Current_Time = datetime.datetime.now()
        Last_Hour_Time = Current_Time - datetime.timedelta(hours=1)

        # Get all Alarms in ALARM State !

        Alarm_Virginia = Cloudwatch_Connection_Virginia.describe_alarms(None, None, None, None, "ALARM", None)
        Alarm_Europe =  Cloudwatch_Connection_Europe.describe_alarms(None, None, None, None, "ALARM", None)

        # Check if there are Errors If NO -> print OK and exit with OK status code.

        if len (Alarm_Virginia) == 0 and  len (Alarm_Europe) == 0:

                print "OK - No Alarms in Virginia & Europe Regions , Everything is Fine !!"

                sys.exit(0)

        # Initialize error variable :

        error = ""

        # Start the FOR Loop to get all alarms in EU & Virginia Regions :

        for alarm_v in Alarm_Virginia:
                error = error + alarm_v.name + " ,"

        for alarm_e in Alarm_Europe:
                error = error + alarm_e.name + " ,"

        print "CRITICAL - There is Alarms in Virginia & Europe Regions Please check and fix them => " +  error
        sys.exit(2)

if __name__ == "__main__":
    main()
Mod note: Please use code tags around code, as I have done in your post.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Exit code - Python script

Post by tmcdonald »

Can you show us what happens when you run the check from the command line as the nagios user?

Can you try replacing "#!/usr/bin/env python" with "#!/usr/bin/python" or wherever your python binary is located?

Also, the UNKNOWN status in Nagios is 3, not -1.
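
As a hedged illustration of why the value matters: on POSIX systems `sys.exit(-1)` reaches the parent process as exit status 255, which is not one of the four plugin states. The sketch below (Python 3; `run_check` and `main` are hypothetical names, not part of the posted script) maps any uncaught exception or unexpected code to UNKNOWN (3) and reports it on stdout rather than as a traceback on stderr.

```python
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_check(check):
    """Run check() and return (exit_code, text) without ever raising.

    check is expected to return an (exit_code, message) tuple.
    """
    try:
        code, message = check()
    except Exception as exc:
        # Report the failure as plugin output instead of a bare traceback
        return UNKNOWN, "UNKNOWN - %s: %s" % (type(exc).__name__, exc)
    if code not in (OK, WARNING, CRITICAL):
        # Anything outside the known states is treated as UNKNOWN
        code = UNKNOWN
    return code, message


def main(check):
    """Print the result on stdout and exit with the matching code."""
    code, text = run_check(check)
    print(text)
    sys.exit(code)
```

With this shape, the CloudWatch calls would live inside the function passed as `check`, so a connection or credentials error would show up in the UI as an UNKNOWN result with a readable message.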
amitdaniel
Posts: 25
Joined: Mon Jul 01, 2013 9:17 am

Re: Exit code - Python script

Post by amitdaniel »

Sure :-)

This is the output

[nagios@nagios libexec]$ python check_cloud_watch.py
CRITICAL - There is Alarms in Virginia & Europe Regions Please check and fix them => Onetag_V2_Latency ,

I changed the shebang line to #!/usr/bin/python
and changed the UNKNOWN exit code to 3.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Exit code - Python script

Post by tmcdonald »

And are you still getting that error in the Core interface?
amitdaniel
Posts: 25
Joined: Mon Jul 01, 2013 9:17 am

Re: Exit code - Python script

Post by amitdaniel »

Yes,

This is the error that I see in the UI:

Status warning:

(No output on stdout) stderr: Traceback (most recent call last):
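
The UI only shows the first line of the traceback, so one way to see the rest is to run the plugin with a stripped-down environment and capture both streams. This is a sketch under assumptions, not a diagnosis: the thread does not establish the cause, and `run_like_scheduler` is a hypothetical helper (Python 3.5+). It may be relevant that the posted script assigns the AWS keys to local variables but never passes them to `connect_to_region`, so boto has to find credentials somewhere else, such as environment variables or a config file.

```python
import os
import subprocess


def run_like_scheduler(argv, env=None):
    """Run argv with a nearly empty environment.

    Returns (exit_code, stdout_text, stderr_text). By default only PATH is
    passed on, which approximates a process not started from a login shell.
    """
    if env is None:
        env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}
    proc = subprocess.run(
        argv,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

For example, `code, out, err = run_like_scheduler(["./check_cloud_watch.py"])` run from the libexec directory would return the complete stderr text in `err`.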
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Exit code - Python script

Post by sreinhardt »

How do you have your command defined within Nagios to execute this script? Since you have a shebang line, you shouldn't need to specify "python check_cloud_watch.py"; try "./check_cloud_watch.py" instead. Of course, this means your script will need to be executable if it is not already.
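
The two preconditions for running the script directly, an executable bit and a shebang on the first line, can be checked with a short sketch like the following. `plugin_file_problems` is a hypothetical helper name, and the check only looks at the file itself, not at the Nagios command definition.

```python
import os


def plugin_file_problems(path):
    """Return a list of reasons why path could not be run directly."""
    if not os.path.isfile(path):
        return ["not a regular file: %s" % path]
    problems = []
    if not os.access(path, os.X_OK):
        problems.append("not executable for the current user (chmod +x)")
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append("first line is not a shebang (#!...)")
    return problems
```

Run as the nagios user, an empty list from `plugin_file_problems("check_cloud_watch.py")` would suggest that the file itself can be executed as "./check_cloud_watch.py".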
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.