Service management/Initial State

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Elcom
Posts: 15
Joined: Wed Jul 15, 2020 3:15 am

Service management/Initial State

Post by Elcom »

Hi, I am running a PowerShell script that stops a service, waits a number of seconds, restarts the service, and then sends an email. The flow is as follows:

Service enters critical state

Calls this command
$USER1$/check_ncpa.py -H $HOSTADDRESS$ -t 'mytoken' -P 5693 -M 'plugins/myscript.ps1' -q "args=-servicename myservice -waittorestart 5"

Which in turn calls myscript.ps1

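# Parameters match the -servicename and -waittorestart args passed through NCPA
param([string]$servicename, [int]$waittorestart)
Stop-Service $servicename
Start-Sleep -Seconds $waittorestart
Start-Service $servicename
postie -host:my host -to:my to email address -from:my from email address -s:"My subject" -msg:"my message here"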

My issue is that the script seems to run and restart the service when the check enters critical, and then it runs again when the check changes back from critical to OK.

Is there any way to configure my service, perhaps using "initial state" in Service Management, to prevent this? I want the script to run only once, at critical, and not run again after the next check interval when the service is in the OK state.
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Service management/Initial State

Post by vtrac »

Hi Elcom,
Based on what you have set up, the "check_ncpa.py" script will run your "myscript.ps1" script regardless of the service status, because no condition is defined for "check_ncpa.py" to use.

You might want to put a condition inside "myscript.ps1" that decides when to restart the service.

As for calling "check_ncpa.py", I believe the "args" should be separated by commas, like this:

check_ncpa.py -H $HOSTADDRESS$ -t 'mytoken' -P 5693 -M 'plugins/myscript.ps1' -q "args=-servicename myservice,args=-waittorestart 5"
or with quotes:

check_ncpa.py -H $HOSTADDRESS$ -t 'mytoken' -P 5693 -M 'plugins/myscript.ps1' -q "args=-servicename 'myservice',args=-waittorestart '5'"
Regards,
Vinh
Elcom
Posts: 15
Joined: Wed Jul 15, 2020 3:15 am

Re: Service management/Initial State

Post by Elcom »

Hi vtrac,
Thank you for your reply. My args seem to run OK; the issue is just how to force the script to run only on 'critical'. I am unsure how to put a condition for this in my PS1 script so that it only runs at critical.
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Service management/Initial State

Post by vtrac »

Hi Elcom,
"check_ncpa.py" is just the plugin that Nagios XI calls to ask the remote NCPA agent to perform a check, such as a CPU or disk check.

In this case, it is asked to run a PowerShell script. Your PS script should return "0" (OK), "1" (Warning) or "2" (Critical) as its exit status.
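
For reference, the exit-status convention mentioned above can be written out as a small shell helper. This is only an illustrative sketch: state_name is a made-up helper name, not part of Nagios or NCPA, and it covers just the standard plugin codes 0 to 3.

```shell
#!/bin/sh
# Map a plugin exit code to the state name used by the plugin convention.
# state_name is a hypothetical helper used only for this illustration.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) return 1 ;;   # outside the 0-3 convention: print nothing, signal failure
    esac
}
```

So a script that always exits 0 will always show as OK in XI, whatever it did on the server.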

I still do NOT understand what you want to do. I know that this script will wait 5 seconds and then restart your service, but what is the purpose of this service in the first place? You are not checking anything at all; you are just calling your custom script to run.

Regards,
Vinh
Elcom
Posts: 15
Joined: Wed Jul 15, 2020 3:15 am

Re: Service management/Initial State

Post by Elcom »

Good morning Vinh,
The purpose of the script is to restart a service on the Windows server when a connected drive is no longer visible. My XI service checks for a string on the connected drive. When it cannot see the string, it changes to critical, and the script on the Windows server stops myservice, waits 5 seconds, then starts myservice. The only issue I have currently is that the script runs and restarts myservice both when my XI service reports a critical error AND when the service changes back to OK.

You mentioned previously that I could put a condition in my PS1 script. Can you advise what I would need to put in there to restart myservice only when XI reports critical and not OK?

Elcom
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Service management/Initial State

Post by vtrac »

Hi Elcom,
I'm sorry, but I don't see any checking here.
You set up a service on Nagios XI that executes the command below every few minutes, or whatever the check interval is:

$USER1$/check_ncpa.py -H $HOSTADDRESS$ -t 'mytoken' -P 5693 -M 'plugins/myscript.ps1' -q "args=-servicename myservice -waittorestart 5"
Please explain to me what or where the "check" is in the command above.

From what I see, "myscript.ps1" will run every time the Nagios XI service is called.

I am not a PowerShell programmer, but you should check for the "connected drive" inside "myscript.ps1" every time it runs.
If the drive is not found, restart the service and return status "0" if the restart succeeds.
Here's an example:

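    if (-not (Test-Path X:)) {
        # Drive not found: restart the service that reconnects it.
        Stop-Service $servicename
        Start-Sleep -Seconds $waittorestart
        Start-Service $servicename
        exit 0
    }
    else { Write-Host "The X: drive is already in use." }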
Regards,
Vinh
Elcom
Posts: 15
Joined: Wed Jul 15, 2020 3:15 am

Re: Service management/Initial State

Post by Elcom »

Hi Vinh,
I am sorry if I did not explain properly; I will try now.

XI has a service set up for myserver. It runs every few minutes and checks for access to a file on a connected drive, which is given in the args. If all is well, nothing happens.

If the file is not accessible, an event handler is called, which runs the command you saw previously:

$USER1$/check_ncpa.py -H $HOSTADDRESS$ -t 'mytoken' -P 5693 -M 'plugins/myscript.ps1' -q "args=-servicename myservice -waittorestart 5"

When the event handler calls this command, it restarts a service on myserver, which reconnects the drive. The next time the check runs, the file is accessible, so there should be no further call to the event handler or to myscript.ps1.

I need to understand how to call the script on the server only once, whether that is done in myscript.ps1 or through the args in my service, so that it only runs when there is no access to the file.

Would something like $SERVICESTATE$ "Critical" as an arg in my service force this to run only when the service is critical?

Thanks for your replies
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Service management/Initial State

Post by vtrac »

Hi Elcom,
Thank you for your detailed explanation, now I understand what you are trying to do .... :-)

Yes, looks like you will need to pass in the "$SERVICESTATE$" to your "myscript.ps1" script, then check the state change with the if/then.
Here's an example using shell script (sorry, I'm not a PS programmer):

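#!/bin/bash
# $1 is the service state passed in from the $SERVICESTATE$ macro.
SERVICESTATE=$1
if [[ "$SERVICESTATE" == "CRITICAL" ]]
then
    # restart your service here
    exit 0
else
    # any other state: do nothing
    exit 0
fi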
I think you get the picture here ... :-)
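
Nagios runs event handlers on state changes, including the recovery back to OK, which fits the second run you are seeing. One way to apply the check above is a small wrapper on the Nagios XI server that makes the NCPA call only when the state is CRITICAL. This is a minimal sketch, not a tested setup: it assumes your event handler command passes $SERVICESTATE$ as the first argument, and run_on_critical and PLUGIN_DIR are hypothetical names I made up for the illustration.

```shell
#!/bin/sh
# Sketch of an event handler wrapper that lives on the Nagios XI server.
# run_on_critical is a hypothetical name; it is not part of Nagios or NCPA.
#
# $1     = service state, passed in from the $SERVICESTATE$ macro
# $2...  = the command to run when the state is CRITICAL
run_on_critical() {
    state=$1
    # Drop the state so "$@" holds only the command and its arguments.
    if [ "$#" -gt 0 ]; then
        shift
    fi

    case "$state" in
        CRITICAL)
            # Problem state: run the restart command exactly as given.
            "$@"
            ;;
        *)
            # OK, WARNING, UNKNOWN or empty: do nothing.
            :
            ;;
    esac
}

# Example call, commented out so the sketch has no side effects.
# PLUGIN_DIR is an assumed variable pointing at your plugin directory:
# run_on_critical "$1" "$PLUGIN_DIR/check_ncpa.py" -H "$2" -t 'mytoken' -P 5693 \
#     -M 'plugins/myscript.ps1' -q "args=-servicename myservice -waittorestart 5"
```

With this, the PowerShell script itself does not need to change, because it is never called on the OK transition. If you also want to skip soft states, the same case statement could test $SERVICESTATETYPE$ as a second argument.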

Here's a document that will give you more details about Event Handlers:
https://assets.nagios.com/downloads/nag ... ios-XI.pdf


Regards,
Vinh