Service management/Initial State

Posted: Fri Feb 12, 2021 11:56 am
by Elcom
Hi, I am running a PowerShell script that stops a service, waits a number of seconds, then restarts the service and sends an email, as below:

Service enters critical state

Calls this command
$USER1$/check_ncpa.py -H $HOSTADDRESS$ -t 'mytoken' -P 5693 -M 'plugins/myscript.ps1' -q "args=-servicename myservice -waittorestart 5"

Which in turn calls myscript.ps1

Stop-Service $servicename
Start-Sleep $waittorestart
Start-Service $servicename
postie -host:my host -to:my to email address -from:my from email address -s:"My subject" -msg:"my message here"

My issue is that the script seems to run and restart the service when the check enters critical, and then runs again when the check changes back from critical to OK.

Is there any way to configure my service, perhaps using "initial state" in Service Management? I want this script to run only once, at critical, and not again after the next check interval when the service is in the OK state.

Re: Service management/Initial State

Posted: Mon Feb 15, 2021 11:15 am
by vtrac
Hi Elcom,
Based on what you have set up, the "check_ncpa.py" script will run your "myscript.ps1" script no matter what the status is, because there is no defined condition for "check_ncpa.py" to use.

You might want to put a condition inside your "myscript.ps1" that decides when to restart the service.

As for calling "check_ncpa.py", I believe the "args" should be separated by commas, like this:

Code:

check_ncpa.py -H $HOSTADDRESS$ -t 'mytoken' -P 5693 -M 'plugins/myscript.ps1' -q "args=-servicename myservice,args=-waittorestart 5"
or with quotes:

Code:

check_ncpa.py -H $HOSTADDRESS$ -t 'mytoken' -P 5693 -M 'plugins/myscript.ps1' -q "args=-servicename 'myservice',args=-waittorestart '5'"
Regards,
Vinh

Re: Service management/Initial State

Posted: Mon Feb 15, 2021 11:33 am
by Elcom
Hi vtrac,
Thank you for your reply. My args seem to run OK; the problem is how to force the script to run only on 'critical'. I am unsure how to put a condition for this in my PS1 script so that it only runs at critical.

Re: Service management/Initial State

Posted: Mon Feb 15, 2021 12:27 pm
by vtrac
Hi Elcom,
"check_ncpa.py" is just a plugin that Nagios XI calls to run a check against the remote NCPA agent, such as a CPU or disk check.

In this case, it is asked to run a PowerShell script. Your PowerShell script should return "0" (OK), "1" (Warning) or "2" (Critical) as its exit status.
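As a rough illustration of those return codes, here is a minimal shell sketch of a plugin-style check. The function name `check_path` and the example path are hypothetical and not part of the original setup; a script run through NCPA would do the equivalent in PowerShell by printing one status line and exiting with the matching code.

```shell
#!/bin/sh
# Hypothetical plugin-style check: prints one status line and returns
# the Nagios plugin code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
check_path() {
    target="$1"
    if [ -z "$target" ]; then
        echo "UNKNOWN - no path given"
        return 3
    fi
    if [ -e "$target" ]; then
        echo "OK - $target is accessible"
        return 0
    fi
    echo "CRITICAL - $target is not accessible"
    return 2
}

# Example use as a standalone plugin (the path is a placeholder):
# check_path /mnt/share/marker.txt; exit $?
```

Nagios reads the exit code to set the service state and shows the printed line as the status text.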

I still do NOT understand what you want to do. I know that this script will wait 5 seconds and then restart your service, but what is the purpose of this service in the first place? You are not checking anything at all, just calling your custom script.

Regards,
Vinh

Re: Service management/Initial State

Posted: Tue Feb 16, 2021 4:04 am
by Elcom
Good morning Vinh,
The purpose of the script is to restart a service on the Windows server when the server cannot see a connected drive. In Nagios XI, when my service reports that it cannot see a string on the connected drive, it changes to critical. When that happens, the script stops myservice on the Windows server, waits 5 seconds, then starts myservice. The only issue I currently have is that the script runs and restarts myservice both when my XI service reports a critical error AND when the service changes back to OK.

You mentioned previously that I could put a condition in my PS1 script. Can you advise what I would need to put in there so that myservice is only restarted when XI reports critical, and not when it reports OK?

Elcom

Re: Service management/Initial State

Posted: Tue Feb 16, 2021 3:29 pm
by vtrac
Hi Elcom,
I'm sorry, but I don't see any checking here.
You set up a service on Nagios XI that will execute the command below every few minutes, or whatever the check interval is:

Code:

$USER1$/check_ncpa.py -H $HOSTADDRESS$ -t 'mytoken' -P 5693 -M 'plugins/myscript.ps1' -q "args=-servicename myservice -waittorestart 5"
Please explain to me what the "check" is, or where it is, in the command above.

From what I see, "myscript.ps1" will run every time the Nagios XI service is called.

I am not a PowerShell programmer, but you should check for the "connected drive" inside your "myscript.ps1" every time it is run.
If the drive is not found, restart the service and return the status "0" if the restart succeeds.
Here's an example:

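Code:

    # Uses the $servicename and $waittorestart parameters of myscript.ps1
    if (!(Test-Path X:))
    {
        # Drive not found: restart the service, then report OK
        Stop-Service $servicename
        Start-Sleep $waittorestart
        Start-Service $servicename
        exit 0
    }
    else { Write-Host "The X: drive is already in use." }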
Regards,
Vinh

Re: Service management/Initial State

Posted: Wed Feb 17, 2021 3:55 am
by Elcom
Hi Vinh,
I am sorry if I did not explain properly. I will try now.

XI has a service set up for myserver. It runs every few minutes and checks for access to a file on a connected drive, which is specified in the args. If all is well, nothing happens.

If the file is not accessible, an event handler is called, which runs the command you saw previously:

$USER1$/check_ncpa.py -H $HOSTADDRESS$ -t 'mytoken' -P 5693 -M 'plugins/myscript.ps1' -q "args=-servicename myservice -waittorestart 5"

When the event handler calls this command, it restarts a service on myserver, which reconnects the drive. The next time the check runs, the file is accessible, so there should be no further call to the event handler and therefore none to myscript.ps1.

I need to understand how to call the script on the server only once, whether that is done in myscript.ps1 or through the args in my service, so that it only runs when there is no access to the file.

Would something like $SERVICESTATE$ "Critical" as an arg in my service force this to only run when the service is critical?

Thanks for your replies

Re: Service management/Initial State

Posted: Wed Feb 17, 2021 3:51 pm
by vtrac
Hi Elcom,
Thank you for your detailed explanation, now I understand what you are trying to do .... :-)

Yes, it looks like you will need to pass "$SERVICESTATE$" in to your "myscript.ps1" script, then check the state with an if/then.
Here's an example using shell script (sorry, I'm not a PS programmer):

Code:

SERVICESTATE=$1
if [[ "$SERVICESTATE" == "CRITICAL" ]]
then
    # restart your service here
    :
    exit 0
else
    # do nothing
    exit 0
fi
I think you get the picture here ... :-)
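Nagios runs event handlers on recovery as well as on problem states, which is why the restart fires a second time when the service returns to OK, so the filtering has to happen in the handler itself. Below is a minimal sketch of a wrapper that could sit on the Nagios XI server in front of the existing call. The function names, the argument order, and the use of "$SERVICESTATETYPE$" alongside "$SERVICESTATE$" are assumptions for illustration, not something taken from this setup.

```shell
#!/bin/sh
# Hypothetical event handler wrapper, run on the Nagios XI server.
# Assumed arguments from the command definition:
#   $SERVICESTATE$ $SERVICESTATETYPE$ $HOSTADDRESS$

# Succeeds only for a confirmed (HARD) CRITICAL state.
should_restart() {
    state="$1"       # OK, WARNING, UNKNOWN or CRITICAL
    state_type="$2"  # SOFT or HARD
    [ "$state" = "CRITICAL" ] && [ "$state_type" = "HARD" ]
}

# Decides what to do for one event and always returns 0.
handle_event() {
    if should_restart "$1" "$2"; then
        echo "restart requested for $3"
        # Call check_ncpa.py here with the same arguments as before.
    else
        echo "no action for state $1/$2"
    fi
    return 0
}

# Entry point when saved as a script:
# handle_event "$1" "$2" "$3"
```

With this in place, a recovery event such as `OK HARD` falls through to the "no action" branch. Drop the state-type test if the restart should also happen on SOFT critical results.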

Here's a document that will give you more details about Event Handlers:
https://assets.nagios.com/downloads/nag ... ios-XI.pdf


Regards,
Vinh