Return code of 66 for service 'Windows System Services

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
kqWM4tgVX66wUbLu
Posts: 5
Joined: Tue Mar 30, 2021 9:18 am

Post by kqWM4tgVX66wUbLu »

Hello.

Has anyone seen this before? In Nagios XI, one Windows host is failing the Windows System Services check.

Host is Windows 2019.

Message: - Service State is Critical. - Error is "Return code of 66 for service 'Windows System Services".

I checked the services and no automatic services are stopped; all are running.

I restarted the Nagios XI services and also checked "C:\Program Files (x86)\Nagios\NCPA\plugins" and the plugin "update_check_services.ps1" is present.

If I copy that PowerShell script, comment out the "on success" exit line, and then run it manually, I get a message saying all services are started.

I then deleted that host from Nagios XI and re-added it, but after the services check runs I get the same issue.

This is the only server that has the issue.

Peter
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Return code of 66 for service 'Windows System Services

Post by pbroste »

Hello @kqWM4tgVX66wUbLu

Thanks for reaching out. Please run the "update_check_services.ps1" plugin from the command line so we can get the results and see what is going on.

Please enter the parameters that you were using for the args:

/usr/local/nagios/libexec/check_ncpa.py -H yourhostaddresshere -t yourtokenhere -M '/plugins/update_check_services.ps1' -w xx -c xx
Let us know how things look,
Perry
kqWM4tgVX66wUbLu
Posts: 5
Joined: Tue Mar 30, 2021 9:18 am

Re: Return code of 66 for service 'Windows System Services

Post by kqWM4tgVX66wUbLu »

Hi,

Thanks for replying.

Typical IT thing: when I checked that host this morning in Nagios XI, error 66 was no longer being logged and everything was green.

I have run the command as you requested, though. The output is the same as when I manually ran the PowerShell script "update_check_services" on the host while it had the issue:

Results:

OK: All services running | ServicesRunning=107;0;0;0;0

Peter
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Return code of 66 for service 'Windows System Services

Post by pbroste »

Hello @kqWM4tgVX66wUbLu

Thanks for following up. It appears that Murphy's Law is at work.

There is an option to set a timeout on the NCPA service check by adding -T 120 (capital T; lowercase -t is the token), just in case we are not allowing enough time to get results back:
-T TIMEOUT, --timeout=TIMEOUT
Enforced timeout, will terminate plugins after this
amount of seconds. [60]
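
As a rough illustration of where the timeout switch goes, here is a minimal Python sketch that assembles the same check_ncpa.py command line with -T appended. The helper name build_check_command and the placeholder host and token values are hypothetical; only the -H, -t, -M and -T switches come from this thread.

```python
import shlex

# Hypothetical helper: assembles the argument list for check_ncpa.py.
# Only the switches (-H, -t, -M, -T) are taken from the thread; the
# function name and default values are illustrative assumptions.
def build_check_command(host, token, metric, timeout=120,
                        plugin="/usr/local/nagios/libexec/check_ncpa.py"):
    """Return the argv list for check_ncpa.py with an explicit -T timeout."""
    return [plugin, "-H", host, "-t", token, "-M", metric, "-T", str(timeout)]

argv = build_check_command("yourhostaddresshere", "yourtokenhere",
                           "/plugins/update_check_services.ps1")

# Print a copy-and-paste friendly command line (requires Python 3.8+).
print(shlex.join(argv))
```

Note that -t (token) and -T (timeout) are different switches, so both appear in the resulting command.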
Please let us know if you need anything further.

Thanks,
Perry
kqWM4tgVX66wUbLu
Posts: 5
Joined: Tue Mar 30, 2021 9:18 am

Re: Return code of 66 for service 'Windows System Services

Post by kqWM4tgVX66wUbLu »

Thanks.

If it happens again I will let you know, and I will also try that timeout setting.
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Return code of 66 for service 'Windows System Services

Post by pbroste »

Sounds like a plan,
Perry
kqWM4tgVX66wUbLu
Posts: 5
Joined: Tue Mar 30, 2021 9:18 am

Re: Return code of 66 for service 'Windows System Services

Post by kqWM4tgVX66wUbLu »

Hi Perry,

The problem reared its head again this morning.

Manually running

/usr/local/nagios/libexec/check_ncpa.py -H host -t token -M '/plugins/update_check_services.ps1' -w xx -c xx

from the Nagios XI server returned no output, just an empty new line.

Running the "update_check_services" PowerShell script on the Windows host itself returned the message "All Services Are Running".

I then restarted the NCPA Listener and NCPA Passive services on the host and re-ran check_ncpa.py from the Nagios XI server and it then returned:

OK: All services running | ServicesRunning=122;0;0;0;0

I didn't get a chance to try the -T option, and I get the impression it wouldn't have helped, but is that something I can append to the check_ncpa.py command?

check_ncpa.py -H host -t token -M '/plugins/update_check_services.ps1' -w xx -c xx

Peter
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Return code of 66 for service 'Windows System Services

Post by pbroste »

Hello @kqWM4tgVX66wUbLu

Thanks for following up. It is strange that there is no output from the full command. Nicely done, though: we now know that the PowerShell script works independently.

I am wondering whether something with the ncpa_listener service is causing it to get "stuck". Next time, add --verbose to the check_ncpa.py command to see what results you get. Before running that, check the status of the ncpa_listener service to see whether it looks wrong.

Thanks,
Perry
kqWM4tgVX66wUbLu
Posts: 5
Joined: Tue Mar 30, 2021 9:18 am

Re: Return code of 66 for service 'Windows System Services

Post by kqWM4tgVX66wUbLu »

Hi Perry,

Same thing again this morning.

It is not critical, as my observation is that if I do nothing the Services check eventually goes green again, but it would be interesting to know what is causing the 66 error.

The NCPA Listener and Passive services are running.

I added the verbose switch to the check_ncpa.py command and got:

File returned contained:
{
"returncode": -1073741502,
"stdout": ""
}

Not sure what that means.


Peter
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Return code of 66 for service 'Windows System Services

Post by pbroste »

Hello @kqWM4tgVX66wUbLu

Thanks for following up and sending the error response.

Typically, when we see a 'returncode' value like this, we suspect a problem with the return code received from the plugin. More than likely, check_ncpa.py was not able to interpret what was returned.
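
One way to read the reported value is to reinterpret it as an unsigned 32-bit number, which is how Windows status codes are usually written. The minimal Python sketch below does that and also takes the low byte of the value. The helper name decode_returncode is hypothetical. The hex result, 0xC0000142, is the value Windows documents as STATUS_DLL_INIT_FAILED. That the "66" seen in Nagios XI is this same value truncated to 8 bits of exit status is an assumption that fits the arithmetic, not something confirmed in this thread.

```python
# Hypothetical helper: shows two views of the same 32-bit return code.
def decode_returncode(code):
    """Return (unsigned 32-bit value, low 8 bits) for a signed return code."""
    unsigned = code & 0xFFFFFFFF  # two's-complement view of the same 32 bits
    low_byte = code & 0xFF        # what remains if only 8 bits of exit status are kept
    return unsigned, low_byte

# Value taken from the --verbose output quoted in the thread.
unsigned, low_byte = decode_returncode(-1073741502)

print(f"0x{unsigned:08X}")  # 0xC0000142
print(low_byte)             # 66
```

If that reading is right, the plugin process on the Windows host failed to start, which would be consistent with the empty stdout and with a timeout change making no difference.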

You may want to increase the time allowed for the NCPA check. You can add a switch to the command to extend the timeout and/or configure it in '/usr/local/ncpa/etc/ncpa.cfg': set it to 120 and remove the '#' at the start of the line (uncomment it).
-T TIMEOUT, --timeout=TIMEOUT
Enforced timeout, will terminate plugins after this
amount of seconds. [60]
Going forward, we want to trace the 'ncpa_listener' process and look at the --debug logging of the 'check_ncpa.py' command.
  • Find the PID of the 'ncpa_listener' process:
    ps -aux | grep -Ei 'ncpa_listener'
  • Run strace to review the exit:
    strace -p [pid]

Run the 'check_ncpa.py' command with --debug to produce more detailed results.

Thanks,
Perry