
NCPA Listener Crashes

Posted: Thu Mar 03, 2022 4:40 pm
by jameyw
I have run into an issue with the NCPA listener agent crashing. I noticed on two different Windows systems that all of the monitored services were critical. They all reported "Service check timed out after 60.01 seconds". In the event log, I find the following:

Faulting application name: ncpa_listener.exe, version: 2.4.0.0, time stamp: 0x549debde
Faulting module name: MSVCR90.dll, version: 9.0.30729.9625, time stamp: 0x5db2747f
Exception code: 0xc0000417
Fault offset: 0x00036bf4
Faulting process id: 0x2404
Faulting application start time: 0x01d82aa06cba468d
Faulting application path: C:\Program Files (x86)\Nagios\NCPA\ncpa_listener.exe
Faulting module path: C:\WINDOWS\WinSxS\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.9625_none_508ef7e4bcbbe589\MSVCR90.dll
Report Id: 95787a20-b5cd-4482-9368-d4a8d467dcc7
Faulting package full name:
Faulting package-relative application ID:

and I also find this:

Fault bucket 1290427284676651299, type 5
Event Name: BEX
Response: Not available
Cab Id: 0

Problem signature:
P1: ncpa_listener.exe
P2: 2.4.0.0
P3: 549debde
P4: MSVCR90.dll
P5: 9.0.30729.9625
P6: 5db2747f
P7: 00036bf4
P8: c0000417
P9: 00000000
P10:

Attached files:
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER5D63.tmp.dmp
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER5E20.tmp.WERInternalMetadata.xml
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER5E40.tmp.xml
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER5E3E.tmp.csv
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER5E6E.tmp.txt

These files may be available here:
\\?\C:\ProgramData\Microsoft\Windows\WER\ReportArchive\AppCrash_ncpa_listener.ex_f2e0121fbe2ec12174e2dd75e6474cd5056bf63_6f7866d5_71a75f42-8dc1-45f0-991c-f5578cdba1ab

Analysis symbol:
Rechecking for solution: 0
Report Id: 95787a20-b5cd-4482-9368-d4a8d467dcc7
Report Status: 268435456
Hashed bucket: b5c3678132b9db76f1e884c7479fa923
Cab Guid: 0

Looking at the event timestamp, I know that I plugged a read-only USB flash drive in around that time. I did this on both machines reporting errors. On one machine, the NCPA listener service is not running, and if you start it, it stops almost immediately. That machine still has the USB drive plugged in. On the other machine, the drive was removed and the NCPA Listener Service is running, but Nagios XI still reports a timeout. Restarting the service did not help. The next option is to reboot, but I can't reboot it right now.
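
For reference, the hexadecimal fields in the event log entry above can be decoded with a few lines of Python. This is only a sketch: the two constants are copied from the log, the lookup table is a hand-picked subset of NTSTATUS names from the Windows SDK header ntstatus.h, and the helper names (`describe_exception`, `filetime_to_utc`) are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the event log entry in the post above.
EXCEPTION_CODE = 0xC0000417
START_TIME_FILETIME = 0x01D82AA06CBA468D

# Hand-picked subset of NTSTATUS codes (names as in ntstatus.h).
# Only the entry seen in this thread plus one common code for contrast.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000417: "STATUS_INVALID_CRUNTIME_PARAMETER",
}


def describe_exception(code: int) -> str:
    """Return the symbolic NTSTATUS name, or a placeholder if not in the table."""
    return NTSTATUS_NAMES.get(code, f"unknown (0x{code:08x})")


def filetime_to_utc(filetime: int) -> datetime:
    """Convert a Windows FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    # 10 ticks of 100 ns make one microsecond; sub-microsecond precision is dropped.
    return epoch + timedelta(microseconds=filetime // 10)


print(describe_exception(EXCEPTION_CODE))
print(filetime_to_utc(START_TIME_FILETIME).isoformat())
```

The exception code maps to STATUS_INVALID_CRUNTIME_PARAMETER, which fits the faulting module being the C runtime (MSVCR90.dll). Note that "Faulting application start time" is when the listener process started, not when it crashed.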

Re: NCPA Listener Crashes

Posted: Thu Mar 03, 2022 5:09 pm
by gsmith
Hi

Not sure if plugging in the USB drive is a cause or a red herring. First and foremost
I would scan the USB drive for viruses/malware/trojans.

Next, since you can't reboot the Windows machines you should be able to go to the Control Panel, Programs and
Features, right-click on NCPA and uninstall it. Then in a web browser search on "NCPA download", download it, and
install it. I just ran through that exercise and was not required to restart my Windows machine.

I would do this re-install without the USB drive plugged in. Then reconfigure the host/service you are monitoring
from Nagios XI if required.

Please let me know if that gets the monitoring of the Windows machines back online. Since you can't bounce
the Windows machines, I am guessing they are in production.

Once you report back with your status I can try to replicate the problem of using a USB drive
with the NCPA service, but I want to make sure you get your monitoring back online before digging for the root cause.

Thanks

Re: NCPA Listener Crashes

Posted: Fri Mar 04, 2022 12:06 pm
by jameyw
I rebooted one of the computers this morning and the NCPA checks started working again. After rebooting, I plugged the USB drive in and confirmed that plugging the drive in is the cause. This is a read-only USB flash drive that is provided by one of our software vendors. It contains updates for their product. Initially, I plugged the drive in to get the documentation. This morning, I plugged it in just to test if it was the cause. On the next check, service checks again reported Plugin Timeout. I changed NCPA Listener logging to debug, rebooted and plugged the drive in again. After that, nothing was logged in the log file. Restarting the service had no effect. Removing the drive had no effect. I rebooted and tried a different read-only drive with the same results. The drives are clean. Neither Windows Defender nor VMware Carbon Black found any issues with the drives.

Rebooting gets monitoring back online. They are production machines, but I can reboot mine more easily than the other one if needed.

I did some searching on the Internet and found this thread on GitHub, https://github.com/NagiosEnterprises/ncpa/issues/740, which refers to NCPA crashing after a read-only disk was plugged in.

Re: NCPA Listener Crashes

Posted: Fri Mar 04, 2022 3:19 pm
by gsmith
Hi,

I was able to repeat the behavior you described. I will create a bug report on your behalf.
Since this is an "edge case" I am not sure what priority the bug will have as that is at
the discretion of our development team.


I did discover that I was able to remove the read-only USB drive and then restart the service, so a reboot
doesn't seem necessary. I am running Windows 10 Pro 21H2 19044.1526
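
The restart step of that workaround could be scripted roughly as below, to be run from an elevated prompt after the drive has been removed. This is a sketch under assumptions: the service name `ncpalistener` is assumed to be the name the listener is registered under (check services.msc on your machine), and the helper names `restart_commands` and `restart_service` are made up for illustration.

```python
import subprocess

# Assumption: the NCPA listener is registered under this Windows service name.
SERVICE_NAME = "ncpalistener"


def restart_commands(service: str = SERVICE_NAME) -> list[list[str]]:
    """Build the stop and start commands for the given Windows service."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service: str = SERVICE_NAME) -> bool:
    """Stop, then start the service; return True only if the start succeeded."""
    stop, start = restart_commands(service)
    # "net stop" returns an error if the service has already crashed,
    # so its result is ignored and the start is attempted regardless.
    subprocess.run(stop, check=False)
    return subprocess.run(start, check=False).returncode == 0


# Example (Windows only, elevated prompt):
#     restart_service()
```

Building the commands separately from running them makes it easy to print and inspect them before anything touches the service.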

Thank you!