Strange "Unknown" errors with check_wsc

Posted: Wed Jun 05, 2013 3:58 pm
by kelemvor
So we have a strange issue that I thought maybe someone had seen before.

Starting a week or so ago, we've been getting a lot of "Unknown" errors from Nagios. They all say something like:
check_wsc UNKNOWN: Problem getting service response message, code=500, message=read failed: Connection reset by peer
We apparently have Nagios set up so that our Windows Domain Controller handles the actual SNMP polling of the servers, since it can see them all, and Nagios pulls its info from the DC.
We noticed that if someone remotes into the DC and logs on, the Unknown errors all stop. Then, as soon as that person logs off and no one is actively remoted into the machine, the Unknown errors come back. It's as if remoting into the machine wakes it up from a half-asleep state so the communication starts working better.

We've had this up and running for years, and this just started happening a week or so ago, which happens to coincide with some Windows patching we did. However, I don't see anything in the patch descriptions suggesting they would have this effect on what we're doing.

Has anyone seen anything like this before, and do you have any idea what might be causing it and how to fix it? Normally it just gets the Unknown error once and the next check works, but sometimes it fails 2 or 3 times in a row, which triggers an alert email. We'd obviously rather not have people get woken up at night for no reason. ;)
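Whether one failure or three in a row ends up sending an email depends on the service's max_check_attempts setting. Below is a rough Python sketch of that retry logic, not Nagios's actual implementation. It assumes a simplified model in which any non-OK result counts as a failed attempt, and it ignores retry intervals, flap detection, and notification options. The function name `hard_alerts` and the sample history are made up for illustration.

```python
def hard_alerts(results, max_check_attempts=3):
    """Return the indexes of checks at which this simplified model would notify.

    A problem only becomes "hard" (and notifies) once max_check_attempts
    consecutive non-OK results have been seen.
    """
    alerts = []
    attempt = 0           # consecutive non-OK results so far
    hard_problem = False  # True once the current problem has already notified
    for i, status in enumerate(results):
        if status == "OK":
            attempt = 0
            hard_problem = False
            continue
        attempt += 1
        if attempt >= max_check_attempts and not hard_problem:
            hard_problem = True
            alerts.append(i)
    return alerts


# Hypothetical check history: one isolated UNKNOWN, then three in a row.
history = ["OK", "UNKNOWN", "OK", "UNKNOWN", "UNKNOWN", "UNKNOWN", "OK"]
print(hard_alerts(history, max_check_attempts=3))
print(hard_alerts(history, max_check_attempts=4))
```

In this model the isolated UNKNOWN never notifies, and the run of three does so only when max_check_attempts is 3 or lower. Raising the value would only hide the symptom, not fix the underlying resets.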

Re: Strange "Unknown" errors with check_wsc

Posted: Wed Jun 05, 2013 5:00 pm
by slansing
Strange. Do you by any chance have a text listing of the patches you pushed to these machines? What version of Windows are these systems running, and are they all on the same version?

Re: Strange "Unknown" errors with check_wsc

Posted: Thu Jun 06, 2013 9:21 am
by kelemvor
According to PowerShell, the patches that got installed were:

Security Update KB2830290 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2829530 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2840149 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2772930 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2847204 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2829361 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2813170 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2804579 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Update KB2798162 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Update KB2820331 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2820197 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2813347 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM

Windows Update History also has:
KB890830
KB2804576

The DC is running Server 2008 R2 if that matters. :)

Thanks.
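For comparing a listing like the one above against security bulletins, it can help to reduce it to bare KB numbers. Here is a minimal Python sketch, assuming the output has been saved as plain text. The helper name `extract_kbs` is made up for illustration, and only the first three lines of the PowerShell listing are reproduced, followed by the two Windows Update History entries.

```python
import re

# Sample input: first three lines of the listing from the post, plus the
# two entries reported by Windows Update History. Raw string so the
# backslash in "NT AUTHORITY\SYSTEM" is kept literally.
LISTING = r"""
Security Update KB2830290 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2829530 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
Security Update KB2840149 NT AUTHORITY\SYSTEM 5/26/2013 12:00:00 AM
KB890830
KB2804576
"""


def extract_kbs(text):
    """Return the KB identifiers found in text, in order, without duplicates."""
    seen = []
    for kb in re.findall(r"\bKB\d+\b", text):
        if kb not in seen:
            seen.append(kb)
    return seen


print(extract_kbs(LISTING))
```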

Re: Strange "Unknown" errors with check_wsc

Posted: Fri Jun 07, 2013 10:18 am
by sreinhardt
Just a quick note: I am just beginning to take a look at this. However, check_wsc does not use SNMP; it uses WMI queries instead. If any of these updates affect WMI or the .NET Framework, that is likely our culprit. I will begin looking through them.

Re: Strange "Unknown" errors with check_wsc

Posted: Fri Jun 07, 2013 10:49 am
by sreinhardt
I would suggest removing these patches in this order, testing after each one. Most likely this is an issue with the .NET XML fix rather than either of the other two. However, I also included the RDP one, since you mentioned it runs fine with a user logged in, and the AD one, since you mentioned this runs on a DC.

KB2804576
http://technet.microsoft.com/security/bulletin/MS13-040
.NET 2.0 SP2 through 4.5 XML spoofing ***

Security Update KB2813347
http://technet.microsoft.com/security/bulletin/MS13-029
RDP RCE fix

Security Update KB2772930
http://technet.microsoft.com/security/bulletin/MS13-032
Active Directory DoS issues ***

Additionally, I wanted to ask: is the user who logs in via RDP also the same one running the .NET app?
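The one-by-one procedure above can be sketched as a small loop. This is only an illustration in Python, assuming you supply your own way to uninstall a patch and to check whether the UNKNOWN results persist. The names `find_culprit`, `uninstall`, and `errors_persist` are hypothetical, and the "culprit" in the simulated run is arbitrary, not a finding.

```python
def find_culprit(suspects, uninstall, errors_persist):
    """Remove suspects in order; return the first one whose removal stops the errors.

    uninstall(kb) and errors_persist() are caller-supplied callables
    (hypothetical here). Returns None if the errors never stop.
    """
    for kb in suspects:
        uninstall(kb)
        if not errors_persist():
            return kb
    return None


# Suspect order as given in the post.
SUSPECTS = ["KB2804576", "KB2813347", "KB2772930"]

# Simulated environment for demonstration only. The pretend culprit is
# chosen arbitrarily so the loop has something to find.
pretend_culprit = "KB2813347"
removed = []

result = find_culprit(
    SUSPECTS,
    uninstall=removed.append,
    errors_persist=lambda: pretend_culprit not in removed,
)
print(result, removed)
```

Stopping at the first patch whose removal clears the errors keeps the other security fixes in place. On a real system each step would also need enough waiting time for the intermittent failures to show up or not.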

Re: Strange "Unknown" errors with check_wsc

Posted: Fri Jun 07, 2013 12:36 pm
by Nav
I'm seeing the same issue but it doesn't look like we applied any of the updates you listed. I'm running version 2.9.

Re: Strange "Unknown" errors with check_wsc

Posted: Fri Jun 07, 2013 3:35 pm
by sreinhardt
2.9 of what? Can you please create a separate thread? I do not want to hijack kelemvor's thread and leave that issue unresolved.

Re: Strange "Unknown" errors with check_wsc

Posted: Fri Jun 07, 2013 3:48 pm
by kelemvor
I've passed those patches to the guy on my team who's looking into this. As for the ID logging on, any of us can remote into the DC with our own IDs and the problem goes away. It's not related to any specific user.

Thanks

Re: Strange "Unknown" errors with check_wsc

Posted: Fri Jun 07, 2013 4:19 pm
by sreinhardt
Interesting. Let us know what they find. If I get a chance, I might do some testing over the weekend.

Re: Strange "Unknown" errors with check_wsc

Posted: Fri Jun 07, 2013 6:12 pm
by Nav
Nagios 2.9, I believe. Either way, I'll just hang back and lurk, since we appear to have the same issue.