
NSCA Sending Wrong Service Data

Posted: Sun Jun 17, 2012 9:36 am
by mikew
NSCA is delivering the correct check data, but extra, wrong information is being added to the service check output.

Set Up:
The setup is two Nagios XI servers running version 2011R3.1. The master XI server sends check results to the slave using NSCA. Both use the same password and 3DES for encryption, and both have the latest version of NSCA, 2.9.1, installed. The master has been configured to send outbound transfers to the slave, and the slave has been configured to accept inbound transfers. The slave is not performing any host or service checks itself.

The problem does not occur on every check; it comes and goes, seemingly at random. The servers have been rebooted, and Nagios and xinetd have been restarted on the host. The slave was built from a complete backup of the master, so there are no differences in the /usr/local/nagios or /usr/local/nagiosxi directories, and the databases were all restored from backups of the master. The only difference is the NSCA setup on each.

Here is additional information from the source of the page:

Code:

<div class="servicestatusdetailinfotext">Login Errors since last reboot is 0</div><div class="servicestatusdetailinfotextlong">cisco_switch	FastEthernet0/22 Bandwidth	0	OK - Current BW in: 0Mbps Out: 0Mbps</div></div>
You can see that the servicestatusdetailinfotext and servicestatusdetailinfotextlong values do not match: the first is the login check for a Windows 2008 R2 server, and the second is from a Cisco switch.
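The stray long text above has the same shape as a send_nsca input record (host, service description, return code, and output separated by tabs). For anyone wanting to scan their own long output for this, here is a minimal sketch. The function name and regex are my own and purely illustrative, and it assumes the leaked text keeps its tab separators as it does in the snippet above.

```python
import re

# A send_nsca service record is: host<TAB>service<TAB>return code<TAB>output.
# Return codes 0-3 map to OK, WARNING, CRITICAL, UNKNOWN.
NSCA_RECORD = re.compile(r"^[^\t]+\t[^\t]+\t[0-3]\t.+$")


def leaked_records(long_output):
    """Return the lines of a long-output string that look like raw send_nsca records."""
    return [line for line in long_output.splitlines() if NSCA_RECORD.match(line)]


# The long text from the page source above, with its tab separators.
sample = "cisco_switch\tFastEthernet0/22 Bandwidth\t0\tOK - Current BW in: 0Mbps Out: 0Mbps"

print(leaked_records(sample))
print(leaked_records("Login Errors since last reboot is 0"))
```

Any line this returns is a candidate for data that belongs to a different service check. Legitimate plugin output that happens to contain tabs in this pattern would be flagged too, so treat hits as a hint rather than proof.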

The image demonstrates the problem: a Windows server showing Cisco switch information. The first line is correct; the Cisco information should not be included.

Image

I did additional testing for this problem by stopping the slave and copying the retention.dat file from the master onto the slave and starting the slave. Initially, this fixed all problems but as NSCA sent information it became corrupt again.

Re: NSCA Sending Wrong Service Data

Posted: Mon Jun 18, 2012 9:47 am
by scottwilkerson
Just on Friday I noticed some unexpected behavior with NSCA 2.9.1 that we are going to have to dig into...

Re: NSCA Sending Wrong Service Data

Posted: Tue Jun 19, 2012 4:15 pm
by mikew
After a great deal of testing, here is a working solution.

NSCA version 2.9 added several new features, including support for multi-line check output with a 4000 character limit, which resulted in a larger packet size. However, this feature has an unwanted side effect: extra lines of bad information are added to the service checks, as shown in the first post. This does not seem to alter the real data; it just appends unwanted, random data from other service checks...which will make you go crazy.
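To give a feel for how much bigger the packet gets, here is a rough sketch of the size arithmetic. The field layout is an assumption based on my recollection of the data packet struct in the NSCA source (version, CRC32, timestamp, return code, 64-byte host name, 128-byte service description, then the output buffer), and the 512 and 4096 byte buffer sizes are likewise assumptions standing in for the old limit and the roughly 4000 character limit mentioned above. Check common.h in your own source tree before relying on these numbers.

```python
import struct


def packet_size(output_len):
    """Size in bytes of an NSCA data packet under the ASSUMED layout below."""
    # Assumed fields, in order:
    #   h     int16 packet version
    #   xx    2 padding bytes (alignment before the 32-bit fields)
    #   I     uint32 CRC32
    #   I     uint32 timestamp
    #   h     int16 return code
    #   64s   host name buffer
    #   128s  service description buffer
    #   Ns    plugin output buffer (N = output_len)
    #   xx    2 trailing padding bytes
    return struct.calcsize(f"!hxxIIh64s128s{output_len}sxx")


OLD_OUTPUT_LEN = 512    # assumed output buffer in 2.7.x
NEW_OUTPUT_LEN = 4096   # assumed output buffer in 2.9.x

print("old packet:", packet_size(OLD_OUTPUT_LEN))
print("new packet:", packet_size(NEW_OUTPUT_LEN))
```

Under these assumptions almost all of the growth is the output buffer itself, which is the part of the packet where the stray text shows up.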

Solution:

The solution is to compile NSCA 2.7.2 on all clients that send NSCA information to the NSCA daemon; the daemon itself can keep running 2.9.1. The NSCA 2.9.1 daemon has been configured to accept and process the smaller, correct packets. On a Nagios XI server, the send_nsca client binary in the plugins directory must be replaced. Sanity returns. :D
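After swapping the binary, it helps to push a single known test result through and look at how it lands on the receiving side. Here is a minimal sketch for doing that. The binary and config paths are assumptions for a typical source install, and the host and service names in the usage note are made up, so adjust all of them to your setup. Only the -H and -c options of send_nsca are used.

```python
import subprocess

# ASSUMED paths for a typical install; adjust to where your send_nsca lives.
SEND_NSCA = "/usr/local/nagios/libexec/send_nsca"
SEND_NSCA_CFG = "/usr/local/nagios/etc/send_nsca.cfg"


def build_record(host, service, return_code, output):
    """Build one send_nsca service record: tab-separated fields ending in a newline."""
    return f"{host}\t{service}\t{return_code}\t{output}\n"


def send_check(nsca_host, record):
    """Pipe one record into send_nsca and return the completed process."""
    return subprocess.run(
        [SEND_NSCA, "-H", nsca_host, "-c", SEND_NSCA_CFG],
        input=record,
        text=True,
        capture_output=True,
        check=False,  # inspect returncode and stdout yourself
    )
```

Usage would look something like `send_check("slave.example.com", build_record("winserver", "Login Errors", 0, "NSCA test line"))`, then checking the service detail page on the slave to confirm that only the test line appears.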

Re: NSCA Sending Wrong Service Data

Posted: Wed Jun 20, 2012 11:06 am
by scottwilkerson
Thanks for the update Mike.

We are looking into what may be causing this in the 2.9.1 send_nsca.