NSCA Sending Wrong Service Data

Post by mikew »

NSCA is delivering check results, but some service checks arrive with extra, wrong information appended to the correct data.

Setup:
There are two Nagios XI servers, both running version 2011R3.1. The master XI server sends check results to the slave using NSCA. Both use the same password and 3DES encryption. Both have the latest version of NSCA, 2.9.1, installed. The master is configured to send outbound transfers to the slave, and the slave is configured to accept inbound transfers. The slave does not perform any host or service checks itself.

The problem does not occur on every check; it comes and goes, seemingly at random. The servers have been restarted, Nagios restarted, and xinetd restarted on the host. The slave was built from a complete backup of the master, so there are no differences in the /usr/local/nagios or /usr/local/nagiosxi directories, and the databases were all restored from backups of the master. The only difference between the two is the NSCA setup.

Here is additional information from the source of the page:

<div class="servicestatusdetailinfotext">Login Errors since last reboot is 0</div><div class="servicestatusdetailinfotextlong">cisco_switch	FastEthernet0/22 Bandwidth	0	OK - Current BW in: 0Mbps Out: 0Mbps</div></div>
You can see that the servicestatusdetailinfotext and servicestatusdetailinfotextlong contents do not match: the first is a login check on a Windows 2008 R2 server and the second is a bandwidth check on a Cisco switch.
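
The stray text in the long output looks like a raw send_nsca input line, which uses the form host, service description, return code, and plugin output separated by tabs. A minimal sketch for spotting such leaked lines in a service's long output might look like the following. The function name `find_leaked_check_lines` and the sample string are illustrative only and not part of Nagios or NSCA.

```python
import re

# send_nsca reads service check results as:
#   host<TAB>service description<TAB>return code<TAB>plugin output
# A line of that shape inside another service's long output is suspicious.
NSCA_LINE = re.compile(
    r"^(?P<host>[^\t\n]+)\t(?P<service>[^\t\n]+)\t(?P<code>[0-3])\t(?P<output>.*)$"
)


def find_leaked_check_lines(long_output):
    """Return (host, service, return_code, output) for each line of
    long_output that looks like a raw send_nsca service check line."""
    leaked = []
    for line in long_output.splitlines():
        match = NSCA_LINE.match(line)
        if match:
            leaked.append(
                (
                    match["host"],
                    match["service"],
                    int(match["code"]),
                    match["output"],
                )
            )
    return leaked


# Long output taken from the page source quoted above.
sample = (
    "cisco_switch\tFastEthernet0/22 Bandwidth\t0\t"
    "OK - Current BW in: 0Mbps Out: 0Mbps"
)

for host, service, code, output in find_leaked_check_lines(sample):
    print(f"leaked result from {host} / {service} (rc={code}): {output}")
```

Running something like this against the long output of each service on the slave would show which services carry lines that belong to a different host.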

The image demonstrates the problem: a Windows server showing Cisco switch information. The first line is correct; the Cisco information should not be there.

Image

As a further test, I stopped the slave, copied the retention.dat file from the master onto the slave, and started the slave again. Initially this fixed all the problems, but as NSCA sent more information the data became corrupted again.
Mike Weber

Nagios Training/Consulting

Re: NSCA Sending Wrong Service Data

Post by scottwilkerson »

Just on Friday I noticed some unexpected behavior with NSCA 2.9.1 that we are going to have to dig into...
Former Nagios employee

Re: NSCA Sending Wrong Service Data

Post by mikew »

After a great deal of testing, here is a working solution.

NSCA version 2.9 added several new features, including support for multi-line check output with a 4000 character limit. This led to a larger packet size. However, the feature has an unwanted side effect: extra lines of bad information show up in the service checks, as described in the first post. This does not seem to alter the real data; it just appends unwanted, seemingly random data from other service checks, which will make you go crazy.
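
To put rough numbers on the packet size difference, here is a small sketch that estimates the size of the NSCA data packet for the old and new output buffers. The field sizes are assumptions based on the constants in NSCA's common.h as I recall them (64-byte host name, 128-byte service description, 512-byte plugin output in 2.7.x, 4096-byte plugin output in 2.9.x), and the padding bytes assume typical 4-byte struct alignment. The helper name `packet_size` is made up for this example.

```python
import struct

# Assumed field sizes (see the note above; check common.h in your source tree).
HOST_LEN = 64            # host name buffer
SVC_LEN = 128            # service description buffer
OLD_OUTPUT_LEN = 512     # plugin output buffer, NSCA 2.7.x
NEW_OUTPUT_LEN = 4096    # plugin output buffer, NSCA 2.9.x


def packet_size(output_len):
    """Estimate the on-the-wire size of one NSCA data packet.

    Assumed layout: int16 packet version, 2 bytes of padding, uint32 CRC32,
    uint32 timestamp, int16 return code, host name, service description,
    plugin output, and 2 bytes of trailing padding.
    """
    fmt = f"!hxxIIh{HOST_LEN}s{SVC_LEN}s{output_len}sxx"
    return struct.calcsize(fmt)


old_size = packet_size(OLD_OUTPUT_LEN)
new_size = packet_size(NEW_OUTPUT_LEN)
print(f"2.7.x packet: {old_size} bytes, 2.9.x packet: {new_size} bytes")
```

Under these assumptions the old packet comes out at 720 bytes and the new one at 4304 bytes, so most of a 2.9.x packet carrying a one-line result is plugin output buffer space beyond the actual text.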

Solution:

The solution is to compile NSCA 2.7.2 on all clients that send NSCA information to the NSCA daemon; the daemon itself can stay on 2.9.1. The NSCA 2.9.1 daemon has been configured to accept the smaller, correct packets. On a Nagios XI server, the send_nsca client binary in the plugins directory must be replaced with the 2.7.2 build. Sanity returns. :D

Re: NSCA Sending Wrong Service Data

Post by scottwilkerson »

Thanks for the update, Mike.

We are looking into what may be causing this in the 2.9.1 send_nsca.