NCPA 2.4.0 agent issue

Posted: Fri Jan 28, 2022 3:00 am
by sahilrana
We updated the NCPA agent to the latest version, 2.4.0, last week and have found an issue: the agent is not fetching RAM data from the VMs.
All other data is being fetched correctly. It's only the RAM data from the VMs not being fetched.
Is this a known issue?

We installed the older NCPA agent 2.3.1 and it started fetching the RAM data again, so there seems to be an issue with the 2.4.0 agent.

Re: NCPA 2.4.0 agent issue

Posted: Fri Jan 28, 2022 4:22 pm
by kfanselow
Hi sahilrana,

Are the checks failing in their entirety, or are you seeing a performance graph issue similar to this one?

https://github.com/NagiosEnterprises/ncpa/issues/845

Also could you provide the output from a "Run Check Command" with the token redacted as in the example below:

Navigate via Configure (top) -> Core Config Manager -> Services (left).

Find the "Memory Usage" service for one of the hosts with 2.4.0 installed and select Edit (wrench icon on the right side). Then click the Run Check Command button and its follow-up prompt. The output should look like this (note that the token and IP have been redacted from the string):

Code: Select all

[nagios@kf-centos-79 ~]$ /usr/local/nagios/libexec/check_ncpa.py -H REDACTED -t 'REDACTED' -P 5693 -M memory/virtual -u 'Gi' -w '50' -c '80'
CRITICAL: Memory usage was 88.90 % (Available: 0.40 GiB, Total: 3.65 GiB, Free: 0.13 GiB, Used: 2.85 GiB) | 'available'=0.40GiB;;; 'total'=3.65GiB;;; 'free'=0.13GiB;;; 'used'=2.85GiB;;;
Thanks and Best Regards,
Keith

Re: NCPA 2.4.0 agent issue

Posted: Tue Feb 01, 2022 1:56 am
by sahilrana
Hi Keith,

Yes, it's the same issue as described in the link. The RAM data is missing from the performance graphs.

Here is the output.

[[email protected] ~]$ /usr/local/nagios/libexec/check_ncpa.py -H x.x.x.x -T 119 -t 'token' -P 5693 -M memory/virtual -u 'Gi' -w '80' -c '90'
OK: Memory usage was 26.50 % (Available: 23.53 GiB, Total: 32.00 GiB, Free: 23.53 GiB, Used: 8.47 GiB) | 'available'=23.53GiB;;; 'total'=32.00GiB;;; 'percent'=26.50%;80;90; 'free'=23.53GiB;;; 'used'=8.47GiB;;;

Re: NCPA 2.4.0 agent issue

Posted: Tue Feb 01, 2022 4:06 pm
by kfanselow
Hi sahilrana,

Thanks for confirming. I was able to replicate the difference in output, and we suspect the issue is a mismatch between the number of values the check now returns and the number of data sources in the existing round-robin database (RRD). Could you confirm which version of Nagios XI you are using?


Thanks and Best Regards,
Keith

Re: NCPA 2.4.0 agent issue

Posted: Wed Feb 02, 2022 1:43 am
by sahilrana
Hi Keith,

The Nagios XI version is 5.8.7. I think it's the latest one.

Re: NCPA 2.4.0 agent issue

Posted: Wed Feb 02, 2022 6:04 pm
by kfanselow
Hi sahilrana,

I'm filing a bug report on the issue. After discussing it with our developers, there are two options in the meantime:

1) Stay at NCPA version 2.3.1

2) Remove the .rrd and .xml files for the memory usage graphs. The graph should then start over with the updated number of data sources.

If you would like to use the second option, you can find the .rrd and .xml files in the host subdirectory of perfdata on your XI server (see below; change HOSTNAME to the hostname or IP of the remote system):

Code: Select all

/usr/local/nagios/share/perfdata/HOSTNAME
For example:

Code: Select all

 
rm /usr/local/nagios/share/perfdata/10.1.2.3/Memory_Usage.rrd
rm /usr/local/nagios/share/perfdata/10.1.2.3/Memory_Usage.xml 


Hope this is useful.

Thanks and Best Regards,
Keith

Re: NCPA 2.4.0 agent issue

Posted: Mon Feb 07, 2022 10:37 pm
by sahilrana
Hi Keith,

I tried the second option and I am getting an error that the files do not exist. I am assuming the hostname or IP address in the command is that of the remote server where the agent is installed.
The server I tried has the 2.4.0 agent installed. Please see the attached error.

Re: NCPA 2.4.0 agent issue

Posted: Tue Feb 08, 2022 8:08 pm
by ssax
The directory name is based on the host name. Is the host name in XI for this server the IP address, or something else?
- It's likely something else, so you'd need to use that name in place of THEHOSTNAME in the commands below.

Code: Select all

rm /usr/local/nagios/share/perfdata/THEHOSTNAME/Memory_Usage.rrd
rm /usr/local/nagios/share/perfdata/THEHOSTNAME/Memory_Usage.xml 
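
If the directory name is unclear, one way to find it is to list which host directories actually contain the Memory_Usage files. Below is a minimal Python sketch, not an official Nagios tool. The function name `find_memory_usage_files` is made up for illustration, the perfdata root is the path quoted in this thread, and the file name assumes the service is called "Memory Usage". It only prints paths and deletes nothing.

```python
from pathlib import Path

# Perfdata root as quoted in this thread; adjust if your install differs.
PERFDATA_ROOT = Path("/usr/local/nagios/share/perfdata")


def find_memory_usage_files(root, service="Memory_Usage"):
    """Return the .rrd/.xml files for `service` found under each host directory.

    `service` is the file stem used in the posts above (an assumption for
    any service that is named differently in your configuration).
    """
    root = Path(root)
    if not root.is_dir():
        return []
    found = []
    # Each immediate subdirectory of the perfdata root corresponds to one host.
    for host_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for suffix in (".rrd", ".xml"):
            candidate = host_dir / (service + suffix)
            if candidate.is_file():
                found.append(candidate)
    return found


if __name__ == "__main__":
    # Dry run: print the matching paths so they can be reviewed first.
    for path in find_memory_usage_files(PERFDATA_ROOT):
        print(path)
```

The parent directory of each printed path is the name to use in place of THEHOSTNAME. Run it as a user that can read the perfdata directory.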

Re: NCPA 2.4.0 agent issue

Posted: Wed Feb 09, 2022 12:34 am
by sahilrana
I tried with both the IP address and the hostname.
In any case, this is not a feasible alternative, since the command would have to be run for every host, right?

For now, I am rolling back to the previous version (2.3.1).
Is there an expected time frame for resolving this bug in the 2.4.0 agent?

Re: NCPA 2.4.0 agent issue

Posted: Wed Feb 09, 2022 5:51 pm
by ssax
We're unable to give an ETA at this time, but development is aware of the issue.

Development is also notified of updates to the bug report:

https://github.com/NagiosEnterprises/ncpa/issues/845

This would technically fix it, but don't run it:

https://support.nagios.com/kb/article/n ... g-149.html

Because the ordering of the data sources is different, the resulting data would not be correct:

Code: Select all

2.3.x: | 'available'=0.89GiB;;; 'total'=1.80GiB;;;  'free'=0.17GiB;;; 'used'=0.65GiB;;;
2.4.0: | 'available'=0.89GiB;;; 'total'=1.80GiB;;; 'percent'=50.50%;80;90; 'free'=0.17GiB;;; 'used'=0.65GiB;;;
What it would do is append a data source to the end of the RRD, and from then on all values would be shifted over: the new percent values would land in the slot holding the old free data, which would mess up the data.
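
To make the shift concrete, here is a small Python sketch that compares the two perfdata strings above position by position. It assumes the RRD data sources are filled purely by position, as described in this post. The helper names `perf_labels` and `slot_mapping` are made up for illustration and are not part of NCPA or Nagios XI.

```python
import re


def perf_labels(plugin_output):
    """Return the perfdata labels of a plugin output line, in order."""
    # Perfdata is everything after the first '|'.
    _, _, perfdata = plugin_output.partition("|")
    # In these outputs every label is single-quoted and followed by '='.
    return re.findall(r"'([^']+)'=", perfdata)


def slot_mapping(old_labels, new_labels):
    """Pair each positional slot's old label with the label now written to it."""
    rows = []
    for i, label in enumerate(new_labels):
        previous = old_labels[i] if i < len(old_labels) else "(appended slot)"
        rows.append((i + 1, previous, label))
    return rows


old = "| 'available'=0.89GiB;;; 'total'=1.80GiB;;;  'free'=0.17GiB;;; 'used'=0.65GiB;;;"
new = ("| 'available'=0.89GiB;;; 'total'=1.80GiB;;; 'percent'=50.50%;80;90; "
       "'free'=0.17GiB;;; 'used'=0.65GiB;;;")

old_labels = perf_labels(old)
new_labels = perf_labels(new)

for slot, previous, label in slot_mapping(old_labels, new_labels):
    flag = "" if previous == label else "  <-- mismatch"
    print(f"slot {slot}: was {previous!r}, now receives {label!r}{flag}")
```

Under that assumption, slots 1 and 2 still line up, slot 3 (history of free) starts receiving percent, slot 4 (history of used) starts receiving free, and only the appended slot 5 receives used.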

Usually, when there is a problem, the RRD section of the .xml file will have an error in it. For comparison, this is what the section looks like after a successful update:

Code: Select all

  <RRD>
    <RC>0</RC>
    <TXT>successful updated</TXT>
  </RRD>
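
If you want to check that section across several files, something like the following Python sketch could read it. It assumes only what is shown above: an `<RRD>` element containing `<RC>` and `<TXT>`. Treating a nonzero RC as a failure is my assumption, and `rrd_update_status` is a made-up helper name, not part of Nagios XI.

```python
import xml.etree.ElementTree as ET


def rrd_update_status(xml_text):
    """Return (return_code, message) from the first <RRD> element, or None."""
    root = ET.fromstring(xml_text)
    # iter() also yields the root itself, so this works whether <RRD> is the
    # root element or nested somewhere inside a larger document.
    rrd = next(root.iter("RRD"), None)
    if rrd is None:
        return None
    rc = int(rrd.findtext("RC", default="-1").strip())
    txt = rrd.findtext("TXT", default="").strip()
    return rc, txt


sample = """
<RRD>
  <RC>0</RC>
  <TXT>successful updated</TXT>
</RRD>
"""

status = rrd_update_status(sample)
print(status)  # (0, 'successful updated')
if status is not None and status[0] != 0:
    # Assumption: a nonzero return code means the last RRD update failed.
    print("RRD update reported an error:", status[1])
```

To use it on a real file, read the file's contents and pass them in, for example `rrd_update_status(open(path).read())`.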