New NCPA aggregate=avg function killing graphs

Posted: Thu Apr 23, 2015 1:22 pm
by krobertson71
We are no longer getting graph data on CPU checks after we added -q aggregate=avg to them.
Selection_237.png
Is there a way to correct this behaviour? Maybe delete the perf data? I want to get some suggestions on this before I try anything. I have already deleted the CPU check and recreated it with -q aggregate=avg built in from the start, to no avail.

Re: New NCPA aggregate=avg function killing graphs

Posted: Thu Apr 23, 2015 5:12 pm
by tgriep
Let's see what the perf data is returning now. Go to that service, click on the Advanced tab, take a screen capture, and post it here.

Re: New NCPA aggregate=avg function killing graphs

Posted: Thu Apr 23, 2015 7:14 pm
by krobertson71
Here it is. I think the issue is with the performance data being returned. This is a 12-core server. Here is the check command:

check_xi_ncpa_agent!-t 'mytoken' -P 5693 -M cpu/percent -w 90 -c 97 -q 'aggregate=avg'
Selection_238.png
The aggregation works fine, but since we applied the change to the existing checks, the graph shows no data. We also tried completely removing the check and recreating it with -q aggregate=avg from the beginning. It worked for one check, then stopped putting anything into the graph.

Again, the check itself does work, but not being able to graph the data kills reporting and capacity planning.

Re: New NCPA aggregate=avg function killing graphs

Posted: Thu Apr 23, 2015 7:20 pm
by Box293
I think the perfdata logs might shed some light on this.

Follow the steps here to increase logging:
http://support.nagios.com/wiki/index.ph ... h_Problems
Then submit another check result and wait about 15 minutes.

After that attach the log files here.

Re: New NCPA aggregate=avg function killing graphs

Posted: Fri Apr 24, 2015 8:49 am
by krobertson71
Log attached.

I did not see any errors per se, but I did see one host that has two cores and now shows up like this in the graph, with no data. Notice the labels at the bottom of the graph.
Selection_239.png

Re: New NCPA aggregate=avg function killing graphs

Posted: Fri Apr 24, 2015 10:28 am
by lmiltchev
What is the version of the NCPA agent that you are currently running? Did you upgrade to 1.8.1?

Re: New NCPA aggregate=avg function killing graphs

Posted: Fri Apr 24, 2015 10:49 am
by krobertson71
Sorry, yes.

Re: New NCPA aggregate=avg function killing graphs

Posted: Fri Apr 24, 2015 11:14 am
by lmiltchev
I was not able to recreate the issue. I am using the same command as you:

check_xi_ncpa_agent!-t 'mytoken' -P 5693 -M cpu/percent -w 90 -c 97 -q 'aggregate=avg'
Here's a test check from the CLI:

./check_ncpa.py -H x.x.x.x -t mytoken -P 5693 -M cpu/percent -q 'aggregate=avg' -w 20 -c 40
OK: percent was 2% | 'percent_0'=2%;20;40;
and here is the graph:
example01.PNG
I would recommend deleting the RRD and the XML files, then waiting for 15-20 min. The RRD and the XML files will be recreated and the graph *should* show up.
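
As a rough illustration of the cleanup suggested above, here is a minimal Python sketch that locates a service's RRD and XML files and optionally deletes them. The directory `/usr/local/nagios/share/perfdata` and the naming scheme (service description with spaces replaced by underscores) are assumptions about a default install, and `perfdata_files` and `remove_perfdata` are hypothetical helper names. Verify the actual paths on your server before deleting anything.

```python
from pathlib import Path

# Assumption: default perfdata location; adjust for your install.
PERFDATA_ROOT = Path("/usr/local/nagios/share/perfdata")


def perfdata_files(host, service, root=PERFDATA_ROOT):
    """Return the existing .rrd and .xml files for one service of one host."""
    # Assumption: files are named after the service description,
    # with spaces replaced by underscores.
    stem = service.replace(" ", "_")
    candidates = [root / host / f"{stem}{ext}" for ext in (".rrd", ".xml")]
    return [p for p in candidates if p.is_file()]


def remove_perfdata(host, service, root=PERFDATA_ROOT, dry_run=True):
    """Delete the service's RRD and XML files, or only list them if dry_run."""
    files = perfdata_files(host, service, root)
    for path in files:
        if not dry_run:
            path.unlink()
    return files
```

Calling `remove_perfdata("myhost", "CPU Usage")` with the default `dry_run=True` only reports what would be removed, which is a safer first step than deleting straight away.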

Re: New NCPA aggregate=avg function killing graphs

Posted: Fri Apr 24, 2015 11:50 am
by krobertson71
Are you testing that on a multi-CPU box? That graph looks to be for a single CPU.

Re: New NCPA aggregate=avg function killing graphs

Posted: Fri Apr 24, 2015 12:07 pm
by lmiltchev

lscpu
Architecture:          i686
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             4
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Stepping:              2
CPU MHz:               2793.000
BogoMIPS:              5586.00
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
What I see in the API:

URL=https://192.168.x.x:5693/api/cpu/percent

{
  "value": {
    "percent": [
      [
        4.0, 
        1.0, 
        4.0, 
        2.0
      ], 
      "%"
    ]
  }
}

URL=https://192.168.x.x:5693/api/cpu/percent?aggregate=avg

{
  "value": {
    "percent": [
      [
        0.75
      ], 
      "%"
    ]
  }
}
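
To make the difference between the two responses concrete, here is a small Python sketch that turns each JSON payload into a perfdata string shaped like the `'percent_0'=2%;20;40;` output from the CLI test above. It is only an illustration: `perfdata_from_api` and `average` are hypothetical helpers, not the plugin's actual code, and the assumption that `aggregate=avg` is a plain arithmetic mean is not confirmed in this thread. The sketch shows that the number of perfdata labels would drop from one per core to a single one, which may be relevant to an RRD created with the old layout, though the thread does not establish that as the cause.

```python
import json


def perfdata_from_api(payload, warn, crit):
    """Build a perfdata string from a cpu/percent style JSON response."""
    values, unit = json.loads(payload)["value"]["percent"]
    # One label per returned value, mirroring the 'percent_0' label
    # seen in the CLI output earlier in the thread.
    return " ".join(
        f"'percent_{i}'={v:g}{unit};{warn};{crit};" for i, v in enumerate(values)
    )


def average(values):
    """Arithmetic mean (assumed to be what aggregate=avg computes)."""
    return sum(values) / len(values)


# Payloads copied from the two API responses above. They were sampled at
# different moments, so the second is not the mean of the first.
per_core = '{"value": {"percent": [[4.0, 1.0, 4.0, 2.0], "%"]}}'
aggregated = '{"value": {"percent": [[0.75], "%"]}}'

print(perfdata_from_api(per_core, 20, 40))    # labels percent_0 .. percent_3
print(perfdata_from_api(aggregated, 20, 40))  # a single label, percent_0
print(average([4.0, 1.0, 4.0, 2.0]))          # mean of the per-core sample
```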