No Perf Graph Generated

Posted: Tue Jan 16, 2018 9:46 am
by lzieziula
Hello,

I currently have a script that generates performance data from viz cards. The script queries and obtains the data, but the data is not being retained to generate the performance graph. I have a very similar script that does generate the graph. Below is the source code for the script that fails to generate a performance graph:

==========================================
import subprocess

# Build the nvidia-smi command: one CSV row per GPU with bus ID and memory used
pass_arg=[]
pass_arg.append("nvidia-smi")
pass_arg.append("--query-gpu=pci.bus_id,memory.used")
pass_arg.append("--format=csv")
output = subprocess.check_output(pass_arg).splitlines()
perfdata = []
data = []
data.append("4096")
perfdata.append("Total=4096")

# output[0] is the CSV header row, so the GPU rows start at index 1
for i in range(1,8):
    string = output[i].split(',')
    busid=string[0]
    val=string[1].split()[0]
    data.append(val)
    #perfdata.append(busid + "=" + val)
    perfdata.append("BUSID" + str(i) + "=" + val)

dataStr = ",".join(data)
perfdataStr = ",".join(perfdata)

print "OK-", dataStr, "|", perfdataStr
==========================================
Any thoughts? I think it has something to do with the format of the data and what Nagios expects it to be in order to generate the graph.
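
For readers following along, here is a minimal sketch of the parsing step in the script above, written so it can be run without a GPU. The function name `parse_gpu_memory` and the bus IDs in the sample text are hypothetical, and the sample assumes the usual `nvidia-smi --format=csv` layout of one header row followed by one row per GPU.

```python
import csv


def parse_gpu_memory(csv_text):
    """Return a list of (bus_id, used_mib) tuples from nvidia-smi CSV text."""
    rows = csv.reader(csv_text.splitlines(), skipinitialspace=True)
    next(rows, None)  # skip the header row
    result = []
    for row in rows:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        bus_id = row[0].strip()
        used_mib = int(row[1].split()[0])  # "166 MiB" -> 166
        result.append((bus_id, used_mib))
    return result


if __name__ == "__main__":
    # Hypothetical sample text; the bus IDs are made up for illustration.
    sample = (
        "pci.bus_id, memory.used [MiB]\n"
        "00000000:04:00.0, 166 MiB\n"
        "00000000:05:00.0, 87 MiB\n"
    )
    for bus_id, used_mib in parse_gpu_memory(sample):
        print(bus_id, used_mib)
```

Iterating over the parsed rows instead of a fixed `range(1,8)` avoids an index error if the number of cards changes.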

Re: No Perf Graph Generated

Posted: Tue Jan 16, 2018 12:08 pm
by mcapra
Can you show the output of this script?

It might also be useful to know how Nagios XI is executing the script (NRPE, NCPA, SSH, etc) and, if an agent is being used, what version of the agent.

Re: No Perf Graph Generated

Posted: Tue Jan 16, 2018 12:28 pm
by npolovenko
I agree with what @mcapra said. Please run the script in the command line and show us the output.

Re: No Perf Graph Generated

Posted: Tue Jan 16, 2018 2:07 pm
by lzieziula
The output from the command line is as follows:

==========================================
OK-4096,166,87,87,87,87,107,87 | Total=4096,BUSID1=166,BUSID2=87,BUSID3=87,BUSID4=87,BUSID5=87,BUSID6=107,BUSID7=87
==========================================

As you can see, it lists each of the BUSIDs with its corresponding memory usage. I just need this data to generate a perf graph in Nagios.

Re: No Perf Graph Generated

Posted: Tue Jan 16, 2018 3:17 pm
by npolovenko
@lzieziula, I think the values after | have to be separated by spaces:

OK-4096,166,87,87,87,87,107,87 | Total=4096, BUSID1=166, BUSID2=87, BUSID3=87, BUSID4=87, BUSID5=87, BUSID6=107, BUSID7=87

So you'd need to change your plugin. Also, after you modify the plugin, make sure to delete the old RRD and XML files from /usr/local/nagios/share/perfdata/. After that, let Nagios run this plugin for 15 minutes before checking the graph. This will allow it to generate some data in the new RRDs.
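
As a rough sketch of the cleanup step, the snippet below only lists candidate files so they can be reviewed before anything is removed. It assumes the RRD and XML files sit somewhere under the perfdata directory and are named after the service; the function name `find_perfdata_files` and the service name `GPU_Memory` are hypothetical.

```python
import os


def find_perfdata_files(perfdata_dir, service_name):
    """Return sorted paths of .rrd/.xml files whose base name is service_name."""
    matches = []
    for root, _dirs, files in os.walk(perfdata_dir):
        for name in files:
            stem, ext = os.path.splitext(name)
            if stem == service_name and ext in (".rrd", ".xml"):
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    # Dry run only: print the candidates. "GPU_Memory" is a hypothetical
    # service name; delete reviewed paths yourself, e.g. with os.remove(path).
    for path in find_perfdata_files("/usr/local/nagios/share/perfdata", "GPU_Memory"):
        print(path)
```

Keeping this as a dry run is deliberate, since deleting the wrong RRD discards that service's graph history.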

Here's the developers reference for making Performance Graphs in XI: https://nagios-plugins.org/doc/guidelines.html Section is called "Performance data".
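
For reference, that section of the guidelines describes performance data as a space-separated list of label=value[UOM] pairs after the `|`, with labels single-quoted when they contain spaces. A minimal sketch of a formatter along those lines follows; the function names `format_perfdata` and `format_plugin_output` are hypothetical, and the optional warn/crit/min/max fields are left out.

```python
def format_perfdata(metrics, uom=""):
    """Join (label, value) pairs into a space-separated perfdata string."""
    parts = []
    for label, value in metrics:
        # Labels containing spaces must be wrapped in single quotes
        if " " in label:
            label = "'%s'" % label
        parts.append("%s=%s%s" % (label, value, uom))
    return " ".join(parts)


def format_plugin_output(status_text, metrics, uom=""):
    """Build the full plugin line: human-readable text, a pipe, then perfdata."""
    return "%s | %s" % (status_text, format_perfdata(metrics, uom))


if __name__ == "__main__":
    metrics = [("Total", 4096), ("BUSID1", 166), ("BUSID2", 87)]
    print(format_plugin_output("OK - GPU memory used", metrics, uom="MB"))
```

Building the perfdata string in one place makes it easier to keep the separator and units consistent across similar scripts.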

Re: No Perf Graph Generated

Posted: Wed Jan 17, 2018 2:47 pm
by lzieziula
So, I also have a script running with no spaces between the BUSID entries (the change you mentioned might fix this one). It generates a graph and is virtually identical to this script. The only difference is what is queried from the graphics card (one queries memory, the other queries GPU). Nagios is executing both scripts via NRPE.

Here is how the performance data is being printed:
=====================================
perfdata.append("BUSID" + str(i) + "=" + val + "MB")
=====================================

^ Does that format need to be altered? That's the only thing I can think of. As is, it prints BUSID plus str(i), which is an ID from #1 to #8, followed by the value of the query and MB as the unit of measurement.

Re: No Perf Graph Generated

Posted: Wed Jan 17, 2018 3:53 pm
by npolovenko
@lzieziula, Did the first graph start working for you?
If I'm understanding correctly the new script output looks similar to this?

OK-4096,166,87,87,87,87,107,87 | Total=4096MB, BUSID1=166MB, BUSID2=87MB, BUSID3=87MB

If yes, then it should work fine. But please let Nagios run this check for at least 10 to 15 minutes to start populating the graph.

There's a useful feature, by the way: you can choose which values to display on the graph. For example, if you don't want to see the 'Total' value, you can click on its legend name and it will disappear from the graph. To get it back, click on the legend name again.

Re: No Perf Graph Generated

Posted: Thu Jan 18, 2018 10:04 am
by lzieziula
Can you elaborate on how/why you need to delete the RRD and XML files? Is there any way to do that in the Nagios UI, or only on the Nagios admin server?

Re: No Perf Graph Generated

Posted: Thu Jan 18, 2018 11:09 am
by lzieziula
@npolovenko Never mind, I got it sorted out. Now I am having a different issue: I get a "no handler for that command" message, which shows in Nagios as an "unknown" error. Any ideas for a resolution?

Re: No Perf Graph Generated

Posted: Thu Jan 18, 2018 1:05 pm
by npolovenko
@lzieziula, I'd need to know how you're running the script. If you're using NSClient we do have this troubleshooting article:
https://support.nagios.com/kb/article/n ... d-627.html
If that doesn't help I'd like to see the NSClient.ini file and nsclient.log file. Do all the other checks work ok with your current configuration?