NCPA Memory Performance Data is Broken
Posted: Tue Feb 22, 2022 4:01 pm
by nehpets
I have some Ubuntu servers that show "no data to display" for NCPA memory usage in recent performance graph history. The old data is still there if the time range is bumped out far enough. The Memory_Usage.rrd file has not been updated since the date the graph stops.
I briefly moved the rrd file to /tmp/ and restarted the npcd service. This produced a new graph (with a new "used" parameter on it), but it only displayed current data; the historical data appears to have been lost.
How can I get these graphs back without losing all the data?
I suspect this is tied to the update from NCPA Version: 2.3.1-1 to NCPA Version: 2.4.0-1 as Ubuntu servers still running NCPA 2.3.1-1 show current and historical performance data without issue.
Re: NCPA Memory Performance Data is Broken
Posted: Wed Feb 23, 2022 1:54 pm
by pbroste
Hello @nehpets,
Thanks for reaching out. It sounds like you have done most of the legwork in troubleshooting, and the performance data is plotting again.
This morning I spun up a test VM and tested an rrd data merge. I used the script found here:
https://gist.github.com/yalinhuang/9a8d35421c1feb148bd0
Went ahead and made a backup just in case something went wonky:
Code:
tar -cvzf /tmp/perfdata.tar.gz /usr/local/nagios/share/perfdata/yourhostnamehere/
For my example I used the 'memory-usage' perfdata:
Code:
/usr/bin/python2.7 simple-rrd-merge.py memory-usage_current.rrd memory-usage_merging.rrd | rrdtool restore /dev/stdin output.rrd
* I added '_current' and '_merging' to the filenames only as an example.
Graph before the merge:
graph_before.png
Graph after the merge:
graph_after_merge.png
Regards,
Perry
Re: NCPA Memory Performance Data is Broken
Posted: Wed Feb 23, 2022 5:06 pm
by nehpets
Thanks for the reply, but the solution did not work for me.
Running the python script results in an error:
Code:
ERROR: line 5962: expected </row> element but found <v>
Traceback (most recent call last):
  File "/home/nagios/simple-rrd-merge.py", line 50, in <module>
    main()
  File "/home/nagios/simple-rrd-merge.py", line 46, in main
    print line.rstrip()
IOError: [Errno 32] Broken pipe
python --version shows 2.7.5.
I broke this into two stages, outputting the python script to merged.xml and then calling rrdtool restore merged.xml merged.rrd separately, which resulted in the same error from the rrdtool restore operation.
Code:
ERROR: line 5962: expected </row> element but found <v>
I checked the xml file at the specified line and do not see any syntax mismatch as indicated.
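One thing that can be checked mechanically here: as far as I understand it, rrdtool restore expects every <row> to hold exactly one <v> per data source declared at the top of the dump, so a row with more <v> entries than there are <ds> definitions would plausibly trigger this exact message even though the XML itself is well formed. Below is a rough Python 3 sketch of that check. The function name find_row_width_mismatches and the sample XML are made up for illustration; this is not part of rrdtool or of the merge script.

```python
import xml.etree.ElementTree as ET


def find_row_width_mismatches(xml_text):
    """Compare the number of <v> values in each row with the number of
    top-level <ds> definitions.

    Returns (ds_count, mismatches), where mismatches is a list of
    (rra_index, row_index, v_count) tuples for rows that do not match.
    """
    root = ET.fromstring(xml_text)
    # Only direct children of <rrd> count; <cdp_prep> has its own <ds> tags.
    ds_count = len(root.findall("ds"))
    mismatches = []
    for rra_index, rra in enumerate(root.findall("rra")):
        for row_index, row in enumerate(rra.findall("database/row")):
            v_count = len(row.findall("v"))
            if v_count != ds_count:
                mismatches.append((rra_index, row_index, v_count))
    return ds_count, mismatches


# Made-up, heavily trimmed dump: two data sources, one row with three values.
SAMPLE = """<rrd>
  <ds><name> 1 </name></ds>
  <ds><name> 2 </name></ds>
  <rra>
    <database>
      <row><v>1.0</v><v>2.0</v></row>
      <row><v>1.0</v><v>2.0</v><v>3.0</v></row>
    </database>
  </rra>
</rrd>"""

if __name__ == "__main__":
    ds_count, bad_rows = find_row_width_mismatches(SAMPLE)
    print("data sources declared:", ds_count)
    for rra_index, row_index, v_count in bad_rows:
        print("rra %d row %d has %d values" % (rra_index, row_index, v_count))
```

To run it against a real file, pass the contents of merged.xml instead of SAMPLE. If it reports mismatches, the two input RRDs most likely have a different number of data sources.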
Again, I suspect this is all tied to the update from NCPA Version: 2.3.1-1 to NCPA Version: 2.4.0-1 since Ubuntu servers still running NCPA 2.3.1-1 show current and historical performance data without issue. After moving the memory rrd file on one of the NCPA 2.4.0-1 servers, the current data graph has an extra parameter shown.
NCPA 2.3.1-1:
server-with-NCPA-2.3.1-1.png
NCPA 2.4.0-1:
server-with-NCPA-2.4.0-1.png
Re: NCPA Memory Performance Data is Broken
Posted: Thu Feb 24, 2022 2:03 pm
by pbroste
Hello @nehpets,
Thanks for following up. I want to take a look at the difference between the .rrd files for NCPA Version: 2.3.1-1 and NCPA Version: 2.4.0-1. Either send a copy of each, or (what I would do) dump them to XML and compare the two to determine which extra element was added.
cd /usr/local/nagios/share/perfdata/
rrdtool dump old_2.3.1-1.rrd > old_2.3.1-1.xml
rrdtool dump new_2.4.0-1.rrd > new_2.4.0-1.xml
Then compare: diff -w old_2.3.1-1.xml new_2.4.0-1.xml
To merge the old data into the new file, we will need to add the missing element to the old RRD so that the two structures match and the merge can succeed.
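As an alternative to reading the full diff, the data source names in the two dumps can be compared directly. This is only a sketch in Python 3: the helper names ds_names and missing_ds are hypothetical, the sample XML is made up, and the numeric naming ("1", "2", ...) is an assumption about how these RRDs label their data sources.

```python
import xml.etree.ElementTree as ET


def ds_names(xml_text):
    """Return the data source names declared at the top level of a dump."""
    root = ET.fromstring(xml_text)
    # rrdtool pads names with spaces in the dump, so strip them.
    return [ds.findtext("name", default="").strip() for ds in root.findall("ds")]


def missing_ds(old_xml, new_xml):
    """Return names present in the new dump but absent from the old one."""
    old = set(ds_names(old_xml))
    return [name for name in ds_names(new_xml) if name not in old]


# Made-up, heavily trimmed dumps; real ones carry many more fields.
OLD_SAMPLE = "<rrd><ds><name> 1 </name></ds><ds><name> 2 </name></ds></rrd>"
NEW_SAMPLE = (
    "<rrd><ds><name> 1 </name></ds><ds><name> 2 </name></ds>"
    "<ds><name> 3 </name></ds></rrd>"
)

if __name__ == "__main__":
    print("old:", ds_names(OLD_SAMPLE))
    print("new:", ds_names(NEW_SAMPLE))
    print("missing from old:", missing_ds(OLD_SAMPLE, NEW_SAMPLE))
```

With real files, read the two XML dumps from disk and pass their contents in place of the samples.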
Thanks,
Perry
Re: NCPA Memory Performance Data is Broken
Posted: Thu Feb 24, 2022 2:48 pm
by nehpets
I attached the two files from one of the servers here. Hopefully this can be an automated process to apply to all of our Ubuntu servers.
Is stability between versions something that will be considered for future NCPA updates? I would hope that such a routine update wouldn't break data collection as it did (not to mention the repair that is needed to get the historical data back after resuming collection).
Re: NCPA Memory Performance Data is Broken
Posted: Fri Feb 25, 2022 3:22 pm
by pbroste
Hello @nehpets,
Please follow this KB article. It should bulk-update any RRD files that need datasources added, which is what you need in this case:
https://support.nagios.com/kb/article/n ... g-149.html
Thanks,
Perry
Re: NCPA Memory Performance Data is Broken
Posted: Thu Mar 03, 2022 9:45 am
by nehpets
It took significantly more work than that article describes, since I had already worked around the issue to get collection running again. The old RRD files had to be copied out to another directory, along with the XML file, so that the parameter fix script could be run on them. Only then could the merge process be used and the resulting file put back in the appropriate perfdata directory.
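For anyone following along, the "parameter fix" step boils down to padding the old dump with one extra data source so its layout matches the new file. The Python 3 sketch below shows the idea on an XML dump. It is not the KB article's script: the function name add_data_source and the sample XML are invented for illustration, the cloned scratch values are left untouched rather than reset, and whether rrdtool restore accepts the result for a given file is an assumption that should be tested on a copy first.

```python
import copy
import xml.etree.ElementTree as ET


def add_data_source(xml_text, new_name):
    """Return dump XML with one extra data source whose history is all NaN."""
    root = ET.fromstring(xml_text)
    template = root.findall("ds")[-1]

    # 1. Clone the last top-level <ds> definition, rename it, and insert it
    #    right after the original so it stays ahead of the <rra> sections.
    new_ds = copy.deepcopy(template)
    new_ds.find("name").text = " %s " % new_name
    root.insert(list(root).index(template) + 1, new_ds)

    for rra in root.findall("rra"):
        # 2. Each archive keeps one <ds> state block per source in <cdp_prep>.
        cdp_prep = rra.find("cdp_prep")
        if cdp_prep is not None and cdp_prep.findall("ds"):
            cdp_prep.append(copy.deepcopy(cdp_prep.findall("ds")[-1]))
        # 3. Every stored row needs one more value; NaN means "unknown".
        for row in rra.findall("database/row"):
            ET.SubElement(row, "v").text = "NaN"

    return ET.tostring(root, encoding="unicode")


# Made-up, heavily trimmed dump with a single data source.
SAMPLE = """<rrd>
  <ds><name> 1 </name><type> GAUGE </type></ds>
  <rra>
    <cdp_prep><ds><value>NaN</value></ds></cdp_prep>
    <database>
      <row><v>1.0</v></row>
      <row><v>2.0</v></row>
    </database>
  </rra>
</rrd>"""

if __name__ == "__main__":
    print(add_data_source(SAMPLE, "2"))
```

After padding, both files have the same number of data sources, so the merge step no longer has rows of different widths to combine.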
Why was this not automatically handled when your own monitoring agent added the new parameter?
Re: NCPA Memory Performance Data is Broken
Posted: Fri Mar 04, 2022 12:35 pm
by pbroste
Hello @nehpets,
We are working on an update to correct how updates and merges are handled when new parameters are added. Here is the changelog:
https://github.com/NagiosEnterprises/nc ... HANGES.rst
I will go ahead and lock the thread.
Thanks,
Perry