Performance Data not being updated after migration

Posted: Wed Mar 06, 2019 1:30 pm
by vmesquita
Hello,

We migrated our Nagios install from CentOS 6 to a fresh install of CentOS 7. However, performance data is no longer being updated for a few services. Other services on the same host are working normally. Looking at the rrd folder, we noticed that the affected services have an updated XML file but a stale RRD file:

752K -rw-rw-r-- 1 nagios nagios 751K Mar 6 15:26 CheckIOQueue.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.9K Mar 6 15:26 CheckIOQueue.xml
1.9M -rw-rw-r-- 1 nagios nagios 1.9M Mar 6 15:26 CheckIOStat.rrd
8.0K -rw-rw-r-- 1 nagios nagios 4.9K Mar 6 15:26 CheckIOStat.xml
1.2M -rw-rw-r-- 1 nagios nagios 1.1M Mar 6 15:23 check_MemoryRAM.rrd
4.0K -rw-rw-r-- 1 nagios nagios 3.4K Mar 6 15:23 check_MemoryRAM.xml
2.2M -rw-rw-r-- 1 nagios nagios 2.2M Mar 6 15:25 check_net.rrd
8.0K -rw-rw-r-- 1 nagios nagios 5.2K Mar 6 15:25 check_net.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:26 check_ntp_time.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.2K Mar 6 15:26 check_ntp_time.xml
1.5M -rw-rw-r-- 1 nagios nagios 1.5M Feb 21 18:46 CPU_Stats.rrd
8.0K -rw-rw-r-- 1 nagios nagios 5.5K Mar 6 15:23 CPU_Stats.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:27 __Disk_Usage_-__cripto-pro_log.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.6K Mar 6 15:27 __Disk_Usage_-__cripto-pro_log.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:26 __Disk_Usage_-__idp-pro_log.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.5K Mar 6 15:26 __Disk_Usage_-__idp-pro_log.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:26 __Disk_Usage_-__mensageria-core-pro_log.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.7K Mar 6 15:26 __Disk_Usage_-__mensageria-core-pro_log.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:23 __Disk_Usage_-__mensageria-epm-pro_log.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.7K Mar 6 15:23 __Disk_Usage_-__mensageria-epm-pro_log.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:24 __Disk_Usage_-__portal-pro_log.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.6K Mar 6 15:24 __Disk_Usage_-__portal-pro_log.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:26 __Disk_Usage.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.2K Mar 6 15:26 __Disk_Usage.xml
752K -rw-rw-r-- 1 nagios nagios 751K Feb 21 18:49 _HOST_.rrd
4.0K -rw-rw-r-- 1 nagios nagios 4.0K Mar 6 15:26 _HOST_.xml
1.2M -rw-rw-r-- 1 nagios nagios 1.1M Mar 6 15:23 Load.rrd
4.0K -rw-rw-r-- 1 nagios nagios 3.5K Mar 6 15:23 Load.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:24 Open_Files.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.2K Mar 6 15:24 Open_Files.xml
752K -rw-rw-r-- 1 nagios nagios 751K Feb 21 18:47 Ping.rrd
8.0K -rw-rw-r-- 1 nagios nagios 4.1K Mar 6 15:27 Ping.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:25 Swap_Usage.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.1K Mar 6 15:25 Swap_Usage.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:27 Total_Processes.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.1K Mar 6 15:27 Total_Processes.xml
376K -rw-rw-r-- 1 nagios nagios 376K Mar 6 15:26 Users.rrd
4.0K -rw-rw-r-- 1 nagios nagios 2.1K Mar 6 15:26 Users.xml
752K -rw-rw-r-- 1 nagios nagios 751K Jun 8 2018 Validacao_Usuario_Extrato_Internet_-_PRO_-_URL_Status.rrd
4.0K -rw-rw-r-- 1 nagios nagios 3.3K Jun 8 2018 Validacao_Usuario_Extrato_Internet_-_PRO_-_URL_Status.xml

Any ideas?
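
For anyone wanting to spot the affected services without reading the listing by eye, here is a minimal sketch. It assumes GNU `stat` (as on CentOS) and that each service has a `.rrd`/`.xml` pair in the same directory; the function name `stale_rrds` and the one-hour default threshold are illustrative choices, not part of Nagios.

```shell
#!/usr/bin/env bash
# Print every .rrd file whose modification time lags its .xml sibling
# by more than max_lag seconds (default: 3600).
# Usage: stale_rrds /path/to/perfdata/hostdir [max_lag_seconds]
stale_rrds() {
    local dir="$1" max_lag="${2:-3600}"
    local xml rrd xml_mtime rrd_mtime
    for xml in "$dir"/*.xml; do
        [ -e "$xml" ] || continue        # directory has no .xml files
        rrd="${xml%.xml}.rrd"
        [ -f "$rrd" ] || continue        # no matching .rrd file
        xml_mtime=$(stat -c %Y "$xml")   # mtime in seconds since the epoch
        rrd_mtime=$(stat -c %Y "$rrd")
        if [ $((xml_mtime - rrd_mtime)) -gt "$max_lag" ]; then
            basename "$rrd"
        fi
    done
}
```

Run against the listing above, this would print CPU_Stats.rrd, _HOST_.rrd and Ping.rrd. A service whose RRD and XML are both old (like the Validacao one) is not reported, since neither file is being written.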

Re: Performance Data not being updated after migration

Posted: Wed Mar 06, 2019 1:56 pm
by scottwilkerson
Did you perform a rrd migration after switching from 32 to 64 bit systems?

It's possible that this file was written to while you were performing the data migration.

I would recommend renaming the file and waiting about 10 minutes to see if data starts populating

mv CPU_Stats.rrd CPU_Stats.rrd~

Re: Performance Data not being updated after migration

Posted: Wed Mar 06, 2019 2:08 pm
by vmesquita
scottwilkerson wrote:Did you perform a rrd migration after switching from 32 to 64 bit systems?
Actually, we didn't migrate from 32 to 64 bits. The previous install was already 64-bit (but on CentOS 6).
It's possible that this file was written to while you were performing the data migration.
We didn't upgrade the system in place. Instead, we did a full backup, a fresh install, and then restored the full backup onto the fresh install. Would this still be a possibility in this scenario?
I would recommend renaming the file and waiting about 10 minutes to see if data starts populating

mv CPU_Stats.rrd CPU_Stats.rrd~
I could do that, but then I would lose all the previous performance data... Is there a solution that doesn't involve discarding the performance data?

Re: Performance Data not being updated after migration

Posted: Wed Mar 06, 2019 2:27 pm
by scottwilkerson
vmesquita wrote:We didn't update the system, instead we did a full backup, a fresh install and the restored the full backup at the fresh install. Would this still be a possibility in this scenario?
Not likely, can you show a screenshot of the advanced tab for this service?

Re: Performance Data not being updated after migration

Posted: Wed Mar 06, 2019 2:33 pm
by vmesquita
Yes
va159.PNG

Re: Performance Data not being updated after migration

Posted: Wed Mar 06, 2019 3:41 pm
by scottwilkerson
Did the plugin on this remote system change? Or did the remote system change OS versions recently?

I ask because the typical Performance data for this plugin usually just has 4 metrics, such as:

user=6.42% system=1.37% iowait=3.28%;85;95 idle=88.60%
and yours has 6 metrics.

If the number of metrics changes, performance data can no longer be put in the same RRD file.

What day did you do the OS migration?
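
A rough way to compare the two counts is sketched below. It assumes `rrdtool` is installed for the second function, and that perfdata labels contain no spaces (quoted labels with spaces would be miscounted). Both function names are hypothetical helpers, not Nagios tools.

```shell
#!/usr/bin/env bash
# Count the label=value pairs in a Nagios perfdata string.
# Assumption: labels contain no whitespace.
count_perf_metrics() {
    local perfdata="$1" token count=0
    local -a tokens
    read -r -a tokens <<< "$perfdata"    # split on whitespace, no globbing
    for token in "${tokens[@]}"; do
        case "$token" in
            *=*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}

# Count the data sources defined in an RRD file.
# "rrdtool info" prints one "ds[NAME].type = ..." line per data source.
count_rrd_ds() {
    rrdtool info "$1" | grep -c '^ds\[.*\]\.type'
}

# Example with the 4-metric string quoted above:
count_perf_metrics 'user=6.42% system=1.37% iowait=3.28%;85;95 idle=88.60%'
```

If `count_rrd_ds CPU_Stats.rrd` reports a different number than the plugin's current perfdata, the existing RRD can no longer accept the updates.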

Re: Performance Data not being updated after migration

Posted: Wed Mar 06, 2019 4:23 pm
by vmesquita
scottwilkerson wrote:Did the plugin on this remote system change? Or did the remote system change OS versions recently?

I ask because the typical Performance data for this plugin usually just has 4 metrics, such as:

user=6.42% system=1.37% iowait=3.28%;85;95 idle=88.60%
and yours has 6 metrics.

If the number of metrics changes, performance data can no longer be put in the same RRD file.

What day did you do the OS migration?
The date/time when the file was last changed is exactly when the backup was made for the migration. So ever since Nagios has been running on the new server, nothing has been recorded in the RRD.
The plugin on the remote server was not changed, at least not at the same time as the server migration:
[a-vmesquita@va159 libexec]$ ls -ls /usr/local/nagios/libexec/check_cpu_stats.sh
16 -rwxr-xr-x 1 root root 13550 Sep 3 2015 /usr/local/nagios/libexec/check_cpu_stats.sh
However, when I checked the performance graph, I realized that although 2 parameters were present, they were always zero...
va159-2.PNG

Re: Performance Data not being updated after migration

Posted: Wed Mar 06, 2019 5:05 pm
by npolovenko
@vmesquita, Please backup your XI server and then perform the following steps.
Rename the current RRD file CPU_Stats.rrd and allow Nagios to create another RRD file. Wait for 15 - 20 min to see if the graph starts working.
After 30 min please upload the original RRD file as well as the new one.

Also, please upload the following log files:
/usr/local/nagios/var/perfdata.log
/usr/local/nagios/var/npcd.log
PS: We can always rename the old RRD file back but this experiment should allow us to compare the structure of working and non-working RRD files.
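
The rename step could be done with a timestamped name so the original stays easy to identify and restore. This is only a sketch: the helper name `set_aside_rrd` and the `.bak` naming are illustrative assumptions, not something Nagios XI requires.

```shell
#!/usr/bin/env bash
# Move an RRD file aside under a timestamped name and print the new path.
# Nothing is deleted, so the file can be moved back later with plain mv.
set_aside_rrd() {
    local rrd="$1" backup
    [ -f "$rrd" ] || { echo "no such file: $rrd" >&2; return 1; }
    backup="${rrd}.$(date +%Y%m%d-%H%M%S).bak"
    mv -- "$rrd" "$backup" && echo "$backup"
}

# Usage (run from the host's perfdata directory):
#   old=$(set_aside_rrd CPU_Stats.rrd)
#   ... wait for a new CPU_Stats.rrd to be created ...
#   to undo: mv "$old" CPU_Stats.rrd
```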

Re: Performance Data not being updated after migration

Posted: Thu Mar 07, 2019 8:28 am
by vmesquita
Here are the requested files after the test.

Re: Performance Data not being updated after migration

Posted: Thu Mar 07, 2019 12:11 pm
by npolovenko
@vmesquita, It looks like the number of data sources went up from 4 in the old RRD file to 6 in the new one. You could try the following:
Copy the old RRD and XML files for this service to some new folder on the server, like /tmp/newFolder/
Download the fix_ds_quantity.sh script from here:
https://support.nagios.com/kb/article.php?id=149
And run it like this:
./fix_ds_quantity.sh -i -d /tmp/newFolder/
If it successfully modifies the RRD/XML files you may copy them over to the original perfdata folder replacing the existing RRD/XML files.
Wait 15 minutes and let me know if the graph starts working.
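
The copy step above can be sketched as follows, so the script only ever touches copies. The helper name `stage_service_files` and its arguments are hypothetical; the `fix_ds_quantity.sh -i -d` invocation is the one given in this post and is not run by the sketch.

```shell
#!/usr/bin/env bash
# Copy one service's .rrd and .xml files into a scratch directory,
# preserving timestamps and ownership where possible (cp -p).
# Usage: stage_service_files <perfdata_host_dir> <service_name> <work_dir>
stage_service_files() {
    local src_dir="$1" service="$2" work_dir="$3"
    mkdir -p -- "$work_dir" || return 1
    cp -p -- "$src_dir/$service.rrd" "$src_dir/$service.xml" "$work_dir"/
}

# Usage, with the host's perfdata directory as the first argument:
#   stage_service_files /path/to/perfdata/hostdir CPU_Stats /tmp/newFolder
#   ./fix_ds_quantity.sh -i -d /tmp/newFolder/
# Copy the files back only after confirming the script modified them successfully.
```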