RRDCached issues
Posted: Wed Sep 16, 2020 10:53 am
by hbouma
I am getting the following errors in our /var/log/messages related to rrdcached:
Code:
Sep 16 11:49:08 XXXXXXXXXXXXXXXXXXXX rrdcached[6547]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/XXXXXXXXXXXXXXXXXXXX /Disk_Usage_on__dev_mapper_vg00-var.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/XXXXXXXXXXXXXXXXXXXX /Disk_Usage_on__dev_mapper_vg00-var.rrd: found extra data on update argument: 3.03:4.86)
Sep 16 11:49:09 XXXXXXXXXXXXXXXXXXXX rrdcached[6547]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/XXXXXXXXXXXXXXXXXXXX /Disk_Usage_on__dev_mapper_vg00-opt.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/XXXXXXXXXXXXXXXXXXXX /Disk_Usage_on__dev_mapper_vg00-opt.rrd: found extra data on update argument: 0.69:0.95)
Sep 16 11:49:10 XXXXXXXXXXXXXXXXXXXX rrdcached[6547]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/XXXXXXXXXXXXXXXXXXXX /_HOST_.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/XXXXXXXXXXXXXXXXXXXX /_HOST_.rrd: expected 4 data source readings (got 1) from 1600270238)
We are running Nagios XI 5.6.10 on RHEL 7 VMs, with an offloaded database and rrdcached 1.4.4.
Re: RRDCached issues
Posted: Thu Sep 17, 2020 9:22 am
by scottwilkerson
Did you change the command for these so that it returns a different quantity of performance data than was there before? It seems you are getting a different number of perfdata values than the RRDs expect.
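For anyone wanting to check this by hand, here is a rough Python sketch that counts the metrics in a plugin's perfdata string so the number can be compared with what the RRD expects. The function names and the sample perfdata string are made up for illustration (the real NCPA output is not shown in this thread), and the parser only handles the common `'label'=value;warn;crit;min;max` layout.

```python
import re

# One perfdata item: 'label'=value[UOM];warn;crit;min;max
# The label may be single-quoted (and then contain spaces) or bare.
PERF_ITEM = re.compile(r"(?:'([^']+)'|([^\s=']+))=(\S*)")


def perfdata_labels(perfdata: str) -> list:
    """Return the metric labels found in a perfdata string, in order."""
    labels = []
    for quoted, bare, _value in PERF_ITEM.findall(perfdata):
        labels.append(quoted if quoted else bare)
    return labels


def matches_rrd(perfdata: str, expected_ds: int) -> bool:
    """True when the number of metrics equals the RRD's data source count."""
    return len(perfdata_labels(perfdata)) == expected_ds


# Hypothetical perfdata string, NOT taken from a real check result.
sample = "'used'=3.03GiB;80;90 'free'=4.86GiB 'total'=7.89GiB"
print(perfdata_labels(sample))
print(matches_rrd(sample, 1))
```

If the label count differs from the number of data sources in the existing RRD, updates for that service would be expected to fail in the way the log shows.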
Re: RRDCached issues
Posted: Thu Sep 17, 2020 9:30 am
by hbouma
This is being pulled from the NCPA agent checks using NCPA version 2.1.5. It is using the built-in disk check functionality.
Re: RRDCached issues
Posted: Thu Sep 17, 2020 2:36 pm
by scottwilkerson
hbouma wrote: This is being pulled from the NCPA agent checks using NCPA version 2.1.5. It is using the built-in disk check functionality.
Can you share the command it is using?
Also, could you post a screenshot of the Advanced tab for the service, as well as a picture of the performance graph for the service?
Thanks
Re: RRDCached issues
Posted: Fri Sep 18, 2020 6:19 am
by hbouma
Here is an example of one of the commands:
check_ncpa.py -H HOSTNAME -t 'TOKEN' -P 5693 -M 'disk/logical/|var|log' -w 80 -c 90
2020-09-18 07_17_59-Nagios XI.png
Re: RRDCached issues
Posted: Fri Sep 18, 2020 2:55 pm
by scottwilkerson
scottwilkerson wrote: as well as a pic of the performance graph for the service
thanks
Re: RRDCached issues
Posted: Mon Sep 21, 2020 6:28 am
by hbouma
Performance graph is blank.
2020-09-21 07_27_36-Nagios XI.png
Re: RRDCached issues
Posted: Mon Sep 21, 2020 10:03 am
by scottwilkerson
OK, so this performance graph has just one metric, "used", while the check command you are currently using returns three metrics: used, free, and total.
At some point the command must have changed.
The only way to rectify this is to remove the RRD file for this service:
Code:
/usr/local/nagios/share/perfdata/XXXXXXXXXXXXXXXXXXXX/SERVICENAME.rrd
and let it get re-created with all the metrics
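Before deleting anything, it may help to confirm how many data sources a given RRD actually holds. The following is only a sketch: it parses the text printed by `rrdtool info`, and the helper names are hypothetical. It assumes `rrdtool` is on the PATH and that the data source lines look like `ds[NAME].type = "GAUGE"`.

```python
import re
import subprocess

# Lines such as: ds[1].type = "GAUGE"
DS_LINE = re.compile(r"^ds\[([^\]]+)\]\.", re.MULTILINE)


def ds_names(info_text: str) -> list:
    """Return the distinct data source names in `rrdtool info` output."""
    seen = []
    for name in DS_LINE.findall(info_text):
        if name not in seen:
            seen.append(name)
    return seen


def rrd_ds_names(path: str) -> list:
    """Run `rrdtool info` on one file and list its data sources."""
    result = subprocess.run(
        ["rrdtool", "info", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return ds_names(result.stdout)
```

Calling `rrd_ds_names()` with the path of a service's `.rrd` file and comparing the length of the result with the number of metrics the check returns should show which files are out of step.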
Re: RRDCached issues
Posted: Tue Sep 22, 2020 6:13 am
by hbouma
So, this has happened across multiple checks for multiple servers. Is it possible that something messed up the metric info on so many?
Re: RRDCached issues
Posted: Tue Sep 22, 2020 1:56 pm
by scottwilkerson
hbouma wrote: So, this has happened across multiple checks for multiple servers. Is it possible that something messed up the metric info on so many?
That does seem odd. Have you made any changes across them, or to the commands?