NagiosXI MySQL offloading
Posted: Sun Jul 01, 2012 3:45 pm
I recently made the following changes using the latest manuals from asset.nagios:
1) Use of Ramdisk
2) MySQL offload
3) RRDcached use for npcd/rrdtool
Everything is working fine except for a few things I noticed:
A) A few add-ons, such as BPA, no longer work because they cannot connect to MySQL. Here are the logs from when I try to use BPA. Other add-ons may be hitting the same issue after offloading MySQL.
[Sun Jul 01 12:56:29 2012] [error] [client 172.24.41.233] [Sun Jul 1 12:56:29 2012] nagios-bp.cgi: Error: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2), referer: http://pnagios02lxv/nagiosxi/
[Sun Jul 01 12:56:29 2012] [error] [client 172.24.41.233] Premature end of script headers: nagios-bp.cgi, referer: http://pnagios02lxv/nagiosxi/
[Sun Jul 01 12:56:31 2012] [error] [client 172.24.41.233] [Sun Jul 1 12:56:31 2012] nagios-bp.cgi: DBI connect('nagios:localhost:3306','ndoutils',...) failed: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2) at /usr/local/nagiosbp/lib/ndodb.pm line 68, referer: http://pnagios02lxv/nagiosxi/
B) rrdtool is using rrdcached, but it constantly tries the Perl module first. Is there a way to stop it from checking for the module?
2012-07-01 10:43:46 [15844] [2] RRDs Perl Modules are not installed. Falling back to rrdtool system call.
2012-07-01 10:43:46 [15844] [2] /usr/bin/rrdtool update --daemon=unix:/var/rrdtool/rrdcached/rrdcached.sock /usr/local/nagios/share/perfdata/UWEB248NTV/Disk_Usage.rrd 1341164613:76:50
2012-07-01 10:43:46 [15844] [1] rrdtool update returns 0
2012-07-01 10:43:46 [15844] [1] 6 lines processed
2012-07-01 10:43:46 [15844] [1] /var/nagiosramdisk/spool/perfdata//1341164616.perfdata.service-PID-15844 deleted
2012-07-01 10:43:46 [15844] [1] PNP exiting (runtime 0.03363s) ...
C) I am seeing some errors in syslog from rrdtool update:
Jul 1 13:21:07 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/UAPP520NTV/Disk_Usage.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/UAPP520NTV/Disk_Usage.rrd: found extra data on update argument: 21)
Jul 1 13:21:37 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/USQL03NTV/CPU_Load.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/USQL03NTV/CPU_Load.rrd: found extra data on update argument: 0)
Jul 1 13:21:37 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/OPSPCPMSIS01/_HOST_.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/OPSPCPMSIS01/_HOST_.rrd: expected 2 data source readings (got 1) from 1341173207)
Jul 1 13:22:23 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/USACTX12NTV/CPU_Load.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/USACTX12NTV/CPU_Load.rrd: found extra data on update argument: 0:0)
Jul 1 13:23:24 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/USAWEB19NTV/CPU_Load.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/USAWEB19NTV/CPU_Load.rrd: found extra data on update argument: 0)
Jul 1 13:23:24 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/USQL07NTV/Disk_Usage.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/USQL07NTV/Disk_Usage.rrd: found extra data on update argument: 18)
Jul 1 13:27:28 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/USACTXW14NTV/CPU_Load.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/USACTXW14NTV/CPU_Load.rrd: found extra data on update argument: 0)
Jul 1 13:31:17 pnagios02lxv nagios: Auto-save of retention data completed successfully.
Jul 1 13:34:06 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/UIMAGE5NTV/Disk_Usage.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/UIMAGE5NTV/Disk_Usage.rrd: expected 9 data source readings (got 3) from 1341173621)
Jul 1 13:36:55 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/UAPP10NTV/Disk_Usage.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/UAPP10NTV/Disk_Usage.rrd: found extra data on update argument: 25)
D) I reduced npcd and process_perfdata logging to the minimum, as those were the other top disk users. By default the log level is set to 1.
E) I used iotop to find the top I/O users and can see postgres continuously writing to disk. Would it help to offload postgres as well?
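To get a clearer picture of the rrdcached errors in item C, something like the following rough Python sketch could group the failures by RRD file and error type. The regular expression is only an assumption based on the syslog lines quoted above, not a documented rrdcached log format, and the names FAILURE, classify and summarize are made up for this example.

```python
import re
from collections import Counter

# Pattern inferred from the rrdcached lines quoted in item C (an assumption,
# not an official format): the RRD path appears twice, then the reason.
FAILURE = re.compile(
    r"rrd_update_r \((?P<rrd>[^)]+)\) failed with status -?\d+\. "
    r"\((?P=rrd): (?P<reason>.+)\)\s*$"
)


def classify(reason):
    """Reduce an rrdcached error message to a coarse category."""
    if reason.startswith("found extra data"):
        return "extra data"
    if reason.startswith("expected"):
        return "too few readings"
    return "other"


def summarize(lines):
    """Count update failures per (RRD file, category); skip unrelated lines."""
    counts = Counter()
    for line in lines:
        match = FAILURE.search(line)
        if match:
            key = (match.group("rrd"), classify(match.group("reason")))
            counts[key] += 1
    return counts
```

Passing an open syslog file to summarize() and printing the resulting counter should show whether the same few RRD files keep failing, which would point at a mismatch between the perfdata a check returns and the data sources defined in the existing RRD.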